Description
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Medium
Please provide a clear description of the problem this feature solves
A HuggingFace embedder provider will enable access to open-source embedding models from the HuggingFace Hub. This expands the toolkit's embedding capabilities by supporting local embedding generation without external API dependencies, self-hosted deployments via Text Embeddings Inference servers, and access to state-of-the-art models like BGE, E5, and GTE for semantic search.
Describe your ideal solution
Create a huggingface embedder provider type that supports both local Sentence Transformers execution and remote embeddings via TEI servers or the Inference API. The provider should detect the mode based on configuration: when endpoint_url is set, use huggingface_hub.InferenceClient for HTTP-based embedding requests; otherwise, load models locally using sentence-transformers with device placement configuration. A LangChain-compatible embedder client should be registered in the nvidia_nat_langchain plugin.
Config:
class HuggingFaceEmbedderConfig(EmbedderBaseConfig, name="huggingface"):
    model_name: str
    device: str = "auto"
    normalize_embeddings: bool = True
    api_key: OptionalSecretStr = None
    endpoint_url: str | None = None

Provider:

@register_embedder_provider(config_type=HuggingFaceEmbedderConfig)
async def huggingface_embedder_provider(config, builder):
    yield EmbedderProviderInfo(config=config, description=f"HuggingFace Embedder: {config.model_name}")

Client:
class HuggingFaceEmbedder:
    def __init__(self, config: HuggingFaceEmbedderConfig):
        self._config = config
        if config.endpoint_url:
            # Remote mode: send requests to a TEI server or the Inference API
            from huggingface_hub import InferenceClient
            token = config.api_key.get_secret_value() if config.api_key else None
            self._client = InferenceClient(token=token, base_url=config.endpoint_url)
        else:
            # Local mode: load the model with sentence-transformers
            from sentence_transformers import SentenceTransformer
            self._model = SentenceTransformer(config.model_name, device=config.device)

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        if self._config.endpoint_url:
            return self._client.feature_extraction(texts, model=self._config.model_name)
        # encode() returns a numpy array; convert to plain Python lists
        return self._model.encode(texts, normalize_embeddings=self._config.normalize_embeddings).tolist()

Additional context
No response
Code of Conduct
- I agree to follow this project's Code of Conduct
- I have searched the open feature requests and have found no duplicates for this feature request