
Add HuggingFace embedder provider for local and cloud embeddings #1355

@ericevans-nv

Description


Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request?

Medium

Please provide a clear description of the problem this feature solves

A HuggingFace embedder provider will enable access to open-source embedding models from the HuggingFace Hub. This expands the toolkit's embedding capabilities by supporting local embedding generation without external API dependencies, self-hosted deployments via Text Embeddings Inference servers, and access to state-of-the-art models like BGE, E5, and GTE for semantic search.

Describe your ideal solution

Create a huggingface embedder provider type that supports both local Sentence Transformers execution and remote embeddings via TEI servers or the Inference API. The provider should detect the mode based on configuration: when endpoint_url is set, use huggingface_hub.InferenceClient for HTTP-based embedding requests; otherwise, load models locally using sentence-transformers with device placement configuration. A LangChain-compatible embedder client should be registered in the nvidia_nat_langchain plugin.

Config:

class HuggingFaceEmbedderConfig(EmbedderBaseConfig, name="huggingface"):
    model_name: str
    device: str = "auto"
    normalize_embeddings: bool = True
    api_key: OptionalSecretStr = None
    endpoint_url: str | None = None
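
For illustration, the two modes could be selected from a workflow config like the one below. The field names follow the config class above; the `embedders` section layout and the example model/endpoint values are assumptions, not taken from the toolkit:

```yaml
embedders:
  local_embedder:
    _type: huggingface
    model_name: BAAI/bge-small-en-v1.5   # runs locally via sentence-transformers
    device: auto
    normalize_embeddings: true

  tei_embedder:
    _type: huggingface
    model_name: BAAI/bge-small-en-v1.5
    endpoint_url: http://localhost:8080  # hypothetical local TEI server; triggers remote mode
```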

Provider:

@register_embedder_provider(config_type=HuggingFaceEmbedderConfig)
async def huggingface_embedder_provider(config, builder):
    yield EmbedderProviderInfo(config=config, description=f"HuggingFace Embedder: {config.model_name}")

Client:

class HuggingFaceEmbedder:
    def __init__(self, config: HuggingFaceEmbedderConfig):
        self._config = config
        if config.endpoint_url:
            from huggingface_hub import InferenceClient
            # Unwrap the SecretStr before passing it to the client
            token = config.api_key.get_secret_value() if config.api_key else None
            # InferenceClient accepts a TEI/Inference endpoint URL as the model target
            self._client = InferenceClient(model=config.endpoint_url, token=token)
        else:
            from sentence_transformers import SentenceTransformer
            # SentenceTransformer auto-selects a device when None is passed
            device = None if config.device == "auto" else config.device
            self._model = SentenceTransformer(config.model_name, device=device)

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        if self._config.endpoint_url:
            # feature_extraction takes one text at a time and returns a numpy array
            return [self._client.feature_extraction(text).tolist() for text in texts]
        # encode() returns a numpy array; convert to plain lists for the interface
        return self._model.encode(texts, normalize_embeddings=self._config.normalize_embeddings).tolist()

    def embed_query(self, text: str) -> list[float]:
        # LangChain's Embeddings interface also expects a single-query method
        return self.embed_documents([text])[0]
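
As context for defaulting `normalize_embeddings` to `True`: L2-normalizing each vector makes cosine similarity a plain dot product, which is what most BGE/E5-style retrieval setups expect. A minimal, dependency-free sketch of that post-processing step (function names are illustrative, not from the toolkit or sentence-transformers):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    # Scale the vector to unit length; leave all-zero vectors unchanged.
    norm = math.sqrt(sum(x * x for x in vec))
    return vec if norm == 0.0 else [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # For unit-length vectors, cosine similarity is just the dot product.
    return sum(x * y for x, y in zip(a, b))

a = l2_normalize([3.0, 4.0])   # -> [0.6, 0.8]
print(cosine(a, a))            # 1.0: a unit vector compared with itself
```

With normalization done at embedding time, downstream vector stores can use inner-product search directly instead of recomputing norms per query.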

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
