
Add HuggingFace embedder provider for local and cloud embeddings #1355

@ericevans-nv

Description


Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request?

Medium

Please provide a clear description of the problem this feature solves

A HuggingFace embedder provider will enable access to open-source embedding models from the HuggingFace Hub. This expands the toolkit's embedding capabilities by supporting local embedding generation without external API dependencies, self-hosted deployments via Text Embeddings Inference servers, and access to state-of-the-art models like BGE, E5, and GTE for semantic search.

Describe your ideal solution

Create a huggingface embedder provider type that supports both local Sentence Transformers execution and remote embeddings via TEI servers or the Inference API. The provider should detect the mode based on configuration: when endpoint_url is set, use huggingface_hub.InferenceClient for HTTP-based embedding requests; otherwise, load models locally using sentence-transformers with device placement configuration. A LangChain-compatible embedder client should be registered in the nvidia_nat_langchain plugin.

Config:

class HuggingFaceEmbedderConfig(EmbedderBaseConfig, name="huggingface"):
    model_name: str
    device: str = "auto"
    normalize_embeddings: bool = True
    api_key: OptionalSecretStr = None
    endpoint_url: str | None = None
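
For illustration, the two modes could be selected from a workflow config like the one below. The field names follow the config class above; the `embedders` section layout and the example model/endpoint values are assumptions, not taken from the toolkit:

```yaml
embedders:
  local_embedder:
    _type: huggingface
    model_name: BAAI/bge-small-en-v1.5   # runs locally via sentence-transformers
    device: auto
    normalize_embeddings: true

  tei_embedder:
    _type: huggingface
    model_name: BAAI/bge-small-en-v1.5
    endpoint_url: http://localhost:8080  # hypothetical local TEI server; triggers remote mode
```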

Provider:

@register_embedder_provider(config_type=HuggingFaceEmbedderConfig)
async def huggingface_embedder_provider(config, builder):
    yield EmbedderProviderInfo(config=config, description=f"HuggingFace Embedder: {config.model_name}")

Client:

class HuggingFaceEmbedder:
    def __init__(self, config: HuggingFaceEmbedderConfig):
        self._config = config
        if config.endpoint_url:
            from huggingface_hub import InferenceClient
            # Unwrap the SecretStr before passing it to the client
            token = config.api_key.get_secret_value() if config.api_key else None
            # InferenceClient accepts a TEI/Inference endpoint URL as the model target
            self._client = InferenceClient(model=config.endpoint_url, token=token)
        else:
            from sentence_transformers import SentenceTransformer
            # SentenceTransformer auto-selects a device when None is passed
            device = None if config.device == "auto" else config.device
            self._model = SentenceTransformer(config.model_name, device=device)

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        if self._config.endpoint_url:
            # feature_extraction takes one text at a time and returns a numpy array
            return [self._client.feature_extraction(text).tolist() for text in texts]
        # encode() returns a numpy array; convert to plain lists for the interface
        return self._model.encode(texts, normalize_embeddings=self._config.normalize_embeddings).tolist()

    def embed_query(self, text: str) -> list[float]:
        # LangChain's Embeddings interface also expects a single-query method
        return self.embed_documents([text])[0]
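
As context for defaulting `normalize_embeddings` to `True`: L2-normalizing each vector makes cosine similarity a plain dot product, which is what most BGE/E5-style retrieval setups expect. A minimal, dependency-free sketch of that post-processing step (function names are illustrative, not from the toolkit or sentence-transformers):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    # Scale the vector to unit length; leave all-zero vectors unchanged.
    norm = math.sqrt(sum(x * x for x in vec))
    return vec if norm == 0.0 else [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # For unit-length vectors, cosine similarity is just the dot product.
    return sum(x * y for x, y in zip(a, b))

a = l2_normalize([3.0, 4.0])   # -> [0.6, 0.8]
print(cosine(a, a))            # 1.0: a unit vector compared with itself
```

With normalization done at embedding time, downstream vector stores can use inner-product search directly instead of recomputing norms per query.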

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
