### Is your feature request related to a problem? Please describe.
Currently, the AI provider abstraction supports capability-based routing and health-aware fallback at the provider level. However, all models exposed by a provider are implicitly treated as a single execution target.
This creates several limitations:

- Users cannot explicitly choose a provider + model combination.
- There is no structured way to expose provider-specific models (e.g., Gemini Pro vs. Gemini Flash, local models, etc.).
- Capability matching operates only at the provider level, not at the model level.
- Future UI flows (cloud/local selection, cost/performance trade-offs) are difficult to support cleanly.
As the system grows to include multiple providers and multiple models per provider, this will limit flexibility and user control.
### Describe the solution you'd like
Introduce a lightweight Provider–Model Selection Layer that builds on the existing AIProviderRegistry.
Key ideas:

- Each provider can expose a list of models with metadata:
  - model name / ID
  - capabilities (e.g., multimodal, fast_inference)
  - execution context (cloud / local)
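As a rough sketch, per-model metadata could look like the following. The names (`ModelInfo`, the field names, the example model ID) are illustrative assumptions, not part of the existing codebase:

```python
from dataclasses import dataclass

# Hypothetical per-model metadata record; names and fields are a sketch,
# not the project's actual types.
@dataclass(frozen=True)
class ModelInfo:
    model_id: str                          # e.g. "gemini-1.5-flash"
    capabilities: frozenset = frozenset()  # e.g. {"multimodal", "fast_inference"}
    execution_context: str = "cloud"       # "cloud" or "local"

# Example entry a provider might expose to the registry.
gemini_flash = ModelInfo(
    model_id="gemini-1.5-flash",
    capabilities=frozenset({"multimodal", "fast_inference"}),
    execution_context="cloud",
)
```

Keeping the record immutable (`frozen=True`) lets the registry cache and compare model entries safely.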
- The registry can resolve requests in the following order:
  1. user-selected provider
  2. user-selected model (optional)
  3. capability-based matching
  4. health-aware fallback (already implemented)
- This keeps routing logic centralized and avoids leaking provider-specific logic into higher layers.
This design keeps backward compatibility while unlocking more advanced routing and UX flows.
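The resolution order above could be sketched as follows. The registry shape, field names, and `resolve()` itself are assumptions for illustration, not the existing `AIProviderRegistry` API:

```python
# Illustrative sketch of the proposed resolution order; not the real API.
def resolve(registry, required_caps, provider=None, model=None):
    """Return (provider_name, model_id) using the proposed order:
    user-selected provider -> user-selected model (optional) ->
    capability-based matching -> health-aware fallback."""
    # An explicit provider choice narrows the search; otherwise try all.
    candidates = [provider] if provider else list(registry)
    for name in candidates:
        info = registry.get(name)
        if info is None or not info["healthy"]:
            continue  # health-aware fallback: skip unhealthy providers
        for m in info["models"]:
            if model is not None and m["id"] != model:
                continue  # honor an explicit model choice when given
            if required_caps <= m["capabilities"]:
                return name, m["id"]
    return None  # no healthy provider/model matched

# Toy registry: one unhealthy local provider, one healthy cloud provider.
registry = {
    "local-llm": {
        "healthy": False,
        "models": [{"id": "llama", "capabilities": {"fast_inference"}}],
    },
    "gemini": {
        "healthy": True,
        "models": [
            {"id": "gemini-pro", "capabilities": {"multimodal"}},
            {"id": "gemini-flash", "capabilities": {"fast_inference"}},
        ],
    },
}

# "local-llm" matches on capability but is unhealthy, so routing
# falls back to the healthy provider's matching model.
choice = resolve(registry, {"fast_inference"})
```

The point of the sketch is that all four steps live in one place, so UI layers only pass in optional `provider`/`model` hints rather than duplicating routing logic.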
### Describe alternatives you've considered
- Hard-coding model selection logic inside each provider
  → rejected, as it fragments routing logic and reduces extensibility
- Keeping model choice entirely in the UI
  → rejected, as it duplicates backend capability and health logic
A centralized model-aware registry provides a cleaner, more scalable architecture.
### Additional context
This issue is a natural extension of:

- the AI provider abstraction layer
- provider health tracking and safe fallback routing

It prepares the system for:

- multiple models per provider
- future cost / performance routing
- explicit user choice without breaking existing flows