Add dynamic model selection to CI workflow #8
Summary
This PR modifies the CI workflow to dynamically select an available model from the server instead of using a hardcoded model name. The workflow now calls the /v1/models API endpoint to fetch the available models and automatically selects the first one in the list.

Key Changes:
Replaced the hardcoded MODEL_NAME: lfm-3b with dynamic model selection in both the evaluation and judge steps

Review & Testing Checklist for Human
Confirm that the API response uses the {"data":[{"id":"model-name",...}]} format

Recommended Test Plan:
Diagram
%%{ init : { "theme" : "default" }}%%
graph TD
    subgraph "CI Workflow"
        A["Install dependencies"]
        B["Get Available Model<br/>(NEW STEP)"]:::major-edit
        C["Run API Evaluation Script"]:::minor-edit
        D["Run OpenAI Judge"]:::minor-edit
    end
    subgraph "External"
        E["MODEL_URL/v1/models<br/>API Endpoint"]:::context
    end
    A --> B
    B --> C
    C --> D
    B -.->|"curl + jq parsing"| E
    B -.->|"model_name output"| C
    B -.->|"model_name output"| D
    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit
        L3[Context/No Edit]:::context
    end
    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF

Notes
Link to Devin run: https://app.devin.ai/sessions/92e0a8e247284979b4f8a658feef4c85
Requested by: @tuliren
Important: The judge model name remains hardcoded as "lfm-7b" - please confirm if this should also be dynamic or if it's intentionally static.
Testing limitation: This change could not be tested locally because it requires live API credentials and endpoint access. The JSON parsing logic and error handling should be validated against the actual API response format.
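Since the parsing logic could not be exercised against the live endpoint, a small offline check may help reviewers reason about the expected behavior. The sketch below is not the workflow's actual implementation (which uses curl and jq and is not shown here); it is a Python illustration that assumes the response follows the {"data":[{"id":"model-name",...}]} shape described above. The function name select_first_model and the error messages are hypothetical.

```python
import json


def select_first_model(response_body: str) -> str:
    """Return the id of the first model in a /v1/models-style response.

    Assumes the payload looks like {"data": [{"id": "..."}, ...]}.
    Raises ValueError if the body is not JSON or has no usable first id.
    """
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    # The model list is expected under the top-level "data" key.
    models = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(models, list) or not models:
        raise ValueError("response has no non-empty 'data' list")

    # Pick the first entry, mirroring the "first one in the list" rule.
    first = models[0]
    model_id = first.get("id") if isinstance(first, dict) else None
    if not isinstance(model_id, str) or not model_id:
        raise ValueError("first entry in 'data' has no usable 'id'")
    return model_id


if __name__ == "__main__":
    sample = '{"data": [{"id": "model-name"}, {"id": "other-model"}]}'
    print(select_first_model(sample))
```

Failing loudly on an empty list or a missing id is a design choice made for this sketch: it keeps a blank model name from being passed silently to the evaluation and judge steps.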