-
Notifications
You must be signed in to change notification settings - Fork 672
fix: Resolve evaluator naming issues #10877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: version-13
Are you sure you want to change the base?
Conversation
| if "foreign" in str(e).lower(): | ||
| raise BadRequest(f"Dataset with id {dataset_id} not found") | ||
| raise BadRequest(f"Evaluator with name {input.name} already exists") | ||
| raise BadRequest( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth renaming this resolve to create_dataset_code_evaluator while we're here.
|
|
||
| input_mappings_by_evaluator_node_id = { | ||
| evaluator.id: evaluator.input_mapping for evaluator in input.evaluators | ||
| evaluator_info_by_node_id = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| evaluator_info_by_node_id = { | |
| evaluator_input_by_node_id = { |
Suggestion
| "evaluators", | ||
| sa.Column("id", _Integer, primary_key=True), | ||
| sa.Column("name", sa.String, nullable=False, unique=True), | ||
| sa.Column("name", sa.String, nullable=False), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may eventually want to display global evaluators in an evaluators hub and have them be addressable by name. Rather than relaxing the unique constraint, I would vote to either remove the column entirely for now or generate unique names like we do for the prompt underlying LLM evaluators, e.g., with a hashed suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think addressing evaluators globally by name should carry the expectation that the name is unique. You may see multiple "contains" evaluators, and then need to disambiguate them by attached dataset, description, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, even if we show global evaluators, autogenerated names would be even less informative, in my opinion
| @strawberry.input | ||
| class PlaygroundEvaluatorInput: | ||
| id: GlobalID | ||
| display_name: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the id above the DatasetEvaluatorId? If so, don't we already have the display name saved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not, it's an EvaluatorID which unifies the preview payloads as well
resolves #10456
Note
Aligns evaluator naming across schema, client, and server, using
displayNamefor emitted annotations and errors, and relaxes DB uniqueness on evaluator names.displayNametoPlaygroundEvaluatorInput; update generated Relay types for mutations/subscriptionsdisplayNamewith evaluator mappings via newEvaluatorMappingEntry; updategetChatCompletionOverDatasetInputand table/section components to send and display itdisplay_namefor annotationnameand error chunks in chat mutations/subscriptions; pass through to evaluation union responsesevaluators.name(migration and ORM model); adjust duplicate error messages to dataset-leveldisplayNameuniqueness for dataset evaluatorsWritten by Cursor Bugbot for commit 97c47db. This will update automatically on new commits. Configure here.