Create pattern to support multiple output formats #10
diehlbw merged 12 commits into epic-open-source:main
instruments/pdsqi_9/pdsqi_prompt.py
Outdated
DETAIL_INSTRUCTIONS = {
    1: "- Your output must be JSON-formatted, where each key is one of your RUBRIC_SET items (e.g., \"Citation\") and each corresponding value is another dictionary of two key-value pairs: \"explanation\" is a free text explanation of why your chosen GRADE is the correct grade, and \"grade\" is a single integer representing your respective GRADE that best matches the CLINICAL_SUMMARY for the key's metric.",
    3: "",
    6: '- Your output must be a VALID JSON-formatted string as follows:\n\"{"citation": {"explanation": "Your explanation here", "grade": 1}, "accurate": {"explanation": "Your explanation here", "grade": 1}, ...}\"'
Also, a general question: could the Python 3.10 requirement be lowered to 3.9? HELM uses Python 3.9, so it conflicts with evaluation-instruments. This is also on my meeting agenda with the HELM team for tomorrow.
Can you write this up as an issue? It would not be done in this PR, and I'd like the reasoning to be discoverable there.
be80924 to 1d19e6d
After talking to the HELM team, this won't be necessary. HELM supports Python 3.10 and 3.11.
from typing import Any
import evaluation_instruments.prep as prep


OUTPUT_MODE = "score_only"  # Default output mode
I think it makes sense to be consistent with how the enum is used and explained to the user. Here we set it to the string value, but the resolve_prompt function then asks for one of the enum names, not the string values.
Changed everything to Enum
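For readers following this thread, here is a minimal sketch of the consistency being asked for: a single Enum used for the default, with a helper that accepts either the member name or its string value. Only the "score_only" value comes from the diff above; `OutputMode`, `EXPLAINED_SCORE`, and `coerce_mode` are hypothetical names for illustration and may not match what was merged.

```python
from enum import Enum


class OutputMode(Enum):
    # Hypothetical enum; only the "score_only" value appears in the PR diff.
    SCORE_ONLY = "score_only"
    EXPLAINED_SCORE = "explained_score"


def coerce_mode(mode):
    """Return an OutputMode from a member, a member name, or a member value."""
    if isinstance(mode, OutputMode):
        return mode
    if isinstance(mode, str):
        # Try the member name first, e.g. "SCORE_ONLY".
        if mode in OutputMode.__members__:
            return OutputMode[mode]
        # Fall back to the value, e.g. "score_only"; raises ValueError if unknown.
        return OutputMode(mode)
    raise TypeError(f"Unsupported mode type: {type(mode).__name__}")


# The module-level default is an enum member rather than a bare string.
OUTPUT_MODE = OutputMode.SCORE_ONLY
```

With a helper like this, the default and any user-supplied argument go through the same type, so the documentation only has to describe one set of accepted spellings.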
Overview
For integration and reuse by external tools, such as medHELM, it is useful to have an accessible resolve_prompt that has instruction sets for both "score-only" (like PDSQI-9) and "score + explanation" (like Summary of Care).
Further, the expectation that these are nested dictionaries {"grade": int, "explanation": str} is more aligned with those tools' existing expectations.
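As an illustration of the nested shape described above, the sketch below parses a model response of the form {criterion: {"grade": int, "explanation": str}} and checks each entry. The function name `parse_scores` and the choice to treat a missing explanation as an empty string (for score-only output) are assumptions for this example, not part of the package.

```python
import json


def parse_scores(raw: str) -> dict:
    """Parse a JSON string of {criterion: {"grade": int, "explanation": str}}."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Top-level JSON value must be an object")

    parsed = {}
    for criterion, entry in data.items():
        if not isinstance(entry, dict):
            raise ValueError(f"{criterion!r} must map to an object")
        grade = entry.get("grade")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(grade, int) or isinstance(grade, bool):
            raise ValueError(f"{criterion!r} needs an integer 'grade'")
        # Assumption: score-only output may omit the explanation.
        explanation = entry.get("explanation", "")
        if not isinstance(explanation, str):
            raise ValueError(f"{criterion!r} 'explanation' must be a string")
        parsed[criterion] = {"grade": grade, "explanation": explanation}
    return parsed


example = (
    '{"citation": {"explanation": "Cites sources.", "grade": 4}, '
    '"accurate": {"grade": 5}}'
)
scores = parse_scores(example)
```

Keeping both output modes in the same nested structure means a downstream consumer can read `scores[criterion]["grade"]` without first checking which instruction set produced the response.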
Description of changes
Author Checklist
changelog/ISSUE.TYPE.rst files; see changelog/README.md.