Context-Free Grammar-guided Generation of FHIR Resources Using Large Language Models

j-frei/CFG4FHIR

Context-free Grammar for FHIR

Context-free grammars (CFGs) are a key building block for generating structured data with Large Language Models (LLMs) via constrained decoding and structured generation. Creating such CFGs by hand is difficult, especially when the enforced language follows a complex and deeply nested document structure. HL7 FHIR is a powerful standard for encoding medical and clinical information. This work focuses on dynamically generating CFGs that can be used to enforce and stabilize LLM outputs so that they closely follow the FHIR structure.
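To make the idea of constrained decoding concrete, here is a minimal, self-contained sketch. It is not the implementation used in this repository. It assumes a toy, non-recursive grammar for a heavily simplified MedicationStatement, and a hypothetical dictionary of token scores standing in for LLM logits. The names `TOY_GRAMMAR`, `expand`, `allowed_next` and `constrained_pick` are illustrative only.

```python
from itertools import product

# Toy, non-recursive grammar (an assumption for illustration, far smaller
# than a real FHIR grammar). Keys are nonterminals; anything else is a terminal.
TOY_GRAMMAR = {
    "start": [["{", '"resourceType"', ":", "rtype", ",", '"status"', ":", "status", "}"]],
    "rtype": [['"MedicationStatement"']],
    "status": [['"recorded"'], ['"entered-in-error"'], ['"draft"']],
}


def expand(symbol):
    """Return every terminal sequence derivable from `symbol` (finite grammars only)."""
    if symbol not in TOY_GRAMMAR:
        return [[symbol]]
    sentences = []
    for alternative in TOY_GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            sentences.append([tok for part in combo for tok in part])
    return sentences


def allowed_next(prefix):
    """Tokens that keep `prefix` a valid prefix of some sentence of the grammar."""
    n = len(prefix)
    return {s[n] for s in expand("start") if s[:n] == prefix and len(s) > n}


def constrained_pick(prefix, scores):
    """Pick the best-scoring token among those the grammar permits."""
    allowed = allowed_next(prefix)
    return max(allowed, key=lambda tok: scores.get(tok, float("-inf")))


prefix = ["{", '"resourceType"', ":", '"MedicationStatement"', ",", '"status"', ":"]
# Hypothetical model scores: the unconstrained favourite is not in the grammar.
scores = {'"completed"': 0.9, '"recorded"': 0.6, '"draft"': 0.2}
choice = constrained_pick(prefix, scores)
print(choice)
```

In this sketch the highest-scoring token `"completed"` is never emitted, because the grammar does not permit it at that position; the best permitted token is chosen instead. Real grammar-guided decoders apply the same masking idea to recursive grammars and tokenizer-level vocabularies.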

To compare the CFG-guided approach with JSON schema-guided and unguided generation, we created a set of test cases that highlight potential differences in the resulting FHIR data.

Results

Grammar Synthesis

Install the dependencies first.

Setup Steps
# Create venv [using System Python 3.12 (<3.13!)]
# e.g. via installed UV for custom Python version:
uv python install 3.12
uv venv --python 3.12 env

source env/bin/activate
python -m ensurepip --upgrade

# Install dependencies
python3 -m pip install -r requirements.txt

You also need to install the jq command-line tool, e.g. on Debian/Ubuntu:

sudo apt install jq
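The setup requirements above (a Python version below 3.13 and `jq` on the PATH) can be verified before generating the grammar. The following is a minimal sketch, not part of the repository; `check_prerequisites` is a hypothetical helper name.

```python
import shutil
import sys


def check_prerequisites(version=None, tools=("jq",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    version = tuple(version or sys.version_info[:2])
    problems = []
    # The setup comment above asks for Python 3.12 and explicitly rules out 3.13+.
    if version >= (3, 13):
        problems.append(f"Python below 3.13 expected, found {version[0]}.{version[1]}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"required tool not found on PATH: {tool}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Run inside the activated virtual environment, the script prints nothing when both requirements are satisfied.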

You can generate a custom FHIR CFG as follows:

./generate_FHIR_grammar.sh

You can test the grammar as follows:

export CUDA_VISIBLE_DEVICES=0
source env/bin/activate

# Using CFG
python3 demo_outlines_v1_cfg.py
# Using JSON schema
python3 demo_outlines_v1_jsonschema.py

Test Cases

The prompts and individual results of the test cases can be found in testcases/. Our test case summary is as follows:

| ID | Category | Description | CFG-guided | JSON schema-guided | Unguided |
|---|---|---|---|---|---|
| 1.0 | Output Cleanliness | Basic Patient Test | 🟢 | 🟢 | 🟢 |
| 2.1 | Version Compliance | MedicationStatement with CodeableReference | 🟢 | 🟢 | 🔴: medication.reference set to str |
| 2.2 | Version Compliance | MedicationStatement with basic medication.concept and medication.subject | 🟢 | 🟢 | 🔴: R4 issue |
| 2.3 | Version Compliance | MedicationStatement with multiple coding systems | 🟢 | 🟢 | 🔴: R4 issue; invalid medication.coding |
| 3.1 | Structural Validity | Condition Resource Generation | 🟢 | 🔴: Patient-like object generated | 🟢 |
| 3.2 | Structural Validity | Patient with telecom/address field | 🟢 | 🔴: Patient-like object with empty array | 🟢 |
| 3.3 | Structural Validity | MedicationStatement with encounter & dosage | 🟢 | 🟢 | 🔴: R4 issue |
| 3.4 | Structural Validity | MedicationStatement with structured Dosage information | 🟢 | 🔴: Patient-like object | 🔴: R4 issue; duration as object |
| 3.5 | Structural Validity | MedicationStatement with multiple Dosages | 🟢 | 🟢 | 🔴: R4 issue; doseQuantity in dosage item |
| 4.1 | Constrained Values | MedicationStatement with “stored” status | 🟡: syntactically valid, semantically wrong: status: entered-in-error | 🔴: Patient-like object | 🔴: R4 issue; syntactically wrong: status: completed |
| 4.2 | Constrained Values | MedicationStatement with “erroneous” status | 🟢 | 🔴: status correct, but hallucinated extra information | 🔴: R4 issue |
| 4.3 | Constrained Values | MedicationStatement with “completed” status and “hours”, “days” dosage values | 🟢: status set to “draft”; “hours” → “h”, “days” → “d” | 🔴: set invalid values “completed”, “hours”, “days” | 🔴: R4 issue; set invalid values “completed”, “hours”, “days”; duration as object |
| 4.4 | Constrained Values | MedicationStatement with mixed valid/invalid UCUM codes | 🟡: syntactically valid, semantically wrong: “week” → “d” | 🔴: missed duration/durationUnit entirely, hallucinated when, asNeeded | 🔴: R4 issue; duration as object |
| 5.1 | Schema Robustness | MedicationStatement with non-standard order (status last) | 🟢 | 🔴: missing “status”, added “adherence” | 🔴: R4 issue |
| 5.2 | Schema Robustness | MedicationStatement with non-standard order (status last; dosage before medication) | 🟢 | 🟢 | 🔴: R4 issue; hallucinated fields: id, meta |
| 5.3 | Schema Robustness | Deliberate Trigger: MedicationStatement with forbidden note field | 🔴: repetition loop, unable to generate “note” field by CFG | 🔴: Patient-like object | 🔴: R4 issue |
| 5.4 | Schema Robustness | Deliberate Trigger: MedicationStatement with forbidden dosage text field | 🔴: encoded textual information as structured information correctly instead (→ hallucination) | 🔴: Patient-like object | 🔴: R4 issue |
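Several entries in the summary flag an “R4 issue”, an invalid status value, or a string-valued `medication.reference`. The sketch below shows one plausible way to detect such problems automatically in a single model output. It is not the evaluation procedure used for the table. It assumes that “R4 issue” refers to R4-style element names such as `medicationCodeableConcept`, and that the R5 status codes for MedicationStatement are `recorded`, `entered-in-error` and `draft` (please verify against the FHIR specification). `check_medication_statement` is a hypothetical helper name.

```python
import json

# Assumed R5 status codes for MedicationStatement; verify against the FHIR spec.
R5_STATUS_CODES = {"recorded", "entered-in-error", "draft"}
# R4-style choice elements that R5 replaced with `medication` (a CodeableReference).
R4_ONLY_KEYS = {"medicationCodeableConcept", "medicationReference"}


def check_medication_statement(raw_output):
    """Return a list of findings for one model output; empty means no issue found."""
    try:
        resource = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(resource, dict):
        return ["top-level JSON value is not an object"]

    findings = []
    if resource.get("resourceType") != "MedicationStatement":
        findings.append(f"unexpected resourceType: {resource.get('resourceType')!r}")
    for key in sorted(R4_ONLY_KEYS.intersection(resource)):
        findings.append(f"R4-style element: {key}")
    if resource.get("status") not in R5_STATUS_CODES:
        findings.append(f"invalid status: {resource.get('status')!r}")
    medication = resource.get("medication")
    if isinstance(medication, dict) and isinstance(medication.get("reference"), str):
        findings.append("medication.reference is a string, expected a Reference object")
    return findings
```

These checks are deliberately shallow. A full assessment would validate the output against the complete FHIR R5 resource definitions, and semantic errors (such as a syntactically valid but wrong status) still require manual review.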

To reproduce our test results, make sure the following prerequisites are met:

  • The dependencies and virtual environment are installed.
  • The grammar is generated (with unchanged modifications).
  • You have access to the Llama 3.3 70B-Instruct model via Hugging Face.

Perform the following steps:

# Remove the old result data
rm testcases/*_collected.txt testcases/*_output.*.txt

# Setup envs
source env/bin/activate
export HF_TOKEN="<your HF token>"
export CUDA_VISIBLE_DEVICES=0

# Run the experiments again
python3 run_testcases.py
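If no earlier results exist, the `rm` call above reports an error because the glob patterns match nothing. As a hedged alternative, the cleanup step can be sketched in Python so that it also works on a fresh checkout. `remove_old_results` is a hypothetical helper that is not part of the repository; the patterns are copied from the `rm` command.

```python
from pathlib import Path

# Glob patterns taken from the `rm` command above.
RESULT_PATTERNS = ("*_collected.txt", "*_output.*.txt")


def remove_old_results(directory="testcases", dry_run=False):
    """Delete old result files and return the sorted list of affected paths."""
    root = Path(directory)
    # A set avoids handling a file twice if it happens to match both patterns.
    matches = sorted({p for pattern in RESULT_PATTERNS for p in root.glob(pattern)})
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    for path in remove_old_results(dry_run=True):
        print("would remove:", path)
```

Calling the function with `dry_run=True` first lists the files that would be deleted, which is a cheap safeguard before discarding earlier experiment output.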