FastOMOP Evaluation and Monitoring
This project is a work in progress.
foem (FastOMOP Evaluation and Monitoring) is a Python toolkit for evaluating and monitoring OMOP (Observational Medical Outcomes Partnership) databases. It automates SQL test generation, template-based query construction, and result aggregation for clinical data analysis.
Features:

- Automated SQL test generation for OMOP databases
- Template-based query construction for common clinical questions
- Connection management for PostgreSQL and Databricks SQL Warehouse
- JSON output of test results
- Easily extensible with new tests and templates
Requirements:

- Python 3.8+
- Database: PostgreSQL or Databricks SQL Warehouse
- uv (recommended) or pip
Installation with uv:

```bash
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Sync dependencies and create a virtual environment
uv sync
```

Installation with pip:

```bash
# Create & activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -e .
```

Create a .env file at the project root with your database configuration:
```bash
# Copy the sample configuration file
cp sample.env .env

# Edit .env with your database credentials
```

For PostgreSQL:

```
DB_TYPE=postgresql
DB_CONNECTION_STRING=postgresql://USER:PASSWORD@HOST:PORT/DBNAME
```

For Databricks:

```
DB_TYPE=databricks
DATABRICKS_SERVER_HOSTNAME=your-workspace.cloud.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/abc123def456
DATABRICKS_ACCESS_TOKEN=your_personal_access_token
DATABRICKS_CATALOG=main
DATABRICKS_SCHEMA=default
```

Note: If DB_TYPE is not set, PostgreSQL is used by default. See sample.env for detailed configuration instructions.
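As a minimal sketch of how these variables can be consumed (assuming python-dotenv, psycopg2, and a recent databricks-sql-connector; foem's own src/foem/config.py may be organized differently):

```python
# sketch: open a connection based on DB_TYPE from .env
# (illustrative only; not the actual implementation in src/foem/config.py)
import os

from dotenv import load_dotenv

load_dotenv()  # read .env from the project root

db_type = os.getenv("DB_TYPE", "postgresql")  # PostgreSQL is the documented default

if db_type == "postgresql":
    import psycopg2

    conn = psycopg2.connect(os.environ["DB_CONNECTION_STRING"])
else:  # databricks
    from databricks import sql

    conn = sql.connect(
        server_hostname=os.environ["DATABRICKS_SERVER_HOSTNAME"],
        http_path=os.environ["DATABRICKS_HTTP_PATH"],
        access_token=os.environ["DATABRICKS_ACCESS_TOKEN"],
        catalog=os.getenv("DATABRICKS_CATALOG"),
        schema=os.getenv("DATABRICKS_SCHEMA"),
    )
```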
Run the predefined SQL tests and write results to output/dataset.json:

```bash
# With uv
uv run python main.py

# Or if using pip/venv
python main.py
```
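The record layout of dataset.json is not documented here, but the field names id, input, execution_result, and expected_output appear in the export options below; a quick way to peek at the file, assuming it holds a list of such records:

```python
# sketch: inspect the generated results file
# (assumes output/dataset.json is a list of records using the field names
#  referenced by the csv_export options below; the actual schema may differ)
import json
from pathlib import Path

records = json.loads(Path("output/dataset.json").read_text())

for record in records[:3]:  # peek at the first few entries
    print(record.get("id"), sorted(record.keys()))
```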
Convert the JSON results to CSV format using the csv_export script:

```bash
# Export using execution_result field (default)
uv run python script/csv_export.py

# Export using expected_output field
uv run python script/csv_export.py --type expected

# Specify custom input/output paths
uv run python script/csv_export.py --input path/to/input.json --output path/to/output.csv
```

Options:

- `--type`: export type, `execution` (uses execution_result) or `expected` (uses expected_output). Default: `execution`
- `--input`: path to the input JSON file. Default: `output/dataset.json`
- `--output`: path to the output CSV file. Default: `output/dataset.csv`
The exported CSV contains three columns: id, input, and expected_output. Single values are extracted directly, while multiple rows/columns are stored as JSON strings.
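As a rough illustration of that rule (not the actual logic in script/csv_export.py, which may differ), a single-row, single-column result becomes the bare value, and anything larger becomes a JSON string:

```python
# sketch of the single-value vs JSON-string rule described above
# (illustrative only; script/csv_export.py may implement it differently)
import json


def flatten(rows):
    """Collapse a query result (list of rows) into one CSV cell."""
    if len(rows) == 1 and len(rows[0]) == 1:
        return rows[0][0]       # single value: store it directly
    return json.dumps(rows)     # multiple rows/columns: store as a JSON string


print(flatten([[42]]))                 # prints: 42
print(flatten([[1, "a"], [2, "b"]]))   # prints: [[1, "a"], [2, "b"]]
```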
To extend or customize foem:

- Add or modify tests in src/foem/sql_test.py (see the sketch after this list)
- Add or modify query templates in the dataset/ directory
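Purely as a hypothetical illustration (the real structure of src/foem/sql_test.py is not shown here and may differ), a new test could pair an id, a natural-language input, the SQL to execute, and an expected output, mirroring the fields used in the exported dataset:

```python
# hypothetical shape of a new test entry; "id", "input", and "expected_output"
# mirror the exported dataset fields, while "sql" is an assumed key
NEW_TEST = {
    "id": "person_count",
    "input": "How many persons are recorded in the database?",
    "sql": "SELECT COUNT(*) FROM person;",  # OMOP CDM person table
    "expected_output": None,  # fill in after validating against a reference run
}
```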
foem/
├── src/foem/ # Core package modules
│ ├── __init__.py # Package initialization
│ ├── config.py # Database connection setup
│ └── sql_test.py # SQL test logic and database interaction
├── script/ # Utility scripts
│ ├── csv_export.py # Export results to CSV
│ ├── langfuse_load_data.py # Load data to Langfuse
│ └── compare_results_llm.py # LLM-based result comparison
├── dataset/ # SQL query templates and test data
├── output/ # Generated output files
├── main.py # Entry point for running tests
├── sample.env # Sample environment configuration
└── pyproject.toml # Project configuration and dependencies
See LICENSE for details.