
foem

FastOMOP Evaluation and Monitoring

⚠️ Under development

This project is a work in progress.

Overview

foem (FastOMOP Evaluation and Monitoring) is a Python toolkit for evaluating and monitoring OMOP (Observational Medical Outcomes Partnership) databases. It automates SQL test generation, template-based query construction, and result aggregation for clinical data analysis.

Features

  • Automated SQL test generation for OMOP databases
  • Template-based query construction for common clinical questions
  • Connection management for PostgreSQL and Databricks SQL Warehouse
  • JSON output of test results
  • Easily extensible with new tests and templates

Requirements

  • Python 3.8+
  • Database: PostgreSQL or Databricks SQL Warehouse
  • uv (recommended) or pip

Installation

Using uv (recommended)

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Sync dependencies and create virtual environment
uv sync

Using pip

# Create & activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -e .

Configuration

Create a .env file at the project root with your database configuration:

# Copy the sample configuration file
cp sample.env .env

# Edit .env with your database credentials

PostgreSQL Configuration (default)

DB_TYPE=postgresql
DB_CONNECTION_STRING=postgresql://USER:PASSWORD@HOST:PORT/DBNAME

Databricks Configuration

DB_TYPE=databricks
DATABRICKS_SERVER_HOSTNAME=your-workspace.cloud.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/abc123def456
DATABRICKS_ACCESS_TOKEN=your_personal_access_token
DATABRICKS_CATALOG=main
DATABRICKS_SCHEMA=default

Note: If DB_TYPE is not set, PostgreSQL is used by default. See sample.env for detailed configuration instructions.
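Before running the full suite, you can sanity-check these settings with a short script. This is a minimal sketch, assuming python-dotenv plus psycopg2 or databricks-sql-connector are installed; foem's own connection handling lives in src/foem/config.py and may differ.

# Minimal connectivity smoke test for the .env settings above
import os

from dotenv import load_dotenv

load_dotenv()  # read .env from the project root

db_type = os.getenv("DB_TYPE", "postgresql")  # PostgreSQL is the default

if db_type == "postgresql":
    import psycopg2

    conn = psycopg2.connect(os.environ["DB_CONNECTION_STRING"])
else:
    from databricks import sql

    conn = sql.connect(
        server_hostname=os.environ["DATABRICKS_SERVER_HOSTNAME"],
        http_path=os.environ["DATABRICKS_HTTP_PATH"],
        access_token=os.environ["DATABRICKS_ACCESS_TOKEN"],
    )

with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())  # (1,) means the credentials work
conn.close()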

Usage

Running Tests

Run the predefined SQL tests and write results to output/dataset.json:

# With uv
uv run python main.py

# Or if using pip/venv
python main.py
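
Once the run finishes, you can spot-check the output directly. A minimal sketch, assuming output/dataset.json holds a list of records with the id and execution_result fields referenced by the exporter below:

import json

with open("output/dataset.json") as f:
    records = json.load(f)

print(f"{len(records)} test results written")
for record in records[:3]:  # peek at the first few entries
    print(record["id"], record.get("execution_result"))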

Exporting to CSV

Convert the JSON results to CSV format using the csv_export script:

# Export using execution_result field (default)
uv run python script/csv_export.py

# Export using expected_output field
uv run python script/csv_export.py --type expected

# Specify custom input/output paths
uv run python script/csv_export.py --input path/to/input.json --output path/to/output.csv

Options:

  • --type — Export type: execution (uses execution_result) or expected (uses expected_output). Default: execution
  • --input — Path to input JSON file. Default: output/dataset.json
  • --output — Path to output CSV file. Default: output/dataset.csv

The exported CSV contains three columns: id, input, and expected_output. Single values are extracted directly, while multiple rows/columns are stored as JSON strings.
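
When consuming the CSV downstream, remember that multi-row results arrive as JSON strings while single values are plain text. A minimal sketch of reading both back, assuming the default output path and the three-column layout described above:

import csv
import json

with open("output/dataset.csv", newline="") as f:
    for row in csv.DictReader(f):
        value = row["expected_output"]
        try:
            # Multi-row/column results are stored as JSON strings
            value = json.loads(value)
        except json.JSONDecodeError:
            pass  # single values are stored as plain text
        print(row["id"], value)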

Customization

To extend or customize foem, add new SQL templates and test data under dataset/ and adjust the test logic in src/foem/sql_test.py; the layout below shows where each piece lives.
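
As a purely illustrative sketch that bypasses foem's internal API, a standalone custom check might verify that the standard OMOP person table is populated, reusing the DB_CONNECTION_STRING from .env and assuming a PostgreSQL setup with psycopg2:

import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()

# Hypothetical custom check: the OMOP CDM person table should not be empty
conn = psycopg2.connect(os.environ["DB_CONNECTION_STRING"])
with conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM person")
    (count,) = cur.fetchone()
    print(f"person rows: {count}")
    assert count > 0, "person table is empty"
conn.close()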

File Structure

foem/
├── src/foem/           # Core package modules
│   ├── __init__.py     # Package initialization
│   ├── config.py       # Database connection setup
│   └── sql_test.py     # SQL test logic and database interaction
├── script/             # Utility scripts
│   ├── csv_export.py           # Export results to CSV
│   ├── langfuse_load_data.py   # Load data to Langfuse
│   └── compare_results_llm.py  # LLM-based result comparison
├── dataset/            # SQL query templates and test data
├── output/             # Generated output files
├── main.py             # Entry point for running tests
├── sample.env          # Sample environment configuration
└── pyproject.toml      # Project configuration and dependencies

License

See LICENSE for details.
