Caution
This project is currently under active testing and development. It is a Python implementation of the OHDSI CIRCE-BE Java library. While we aim for 1:1 parity, this version is an Alpha release and should be used with caution in production environments.
A Python implementation of the OHDSI CIRCE-BE (Cohort Inclusion and Restriction Criteria Engine) for generating SQL queries from cohort definitions in the OMOP Common Data Model.
CIRCE Python provides a comprehensive toolkit for working with OMOP CDM cohort definitions:
- Cohort Definition Modeling: Create and validate cohort expressions using Pydantic models
- SQL Generation: Generate SQL queries from cohort definitions for OMOP CDM v5.x
- Concept Set Management: Handle concepts and concept sets from OMOP vocabularies
- Validation & Checking: Comprehensive validation with 40+ checker implementations
- Print-Friendly Output: Generate human-readable markdown descriptions of cohort definitions
- CLI Interface: Command-line tools for validation, SQL generation, and markdown rendering
Important
This package is currently in Alpha status and undergoing rigorous parity testing against the Java implementation.
- Version: 0.1.0 (Alpha)
- Tests: 3,400+ passing
- Coverage: 34% (Core logic focus)
- Python: 3.8+
- License: Apache 2.0
Note
This package is currently in private development. Install from source using Git.
# Clone the repository
git clone https://github.com/OHDSI/Circepy.git
cd Circepy
# Install in development mode with all dependencies
pip install -e ".[dev]"
# Verify installation
circe --help
See INSTALLATION.md for detailed installation instructions, troubleshooting, and setup options.
# Coming in a future release
pip install ohdsi-circepy
The easiest way to use CIRCE is through the command-line interface:
# Validate a cohort expression JSON file
circe validate cohort.json
# Generate SQL from a cohort expression
circe generate-sql cohort.json --output cohort.sql
# Render a cohort expression to markdown
circe render-markdown cohort.json --output cohort.md
# Process a cohort expression (validate, generate SQL, and render markdown)
circe process cohort.json --validate --sql --markdown
See the CLI Documentation section below for more details.
from circe import CohortExpression
from circe.cohortdefinition import PrimaryCriteria, ConditionOccurrence
from circe.cohortdefinition.core import ObservationFilter, ResultLimit
from circe.vocabulary import ConceptSet, ConceptSetExpression, ConceptSetItem, Concept
# Create a cohort expression
cohort = CohortExpression(
    title="Type 2 Diabetes Cohort",
    primary_criteria=PrimaryCriteria(
        criteria_list=[
            ConditionOccurrence(
                codeset_id=1,
                first=True
            )
        ],
        observation_window=ObservationFilter(prior_days=0, post_days=0),
        primary_limit=ResultLimit(type="All")
    ),
    concept_sets=[
        ConceptSet(
            id=1,
            name="Type 2 Diabetes",
            expression=ConceptSetExpression(
                items=[
                    ConceptSetItem(
                        concept=Concept(
                            concept_id=201826,
                            concept_name="Type 2 diabetes mellitus"
                        ),
                        include_descendants=True
                    )
                ]
            )
        )
    ]
)
# Generate SQL using the API
from circe.api import build_cohort_query
from circe.cohortdefinition import BuildExpressionQueryOptions
options = BuildExpressionQueryOptions()
options.cdm_schema = 'cdm'
options.vocabulary_schema = 'cdm'
options.cohort_id = 1
options.target_table = 'scratch.cohort'
sql = build_cohort_query(cohort, options)
print(sql)
This package provides a complete Python implementation of CIRCE-BE with:
- 3,400+ passing tests with focused coverage on core logic
- 18+ SQL builders for all OMOP CDM domains:
  - Condition Occurrence/Era
  - Drug Exposure/Era
  - Procedure Occurrence
  - Measurement, Observation
  - Visit Occurrence/Detail
  - Device Exposure, Specimen
  - Death, Location Region
  - Observation Period, Payer Plan Period
  - And more...
- Full cohort expression validation with comprehensive error checking
- Markdown rendering for human-readable cohort descriptions
- Complete CLI interface with 4 commands (validate, generate-sql, render-markdown, process)
- Java interoperability - supports both camelCase and snake_case field names for seamless Java CIRCE-BE compatibility
This project aims for 1:1 compatibility with Java CIRCE-BE:
- All Python classes replicate Java functionality exactly
- Field names support both Java (camelCase) and Python (snake_case) formats
- SQL generation produces identical results to Java implementation
- All changes are validated against Java schema
See JAVA_CLASS_MAPPINGS.md for complete class mappings.
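To picture the dual naming convention, here is a minimal standard-library sketch of mapping Java-style (camelCase or PascalCase) keys to Python-style snake_case. The helpers `to_snake` and `normalize_keys` are hypothetical and are not part of the circe API; the package accepts both forms itself, so this only illustrates the idea. Note that this simple rule does not handle runs of capitals such as `CONCEPT_ID`.

```python
import re


def to_snake(name: str) -> str:
    """Convert a camelCase or PascalCase identifier to snake_case."""
    # Insert an underscore before every capital letter that is not the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def normalize_keys(value):
    """Recursively rewrite dictionary keys to snake_case (hypothetical helper)."""
    if isinstance(value, dict):
        return {to_snake(key): normalize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


# Invented fragment in Java-style naming, for illustration only.
java_style = {"PrimaryCriteria": {"CriteriaList": [{"CodesetId": 1}]}}
python_style = normalize_keys(java_style)
print(python_style)
```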
circe/
├── cohortdefinition/ # Core cohort definition classes
│ ├── builders/ # SQL query builders (18+ builders)
│ ├── printfriendly/ # Human-readable markdown output
│ └── negativecontrols/ # Negative control generation
├── vocabulary/ # Concept and concept set management
├── check/ # Validation and checking framework
│ ├── checkers/ # 40+ specific checker implementations
│ ├── operations/ # Check operations
│ ├── utils/ # Check utilities
│ └── warnings/ # Warning classes
├── helper/ # Utility helper classes
├── api.py # High-level API functions
└── cli.py # Command-line interface
- Complete cohort definition data model with Pydantic validation
- 18+ SQL builders covering all OMOP CDM domains
- Comprehensive CLI interface (validate, generate-sql, render-markdown, process)
- Java interoperability with camelCase/snake_case field support
- Cohort expression validation with 40+ checker implementations
- Markdown rendering for print-friendly descriptions
- Full test suite (3,400+ tests)
- Type hints throughout with py.typed marker
- Concept set expression handling
- Window criteria and correlated criteria support
- Date adjustments and custom era strategies
- Observation period and demographic criteria
- Inclusion rules and censoring criteria
CIRCE provides a command-line interface for validating cohort expressions, generating SQL, and rendering markdown.
Validate a cohort expression JSON file against the CIRCE standard:
circe validate cohort.json
Options:
- --verbose, -v: Display all validation warnings, including INFO level
- --quiet, -q: Suppress non-error output
Exit codes:
- 0: Valid (no errors or warnings)
- 1: Invalid (errors found)
- 2: Valid but has warnings
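The exit codes make it straightforward to call the validator from a script. The following is a minimal sketch, assuming the `circe` command is on the PATH; `describe_exit_code` and `validate_file` are hypothetical helper names, not part of the package.

```python
import subprocess

# Exit-code meanings as documented for `circe validate`.
EXIT_MEANINGS = {
    0: "valid",
    1: "invalid",
    2: "valid with warnings",
}


def describe_exit_code(code: int) -> str:
    """Map a `circe validate` exit code to a short label (hypothetical helper)."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def validate_file(path: str) -> str:
    """Run `circe validate` on one file and describe the result."""
    result = subprocess.run(
        ["circe", "validate", path],
        capture_output=True,  # keep the CLI output out of the caller's console
        text=True,
    )
    return describe_exit_code(result.returncode)
```

Calling `validate_file("cohort.json")` would then return one of the three labels, which a CI job could use to decide whether warnings should fail the build.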
Generate SQL from a cohort expression:
# Output to stdout
circe generate-sql cohort.json
# Output to file
circe generate-sql cohort.json --output cohort.sql
# With custom schema names
circe generate-sql cohort.json --cdm-schema my_cdm --vocab-schema my_vocab --cohort-id 123
Options:
- --output, -o: Output SQL file path (default: stdout)
- --sql-options: JSON file with BuildExpressionQueryOptions
- --cdm-schema: CDM schema name (default: @cdm_database_schema)
- --vocab-schema: Vocabulary schema name (default: @vocabulary_database_schema)
- --cohort-id: Cohort ID for SQL generation
- --validate: Validate before generating SQL (default: True)
- --no-validate: Skip validation before generating SQL
- --verbose, -v: Verbose output
- --quiet, -q: Suppress non-error output
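When the schema options are left at their defaults, the generated SQL presumably keeps the `@cdm_database_schema` and `@vocabulary_database_schema` placeholders. Below is a minimal sketch of filling them in afterwards with plain string replacement. `fill_schemas` is a hypothetical helper and the SQL fragment is invented for illustration; it is not actual circe output.

```python
def fill_schemas(sql: str, cdm_schema: str, vocab_schema: str) -> str:
    """Replace the default schema placeholders with concrete schema names."""
    replacements = {
        "@cdm_database_schema": cdm_schema,
        "@vocabulary_database_schema": vocab_schema,
    }
    for placeholder, schema in replacements.items():
        sql = sql.replace(placeholder, schema)
    return sql


# Invented fragment, not actual circe output.
template = (
    "SELECT co.person_id FROM @cdm_database_schema.condition_occurrence co "
    "JOIN @vocabulary_database_schema.concept c "
    "ON c.concept_id = co.condition_concept_id"
)
rendered = fill_schemas(template, "my_cdm", "my_vocab")
print(rendered)
```

Passing `--cdm-schema` and `--vocab-schema` on the command line avoids this step entirely; the sketch is only relevant when the same SQL file is reused across databases.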
Render a cohort expression to human-readable markdown:
# Output to stdout
circe render-markdown cohort.json
# Output to file
circe render-markdown cohort.json --output cohort.md
Options:
- --output, -o: Output markdown file path (default: stdout)
- --validate: Validate before rendering markdown (default: True)
- --no-validate: Skip validation before rendering markdown
- --verbose, -v: Verbose output
- --quiet, -q: Suppress non-error output
Process a cohort expression with multiple operations:
# Validate, generate SQL, and render markdown
circe process cohort.json --validate --sql --markdown
# Generate SQL with custom output file
circe process cohort.json --sql output.sql
# Generate SQL and markdown with default file names
circe process cohort.json --sql --markdown
Options:
- --validate: Validate the cohort expression
- --sql [FILE]: Generate SQL (optionally specify output file, default: input file with .sql extension)
- --markdown [FILE]: Render markdown (optionally specify output file, default: input file with .md extension)
- --sql-options: JSON file with BuildExpressionQueryOptions
- --cdm-schema: CDM schema name (default: @cdm_database_schema)
- --vocab-schema: Vocabulary schema name (default: @vocabulary_database_schema)
- --cohort-id: Cohort ID for SQL generation
- --verbose, -v: Verbose output
- --quiet, -q: Suppress non-error output
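The default output names described for `--sql` and `--markdown` can be pictured with `pathlib`. This is a sketch of the naming rule only, assuming the outputs sit next to the input file; `default_outputs` is a hypothetical helper, not a circe function.

```python
from pathlib import Path
from typing import Tuple


def default_outputs(input_file: str) -> Tuple[Path, Path]:
    """Derive the .sql and .md paths that share the input file's base name."""
    source = Path(input_file)
    return source.with_suffix(".sql"), source.with_suffix(".md")


sql_path, md_path = default_outputs("my_cohort.json")
print(sql_path, md_path)
```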
# Validate a cohort expression
circe validate my_cohort.json
# Generate SQL with custom schema
circe generate-sql my_cohort.json --output my_cohort.sql \
--cdm-schema my_cdm_schema \
--vocab-schema my_vocab_schema \
--cohort-id 1
# Generate SQL and markdown in one command
circe process my_cohort.json --sql --markdown
# Validate, generate SQL, and render markdown
circe process my_cohort.json --validate --sql my_cohort.sql --markdown my_cohort.md
# Clone the repository
git clone https://github.com/OHDSI/Circepy.git
cd Circepy
# Install with development dependencies
pip install -e ".[dev]"
# Verify installation
pytest --version
circe --help
pytest
All 3,400+ tests should pass.
black circe/
isort circe/
mypy circe/
This implementation is designed to be compatible with the Java version of OHDSI CIRCE-BE. The Python package:
- Accepts JSON cohort definitions from OHDSI Atlas and other tools
- Generates SQL identical to the Java implementation
- Supports all OMOP CDM v5.x versions
- Maintains field name compatibility (camelCase and snake_case)
If you encounter import errors, ensure the package is properly installed:
pip install --upgrade ohdsi-circepy
If validation or SQL generation fails:
- Verify your cohort expression JSON is valid using circe validate
- Check that all concept IDs reference valid OMOP concepts
- Ensure schema names are correctly specified
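Checking concept IDs starts with knowing which ones a cohort definition uses. The sketch below walks a cohort JSON document and collects every concept ID it finds. The key spellings in `CONCEPT_ID_KEYS` are assumptions about how the field may appear, `collect_concept_ids` is a hypothetical helper, and the sample document is invented for illustration.

```python
import json
from typing import Any, Set

# Assumed spellings of the concept ID field; adjust to match your files.
CONCEPT_ID_KEYS = {"CONCEPT_ID", "conceptId", "concept_id"}


def collect_concept_ids(node: Any) -> Set[int]:
    """Recursively gather integer concept IDs from parsed JSON."""
    found: Set[int] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in CONCEPT_ID_KEYS and isinstance(value, int):
                found.add(value)
            else:
                found |= collect_concept_ids(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_concept_ids(item)
    return found


# Invented sample, shaped loosely like an exported cohort definition.
sample = json.loads("""
{"ConceptSets": [{"id": 1, "name": "Type 2 Diabetes",
  "expression": {"items": [
    {"concept": {"CONCEPT_ID": 201826}, "includeDescendants": true}
  ]}}]}
""")
print(sorted(collect_concept_ids(sample)))
```

The resulting IDs can then be looked up in the `concept` table of your vocabulary schema to confirm they exist.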
For large cohort definitions with many criteria:
- SQL generation typically completes in < 1 second
- Validation runs in < 500ms for most cohorts
- Memory usage scales with the number of criteria (typically < 100MB)
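To check these figures against your own cohort definitions, a small timing wrapper is enough. This is a minimal sketch; `timed` is a hypothetical helper, and the workload shown is a stand-in that you would replace with a call such as `lambda: build_cohort_query(cohort, options)`.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(func: Callable[[], T]) -> Tuple[T, float]:
    """Call func once and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


# Stand-in workload; swap in the SQL generation call you want to measure.
result, seconds = timed(lambda: sum(range(1000)))
print(f"finished in {seconds * 1000:.2f} ms")
```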
Q: Is this compatible with OHDSI Atlas? A: Yes, this package can process cohort definition JSON files exported from Atlas.
Q: Can I use this with CDM v5.3? A: Yes, the package supports all OMOP CDM v5.x versions.
Q: How do I convert camelCase JSON to Python? A: The package automatically handles both camelCase and snake_case field names.
Q: Does this replace the Java CIRCE-BE? A: No, this is a complementary Python implementation. Both produce identical SQL output.
Contributions are welcome! Please see our Contributing Guidelines for details.
Key areas for contribution:
- Additional test coverage
- Performance optimizations
- Documentation improvements
- Bug fixes and issue reports
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
This project is based on the Java CIRCE-BE implementation by the OHDSI community. We thank all contributors to the original Java implementation.
Special thanks to:
- The OHDSI community for their continued support
- Contributors to the Java CIRCE-BE project
- The Pydantic team for their excellent validation library
- Repository: https://github.com/OHDSI/circepy
- Issues: https://github.com/OHDSI/circepy/issues
- Installation Guide: INSTALLATION.md
- PyPI: https://pypi.org/project/circepy/ (coming soon)
- Documentation: https://ohdsi-circepy.readthedocs.io/ (coming soon)
- OHDSI CIRCE-BE (Java) - Original Java implementation
- OHDSI Common Data Model - OMOP CDM specification
- OHDSI Atlas - Web-based cohort definition tool
- OHDSI WebAPI - RESTful API for OHDSI tools