Add noisy circuit dataset and documentation (Issue #12) #25
- Add Stim circuits for rotated surface code (d=3) memory experiments with circuit-level depolarizing noise (p=0.01) at 3, 5, 7 rounds
- Add generation script (scripts/generate_noisy_circuits.py)
- Add comprehensive README with BP decoding tutorial and examples
- Add visualization images (qubit layout, parity check matrix, syndrome stats)
- Update .gitignore to exclude .venv/

- Convert scripts/ to src/bpdecoderplus/ package following Python best practices
- Add pyproject.toml with uv/hatchling build system and dependencies
- Add comprehensive test suite (32 tests) for circuit.py and cli.py
- Update .gitignore with Python-specific patterns
- Update README to use new CLI entry point via uv
Addresses PR feedback from @GiggleLiu.

- Add Makefile with targets for install, setup, generate-dataset, test, and clean
- Update pyproject.toml with uv dev-dependencies configuration
- Addresses issue #12 requirements for automation and uv package management
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add test.yml workflow to run tests on push and PR
- Test on Python 3.10, 3.11, and 3.12
- Use uv for dependency management in CI
- Addresses PR #14 review comment for CI/CD setup
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Update CI workflow to generate coverage reports with pytest-cov
- Upload coverage to Codecov for tracking
- Add test status and coverage badges to README
- Add `make test-cov` target for local coverage reports
- Update .gitignore to exclude coverage files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Set ignore-nothing-to-cache to true to allow CI to proceed when uv.lock is not present in the repository. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove enable-cache to avoid lock file requirement. Caching can be re-enabled later with a proper lock file. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Delete all PNG files (layout, parity check matrix, syndrome stats)
- Update README to remove image references
- Keep focus on circuit files and code examples
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add syndrome.py module for sampling and saving syndromes
- Integrate syndrome generation into CLI with --generate-syndromes flag
- Add comprehensive test suite for syndrome operations
- Add make generate-syndromes target for easy database creation
- Support npz format with metadata for efficient storage
Features:
- Sample detection events from circuits
- Save/load syndrome databases with metadata
- Generate databases directly from circuit files
- CLI integration for automated workflow
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
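For readers new to the term, the following is a minimal sketch of what "sampling detection events" means. It is not the code in syndrome.py, which samples from Stim circuits. Here a toy list of independent error mechanisms stands in for the circuit, and the names `sample_detection_events` and `toy_mechanisms` are hypothetical. Each mechanism fires with its own probability and flips the detectors it touches, so a detector's value is the parity of the mechanisms that fired on it.

```python
import random


def sample_detection_events(mechanisms, num_detectors, shots, seed=0):
    """Sample detection events from independent error mechanisms.

    mechanisms: list of (probability, detector_indices) pairs. This is a toy
    stand-in for a detector error model, not the project's data structure.
    Returns a list of `shots` rows, each a list of 0/1 detector values.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    samples = []
    for _ in range(shots):
        row = [0] * num_detectors
        for prob, detectors in mechanisms:
            if rng.random() < prob:  # this mechanism fires in this shot
                for d in detectors:
                    row[d] ^= 1  # XOR: two flips on one detector cancel
        samples.append(row)
    return samples


# Hypothetical three-detector example with three error mechanisms.
toy_mechanisms = [(0.01, [0]), (0.01, [0, 1]), (0.01, [1, 2])]
events = sample_detection_events(toy_mechanisms, num_detectors=3, shots=1000)
```

Each row of `events` plays the role of one syndrome in the database; the real module stores such rows as arrays in .npz files.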
- Add dem.py module for DEM extraction and manipulation
- Extract DEM from circuits with decomposition support
- Save/load DEMs in stim native format
- Convert DEM to JSON for analysis
- Build parity check matrix H for BP decoding
- Integrate DEM generation into CLI with --generate-dem flag
- Add comprehensive test suite for DEM operations
- Add make generate-dem target
Features:
- Extract detector error models from circuits
- Save in .dem format (stim native)
- Export to JSON with structured error information
- Build parity check matrix for BP decoder
- CLI integration for automated workflow
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
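As an illustration of how a parity check matrix relates to a DEM, here is a minimal sketch that reads flat `error(p) D.. L..` lines and builds H with one row per detector and one column per error mechanism. It is not the implementation in dem.py. The helper names `parse_flat_dem` and `build_parity_check_matrix` and the toy DEM text are hypothetical, and the parser deliberately ignores `repeat` blocks, `shift_detectors`, and `^` separators, which real DEM files may contain.

```python
import re


def parse_flat_dem(dem_text):
    """Parse flat `error(p) D.. L..` lines into (prob, detectors, observables)."""
    errors = []
    for line in dem_text.splitlines():
        match = re.match(r"error\(([^)]+)\)\s*(.*)", line.strip())
        if not match:
            continue  # skip blanks, comments, and non-error instructions
        prob = float(match.group(1))
        targets = match.group(2).split()
        detectors = [int(t[1:]) for t in targets if t.startswith("D")]
        observables = [int(t[1:]) for t in targets if t.startswith("L")]
        errors.append((prob, detectors, observables))
    return errors


def build_parity_check_matrix(errors, num_detectors):
    """Return H as nested lists: H[d][j] == 1 iff error j flips detector d."""
    H = [[0] * len(errors) for _ in range(num_detectors)]
    for col, (_, detectors, _) in enumerate(errors):
        for d in detectors:
            H[d][col] = 1
    return H


# Hypothetical toy DEM with three error mechanisms on three detectors.
toy_dem = """
error(0.01) D0
error(0.01) D0 D1
error(0.02) D1 D2 L0
"""
errors = parse_flat_dem(toy_dem)
H = build_parity_check_matrix(errors, num_detectors=3)
priors = [p for p, _, _ in errors]  # per-column prior error probabilities
```

A BP decoder would typically take `H` together with `priors`: the columns define the factor graph and the priors initialize the messages.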
Stim returns boolean arrays by default, not uint8. Update test to accept both bool and uint8 dtypes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add SYNDROME_DATASET.md with complete API documentation
- Add validate_dataset.py for dataset generation and validation
- Document data format, API interface, and validation checks
- Include usage examples and statistics
- Provide evidence of dataset validity
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add minimal_example.py with complete end-to-end demonstration
- Add PIPELINE_ILLUSTRATION.md with visual pipeline diagrams
- Include detailed explanations of each step
- Show data flow and file formats
- Provide conceptual understanding of the pipeline
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rewrote PIPELINE_ILLUSTRATION.md as a practical getting-started guide focused on data generation workflow. Added generate_demo_dataset.py to provide a working example that generates, validates, and saves a small syndrome dataset. These changes make it easier for new users to understand and use the package. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit reorganizes the dataset structure and ensures proper file placement for circuits, DEMs, and syndromes.
Changes:
- Reorganize datasets/ into circuits/, dems/, and syndromes/ subdirectories
- Update CLI default output to datasets/circuits/
- Update DEM generation to save files in datasets/dems/
- Update syndrome generation to save files in datasets/syndromes/
- Fix test to reflect new default output path
- Add demo DEM and syndrome files for all three circuit variants
Resolves #4: Detector error model generation now saves .dem files
Resolves #5: Syndrome database generation now saves .npz files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit adds support for generating UAI (Uncertainty in Artificial Intelligence) format files from detector error models, enabling probabilistic inference with tools like TensorInference.jl.
Changes:
- Add dem_to_uai() to convert DEM to UAI format
- Add save_uai() to save UAI files
- Add generate_uai_from_circuit() for CLI integration
- Add --generate-uai flag to CLI
- Generate UAI files for all demo circuits
- Add comprehensive test coverage for UAI functionality
The UAI format represents the DEM as a Markov network where:
- Each detector is a binary variable
- Each error mechanism is a factor/clique
- Factor tables encode error probabilities
Addresses #4
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
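To make the Markov-network description concrete, here is a minimal sketch that writes a toy model in the UAI `MARKOV` layout: a preamble with variable count, cardinalities, and factor scopes, followed by one table per factor. It is not the PR's dem_to_uai(). The function name `toy_dem_to_uai` is hypothetical, and the table encoding is an assumption chosen for illustration: weight p on assignments with odd parity and 1 - p on assignments with even parity. The actual implementation may encode probabilities differently.

```python
from itertools import product


def toy_dem_to_uai(mechanisms, num_detectors):
    """Render (probability, detector_indices) pairs as UAI MARKOV text.

    Assumed encoding (illustrative only): a factor assigns weight p to
    odd-parity assignments of its detectors and 1 - p to even-parity ones.
    """
    lines = [
        "MARKOV",
        str(num_detectors),                # number of variables
        " ".join(["2"] * num_detectors),   # every detector is binary
        str(len(mechanisms)),              # number of factors
    ]
    # Preamble: one scope line per factor, "<size> <var> <var> ...".
    for _, detectors in mechanisms:
        lines.append(" ".join([str(len(detectors))] + [str(d) for d in detectors]))
    # Function tables: product() varies the last variable fastest, which is
    # the entry order the UAI format expects.
    for prob, detectors in mechanisms:
        table = [
            prob if sum(assignment) % 2 == 1 else 1.0 - prob
            for assignment in product((0, 1), repeat=len(detectors))
        ]
        lines.append("")
        lines.append(str(len(table)))
        lines.append(" ".join(f"{value:.6g}" for value in table))
    return "\n".join(lines) + "\n"


# Hypothetical two-detector model with two error mechanisms.
uai_text = toy_dem_to_uai([(0.01, [0]), (0.02, [0, 1])], num_detectors=2)
```

A file written this way follows the same layout that UAI-consuming inference tools read, though the table values here follow the assumed encoding above and not necessarily the project's.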
This commit merges SYNDROME_DATASET.md and PIPELINE_ILLUSTRATION.md into a single comprehensive GETTING_STARTED.md guide in the examples folder.
Changes:
- Create examples/GETTING_STARTED.md with unified content
- Add UAI format introduction for beginners
- Update all file paths to reflect new dataset organization
- Remove redundant datasets/SYNDROME_DATASET.md
- Remove redundant examples/PIPELINE_ILLUSTRATION.md
The new guide provides:
- Quick start instructions
- Step-by-step pipeline explanation
- Detailed format documentation (.stim, .dem, .uai, .npz)
- Code examples for all use cases
- Troubleshooting and best practices
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit reorganizes the dataset structure to keep UAI files separate from DEM files for better organization.
Changes:
- Move .uai files from datasets/dems/ to datasets/uais/
- Update generate_uai_from_circuit() to save in datasets/uais/
- Update documentation to reflect new folder structure
- Update datasets/README.md with dataset organization section
- Update examples/GETTING_STARTED.md with correct paths
Dataset structure:
- datasets/circuits/ - Circuit files (.stim)
- datasets/dems/ - Detector error models (.dem)
- datasets/uais/ - UAI format files (.uai)
- datasets/syndromes/ - Syndrome databases (.npz)
All tests passing (62/62)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Looks great! Please resolve the conflicts in this PR.
GiggleLiu left a comment
Good try, impressive speed.
Instead of using Markdown files, please set up a documentation site and deploy it to GitHub Pages.
- Move generate_demo_dataset.py to examples/
- Move validate_dataset.py to examples/
- Update GETTING_STARTED.md with clarifications
This keeps the root directory clean and groups all example/demo code in a dedicated folder for better project organization.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… syndrome dataset file
Resolved conflicts and integrated changes from main:
- Updated codecov configuration with token and files parameter
- Added torch dependency to pyproject.toml
- Updated default output path to datasets/noisy_circuits
- Preserved DEM and syndrome generation features from this branch
- Removed .claude/settings.local.json from version control
- Added .claude/settings.local.json to .gitignore
- Added WIP note to README
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Move script files from examples/ to scripts/:
- generate_demo_dataset.py
- validate_dataset.py
This separates demonstration scripts from API usage examples.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add mkdocs.yml configuration with Material theme
- Create docs/index.md as main documentation page
- Move GETTING_STARTED.md to docs/getting_started.md
- Add GitHub Actions workflow for automatic deployment
- Add docs dependencies to pyproject.toml
- Add Makefile targets for building and serving docs
Documentation will be available at GitHub Pages after merge to main.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
All review comments have been addressed:
Ready for final review!
Summary
This PR implements comprehensive noisy circuit dataset generation with DEM extraction, UAI format conversion, and syndrome sampling capabilities. All documentation has been consolidated into a unified getting started guide, and the dataset is organized into well-structured subdirectories.
Issues Addressed
Issue #12: Noisy Circuit Dataset ✅
- `datasets/circuits/` with three surface code circuits
Issue #4: Detector Error Model Generation ✅
- `extract_dem()` to generate DEMs from circuits
- `save_dem()` to write .dem files in stim format
- `generate_dem_from_circuit()` for CLI integration
- `--generate-dem` flag to CLI
- `datasets/dems/` directory
- `dem_to_uai()` to convert DEM to UAI format
- `save_uai()` and `generate_uai_from_circuit()`
- `--generate-uai` flag to CLI
- `datasets/uais/` directory
- `tests/test_dem.py` (19 tests)
Issue #5: Syndrome Database Generation ✅
- `sample_syndromes()` to generate detection events
- `save_syndrome_database()` to write .npz files
- `generate_syndrome_database_from_circuit()` for CLI integration
- `--generate-syndromes` flag to CLI
- `datasets/syndromes/` directory
- `tests/test_syndrome.py`
Dataset Organization
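Well-structured dataset with separate subdirectories by file type:
- `datasets/circuits/` - Circuit files (.stim)
- `datasets/dems/` - Detector error models (.dem)
- `datasets/uais/` - UAI format files (.uai)
- `datasets/syndromes/` - Syndrome databases (.npz)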
Documentation
New unified guide:
- `examples/GETTING_STARTED.md` - Comprehensive guide covering:
Other documentation:
- `datasets/README.md` - Dataset overview with organization structure
- `examples/minimal_example.py` - Full pipeline demonstration
Removed redundant files:
- `datasets/SYNDROME_DATASET.md` (merged into unified guide)
- `examples/PIPELINE_ILLUSTRATION.md` (merged into unified guide)
Testing
✅ All 62 tests passing
✅ Circuit generation validated
✅ DEM extraction tested (.dem and .uai formats)
✅ Syndrome sampling tested
✅ CLI integration verified
✅ CI passing on Python 3.10, 3.11, 3.12
Usage Examples
UAI Format
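The UAI format represents the DEM as a Markov network:
- Each detector is a binary variable
- Each error mechanism is a factor/clique
- Factor tables encode error probabilities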
This enables:
Closes
Closes #12
Closes #4
Closes #5
🤖 Generated with Claude Code