A production-ready CLI framework for privacy-preserving machine learning benchmarking. Converts notebook-based experiments into a modular, reproducible command-line interface.
# Install UV (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install PrivacyBench in development mode
pip install -e .
# Verify installation
privacybench --help

# List available experiments, datasets, models
privacybench list all
# Run CNN baseline experiment
privacybench run --experiment cnn_baseline --dataset alzheimer
# Run federated learning experiment
privacybench run --experiment fl_cnn --dataset alzheimer
# Validate configuration file
privacybench validate --config configs/experiments/baselines/cnn_alzheimer.yaml
# Dry run (validate without execution)
privacybench run --experiment vit_baseline --dataset skin_lesions --dry-run

`privacybench run`: Execute privacy-preserving ML experiments.
Options:
- `--experiment`: Experiment type (cnn_baseline, vit_baseline, fl_cnn, dp_cnn, etc.)
- `--dataset`: Dataset to use (alzheimer, skin_lesions)
- `--config`: Custom YAML configuration file
- `--output`: Output directory for results (default: ./results)
- `--dry-run`: Validate configuration without execution
- `--seed`: Random seed for reproducibility (default: 42)
Examples:
privacybench run --experiment cnn_baseline --dataset alzheimer
privacybench run --experiment fl_dp_cnn --dataset skin_lesions --output ./my_results
privacybench run --config experiments/custom_config.yaml

`privacybench list`: Display available components.
Options:
- `experiments`: List available experiments
- `datasets`: List available datasets
- `models`: List available model architectures
- `privacy`: List available privacy techniques
- `all`: List everything (default)
Examples:
privacybench list experiments
privacybench list datasets
privacybench list all

`privacybench validate`: Validate experiment configurations.
Options:
- `--config`: YAML configuration file to validate (required)
- `--verbose`: Show detailed validation information
Examples:
privacybench validate --config configs/experiments/baselines/cnn_alzheimer.yaml
privacybench validate --config my_experiment.yaml --verbose

| Experiment | Model | Privacy | Dataset Support | Expected Accuracy |
|---|---|---|---|---|
| cnn_baseline | ResNet18 | None | alzheimer, skin_lesions | 97.9% (alzheimer) |
| vit_baseline | ViT-Base/16 | None | alzheimer, skin_lesions | 99.0% (alzheimer) |
| fl_cnn | ResNet18 | Federated Learning | alzheimer, skin_lesions | ~98.0% (alzheimer) |
| fl_vit | ViT-Base/16 | Federated Learning | alzheimer, skin_lesions | TBD |
| dp_cnn | ResNet18 | Differential Privacy | alzheimer, skin_lesions | TBD |
| dp_vit | ViT-Base/16 | Differential Privacy | alzheimer, skin_lesions | TBD |
| fl_dp_cnn | ResNet18 | FL + DP | alzheimer, skin_lesions | TBD |
| smpc_cnn | ResNet18 | SMPC | alzheimer, skin_lesions | TBD |
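The matrix above can also be captured programmatically, e.g. for scripting experiment sweeps. A minimal sketch — the `EXPERIMENTS` table below is transcribed from the README table and is illustrative, not a PrivacyBench API:

```python
# Experiment matrix transcribed from the table above; keys are the values
# accepted by `privacybench run --experiment`. Illustrative only.
EXPERIMENTS = {
    "cnn_baseline": ("ResNet18", None),
    "vit_baseline": ("ViT-Base/16", None),
    "fl_cnn": ("ResNet18", "federated_learning"),
    "fl_vit": ("ViT-Base/16", "federated_learning"),
    "dp_cnn": ("ResNet18", "differential_privacy"),
    "dp_vit": ("ViT-Base/16", "differential_privacy"),
    "fl_dp_cnn": ("ResNet18", "fl+dp"),
    "smpc_cnn": ("ResNet18", "smpc"),
}

def experiments_for(privacy):
    """Names of experiments using the given privacy technique (None = baseline)."""
    return [name for name, (_, p) in EXPERIMENTS.items() if p == privacy]
```

For example, `experiments_for(None)` yields the two baseline experiments, which could then be passed to `privacybench run` in a loop.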
Alzheimer:
- Classes: 4 (NonDemented, VeryMildDemented, MildDemented, ModerateDemented)
- Size: ~6,400 images
- Type: Medical imaging
- Usage: `--dataset alzheimer`

Skin Lesions:
- Classes: 8 skin lesion types
- Size: ~10,000 images
- Type: Medical imaging
- Usage: `--dataset skin_lesions`
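The two datasets can be summarised in code when driving parameterised sweeps. The figures below are transcribed from the descriptions above; the catalogue itself is a hypothetical helper, not a PrivacyBench API:

```python
# Dataset catalogue transcribed from the descriptions above; counts are
# approximate README figures, not an actual PrivacyBench API.
DATASETS = {
    "alzheimer": {"num_classes": 4, "approx_images": 6_400},
    "skin_lesions": {"num_classes": 8, "approx_images": 10_000},
}

# Rough average images per class (real class balance may differ),
# useful when sizing per-client splits for federated experiments.
for name, info in DATASETS.items():
    print(name, info["approx_images"] // info["num_classes"])
```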
No Privacy (Baseline):
- Standard training without privacy constraints
- Fastest training, best accuracy
- Use: `cnn_baseline`, `vit_baseline`

Federated Learning (FL):
- Distributed training without sharing raw data
- Moderate privacy, good utility
- Use: `fl_cnn`, `fl_vit`

Differential Privacy (DP):
- Mathematical privacy guarantees via noise injection
- Strong privacy, reduced utility
- Use: `dp_cnn`, `dp_vit`

Secure Multi-Party Computation (SMPC):
- Cryptographic secure aggregation
- Very strong privacy, significant overhead
- Use: `smpc_cnn`, `smpc_vit`

Hybrid Approaches:
- FL + DP: Combined federated training with differential privacy
- FL + SMPC: Federated training with cryptographic security
- Use: `fl_dp_cnn`, `fl_smpc_cnn`
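The naming scheme above is regular: privacy techniques prefix the model family, and baselines use a suffix. A hypothetical helper that derives the `--experiment` name from a chosen set of techniques (illustrative, not part of the CLI):

```python
# Hypothetical helper: derive the --experiment name from a model family
# ("cnn" or "vit") and a set of privacy techniques. Not a PrivacyBench API.
def experiment_name(model, techniques):
    order = ["fl", "dp", "smpc"]  # prefix order used by the naming scheme
    if not techniques:
        return f"{model}_baseline"  # e.g. cnn_baseline
    prefix = "_".join(sorted(techniques, key=order.index))
    return f"{prefix}_{model}"      # e.g. fl_dp_cnn
```

For instance, `experiment_name("cnn", ["dp", "fl"])` normalises the order and returns `fl_dp_cnn`, matching the table of experiments.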
privacybench/
├── cli/        # CLI interface and commands
├── core/       # Component registry and wrappers (Phase 2)
├── execution/  # Experiment execution engine (Phase 3)
├── output/     # Results collection and export (Phase 4)
├── utils/      # Utilities and helpers
├── legacy/     # Preserved existing code
├── configs/    # YAML configuration files
├── tests/      # Test suite (Phase 5)
└── examples/   # Usage examples
metadata:
  name: "cnn_baseline_alzheimer"
  experiment_type: "cnn_baseline"
  dataset: "alzheimer"

dataset:
  name: "alzheimer"
  config:
    augmentation: true
    test_split: 0.08
    validation_split: 0.1

model:
  architecture: "cnn"
  config:
    pretrained: true
    num_classes: 4
    dropout: 0.1

privacy:
  techniques: []  # No privacy for baseline

training:
  epochs: 50
  batch_size: 32
  learning_rate: 0.0002
  optimizer: "adam"
  tolerance: 7

output:
  directory: "./results"
  save_model: true
  export_formats: ["json", "csv"]

Federated Learning:
privacy:
  techniques:
    - name: "federated_learning"
      config:
        num_clients: 3
        num_rounds: 5
        strategy: "FedAvg"

Differential Privacy:
privacy:
  techniques:
    - name: "differential_privacy"
      config:
        epsilon: 1.0
        delta: 1e-5
        noise_multiplier: 1.0
        max_grad_norm: 1.0

Combined FL + DP:
privacy:
  techniques:
    - name: "federated_learning"
      config:
        num_clients: 3
        num_rounds: 5
    - name: "differential_privacy"
      config:
        epsilon: 1.0
        delta: 1e-5

EXPERIMENT COMPLETED SUCCESSFULLY
========================================
Experiment: cnn_baseline_alzheimer
Dataset: alzheimer
Model: cnn
Privacy: None (Baseline)
Duration: 588.0 seconds

PERFORMANCE METRICS:
• Accuracy: 97.90%
• F1 Score: 0.9785
• ROC AUC: 0.9958

RESOURCE CONSUMPTION:
• Training Time: 588.0 seconds
• Peak GPU Memory: 1.20 GB
• Energy Consumed: 0.026000 kWh
• CO2 Emissions: 0.011830 kg

Results saved to: ./results/exp_20250131_123456
results/exp_20250131_123456/
├── results.json   # Complete results
├── metrics.csv    # Key metrics table
├── summary.md     # Human-readable summary
├── config.yaml    # Experiment configuration
└── emissions.csv  # CodeCarbon energy data
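Because every run exports machine-readable files, downstream analysis can be scripted. A sketch of post-processing the metrics, assuming a hypothetical `results.json` shape that mirrors the console summary above (the real schema may differ):

```python
import json

# Assumed, hypothetical shape of results.json, mirroring the console
# summary above; the real schema produced by PrivacyBench may differ.
results = json.loads("""{
  "experiment": "cnn_baseline_alzheimer",
  "metrics": {"accuracy": 0.979, "f1_score": 0.9785, "roc_auc": 0.9958},
  "resources": {"train_seconds": 588.0, "peak_gpu_gb": 1.2,
                "energy_kwh": 0.026, "co2_kg": 0.01183}
}""")

# Example derived figure of merit: energy spent per accuracy point,
# useful when comparing privacy techniques on a utility/cost axis.
kwh_per_point = results["resources"]["energy_kwh"] / (results["metrics"]["accuracy"] * 100)
print(f"{kwh_per_point:.6f} kWh per accuracy point")
```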
Phase 1 (Current): CLI Foundation & Configuration
- CLI entry point and command structure
- YAML configuration parsing and validation
- Integration with existing experiments.yaml
- Commands: `privacybench list`, `privacybench validate`, `privacybench run --dry-run`

Phase 2 (Next): Component System & Wrappers
- Component registry for datasets, models, privacy techniques
- Wrapper implementations around existing code
- Command: full `privacybench run` without execution

Phase 3 (Future): Execution Engine
- Complete experiment orchestration
- Integration with existing training code
- Command: full experiment execution

Phase 4 (Future): Results & Export System
- Results collection and formatting
- Multiple export formats and comparisons

Phase 5 (Future): Testing & Production Polish
- Comprehensive test suite
- Production-ready packaging
# Clone and install in development mode
git clone <repository>
cd privacybench
pip install -e .
# Run tests (Phase 5)
pytest tests/
# Validate installation
privacybench --help
privacybench list all

Adding a new experiment:
- Add experiment to `legacy/experiments.yaml`
- Map CLI name in `cli/parser.py`
- Add to choices in `cli/main.py`
- Test: `privacybench run --experiment new_experiment --dry-run`

Adding a new dataset:
- Create wrapper in `core/datasets/` (Phase 2)
- Register in component registry
- Add validation rules
- Test with existing experiments
- Quick Start: This README
- API Reference: Coming in Phase 5
- Configuration Guide: `examples/` directory
- Development Guide: Coming in Phase 5
MIT License - see LICENSE file for details.
Built on top of existing PrivacyBench research codebase with:
- PyTorch Lightning for training infrastructure
- Flower for federated learning
- Opacus for differential privacy
- CodeCarbon for energy tracking
- Weights & Biases for experiment logging
Note: This is Phase 1 implementation focused on CLI foundation and configuration. Actual experiment execution will be added in Phase 3.