relh commented Dec 22, 2025

Summary

This PR completes the migration from legacy .mpt checkpoint artifacts to CheckpointPolicy bundles across Metta. Checkpoints are now represented as a directory (locally) or a zip (remotely) that contains:

  • policy_spec.json
  • weights.safetensors

The save/load pipeline, URI resolution, tooling (train/play/eval), and tests are updated to treat these bundles as the single canonical format. The result is a simpler, more consistent interface for saving, loading, and resolving policies across local runs, S3, and metta://.

Motivation

The old .mpt artifact format required custom parsing paths, had special cases sprinkled across CLI tools/tests, and didn’t align with our submission zip pipeline. This PR unifies all policy checkpoints under the submission bundle format so that:

  • policy loading and evaluation only need a PolicySpec + weights bundle
  • :latest, metta://, and direct file:// and s3:// paths all resolve to the same format
  • distributed save/load stays compatible without extra callsite branching
  • the load path is slimmer and more consistent (no implicit .mpt fallback)

New Checkpoint Bundle Format (the new source of truth)

Checkpoint bundle layout (local dir or zip):

<run>:v<epoch>/
  policy_spec.json
  weights.safetensors

Local save path
.../checkpoints/<run>:v<epoch>/

Remote save path
s3://.../<run>:v<epoch>.zip

This mirrors the submission policy format and is now used by training, evaluation, and policy loaders everywhere.
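
As a rough illustration of the layout above, the following sketch writes a bundle directory and zips it using only the standard library. The helper names (write_bundle, zip_bundle), the class_path key in the example spec, and the choice to place both files at the zip root are assumptions for illustration, not Metta's actual API. The weights are written as opaque placeholder bytes; a real bundle would hold safetensors-encoded tensors.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def write_bundle(root: Path, run: str, epoch: int, spec: dict, weights: bytes) -> Path:
    """Write <run>:v<epoch>/ containing policy_spec.json and weights.safetensors."""
    bundle_dir = root / f"{run}:v{epoch}"
    bundle_dir.mkdir(parents=True, exist_ok=True)
    (bundle_dir / "policy_spec.json").write_text(json.dumps(spec, indent=2))
    # Placeholder bytes only: real weights would be safetensors-encoded.
    (bundle_dir / "weights.safetensors").write_bytes(weights)
    return bundle_dir


def zip_bundle(bundle_dir: Path) -> Path:
    """Create <run>:v<epoch>.zip next to the bundle directory."""
    zip_path = bundle_dir.parent / (bundle_dir.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in sorted(bundle_dir.iterdir()):
            # Assumption: files sit at the zip root, not under a subdirectory.
            zf.write(member, arcname=member.name)
    return zip_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bundle = write_bundle(Path(tmp), "demo_run", 60, {"class_path": "example.Policy"}, b"\x00")
        print(bundle.name, zip_bundle(bundle).name)
```

Note that a colon in a directory name is fine on Linux and macOS but is not a valid path character on Windows.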

Core Save/Load Changes

CheckpointManager / Checkpointer

  • CheckpointManager.save_policy_checkpoint now writes a checkpoint bundle locally and (if remote storage is enabled) writes a zip of that bundle to S3.
  • CheckpointManager.get_latest_checkpoint now looks for the latest bundle dir or zip (not .mpt).
  • Checkpointer.load_or_create_policy now loads policy bundles via policy_spec_from_uri and safetensors.
    Distributed resume paths load the weights first, then call initialize_to_environment so that env‑dependent buffers are rebuilt correctly.

DDP State Dict Compatibility

  • DistributedPolicy.state_dict() and load_state_dict() now delegate to the underlying module so keys match the non‑DDP state dict layout. This keeps strict loading consistent without callsite logic.
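
To show why delegation matters, here is a torch-free sketch. PyTorch's DistributedDataParallel exposes the wrapped module's parameters under a "module." key prefix, so a wrapper that does not delegate produces keys that a bare module rejects under strict loading. TinyModule, PrefixingWrapper, and DelegatingWrapper are hypothetical stand-ins, not the classes in this PR.

```python
from collections import OrderedDict


class TinyModule:
    """Torch-free stand-in for a module with a flat state dict (hypothetical)."""

    def __init__(self):
        self.params = OrderedDict(weight=1.0, bias=0.0)

    def state_dict(self):
        return OrderedDict(self.params)

    def load_state_dict(self, state, strict=True):
        unexpected = set(state) - set(self.params)
        missing = set(self.params) - set(state)
        if strict and (unexpected or missing):
            raise KeyError(f"missing={sorted(missing)} unexpected={sorted(unexpected)}")
        for key in self.params:
            if key in state:
                self.params[key] = state[key]


class PrefixingWrapper:
    """Mimics a wrapper whose state dict keys gain a 'module.' prefix."""

    def __init__(self, module):
        self.module = module

    def state_dict(self):
        inner = self.module.state_dict()
        return OrderedDict((f"module.{k}", v) for k, v in inner.items())


class DelegatingWrapper:
    """Delegates both calls so keys match the unwrapped layout."""

    def __init__(self, module):
        self.module = module

    def state_dict(self):
        return self.module.state_dict()

    def load_state_dict(self, state, strict=True):
        self.module.load_state_dict(state, strict=strict)
```

With delegation, a checkpoint saved from a wrapped policy loads strictly into an unwrapped one (and vice versa), so callers never need to strip or add prefixes.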

URI Resolution & Schemes (new behavior)

resolve_uri / policy_spec_from_uri

  • URIs now resolve to checkpoint bundle dirs or zip files.
  • policy_spec_from_uri loads from:
    • local bundle dir
    • local zip
    • S3 zip (downloaded then unpacked)
    • mock:// class path
  • .mpt is no longer accepted.
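
The source kinds listed above can be sketched as a small classifier. This is an illustrative stand-in, not the body of policy_spec_from_uri: the function name classify_policy_uri and the returned labels are assumptions, and schemes outside the list (for example metta://) are simply reported as "other" here.

```python
def classify_policy_uri(uri: str) -> str:
    """Return which kind of source a policy URI points at (sketch only)."""
    if uri.endswith(".mpt"):
        raise ValueError(f".mpt checkpoints are no longer accepted: {uri}")

    # Split the scheme by hand so colons inside names like run:v60 are harmless.
    scheme = uri.split("://", 1)[0] if "://" in uri else ""

    if scheme == "mock":
        return "mock_class_path"
    if scheme == "s3":
        if not uri.endswith(".zip"):
            raise ValueError(f"remote bundles must be .zip: {uri}")
        return "s3_zip"
    if scheme in ("", "file"):
        return "local_zip" if uri.endswith(".zip") else "local_dir"
    # Other schemes are resolved elsewhere and are outside this sketch.
    return "other"
```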

:latest resolution

:latest resolves to the highest epoch bundle from:

  • local directories that contain policy_spec.json
  • .zip checkpoints on S3
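
A minimal sketch of picking the highest epoch, assuming names follow the <run>:v<epoch> pattern with an optional .zip suffix. The helpers epoch_of and pick_latest are hypothetical; the sketch works on names only and assumes the caller has already filtered local directories to those containing policy_spec.json.

```python
import re
from typing import Iterable, Optional

# Matches a trailing ":v<digits>" with an optional ".zip" suffix.
_EPOCH_RE = re.compile(r":v(\d+)(?:\.zip)?$")


def epoch_of(name: str) -> Optional[int]:
    """Extract the epoch from a bundle name, or None if it is not a bundle."""
    match = _EPOCH_RE.search(name)
    return int(match.group(1)) if match else None


def pick_latest(names: Iterable[str]) -> Optional[str]:
    """Return the name with the highest epoch, comparing epochs numerically."""
    candidates = []
    for name in names:
        epoch = epoch_of(name)
        if epoch is not None:
            candidates.append((epoch, name))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

Comparing epochs as integers matters: a plain string sort would rank v9 above v10.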

checkpoint_uri_for_epoch

New helper that computes the checkpoint URI for a target epoch within the same run (works for file:// and s3://).
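
One plausible shape for such a helper is shown below. It is a sketch under the assumption that the URI ends in <run>:v<epoch> with an optional .zip suffix; the name same_run_uri_for_epoch is deliberately different from the real helper, whose signature is not shown in this PR description.

```python
import re

# prefix = everything up to the last "/", run = bundle name before ":v<epoch>".
_BUNDLE_RE = re.compile(r"^(?P<prefix>.*/)(?P<run>[^/:]+):v\d+(?P<zip>\.zip)?$")


def same_run_uri_for_epoch(uri: str, epoch: int) -> str:
    """Swap the epoch in a bundle URI, keeping scheme, path, run, and suffix."""
    match = _BUNDLE_RE.match(uri)
    if match is None:
        raise ValueError(f"not a checkpoint bundle URI: {uri!r}")
    suffix = match["zip"] or ""
    return f"{match['prefix']}{match['run']}:v{epoch}{suffix}"
```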

Tooling & CLI Updates

cogames policy CLI (cogames/cli/policy.py)

  • CLI help now describes bundle dirs / zip URIs.
  • list_checkpoints() scans for policy_spec.json instead of .mpt.
  • URI parsing now expects bundle format; key=value format is simpler and resolves class names directly.

play, train, and evaluation scripts

  • metta/tools/play.py: loads via policy_spec_from_uri (bundle format), removes .mpt path support.
  • metta/tools/train.py: simplified URI/path detection.
  • packages/cogames/scripts/run_evaluation.py: loads bundle policies; action‑space detection now reads safetensors via policy_spec.

Policy Loading Internals

mettagrid.policy.loader.initialize_or_load_policy

  • If policy_spec.init_kwargs includes architecture_spec, we now:
    1. construct policy via PolicyArchitecture.from_spec
    2. load safetensors weights
    3. reinitialize environment buffers
  • No more .mpt policy class or .mpt logic.

prepare_policy_spec

  • load_policy_spec_from_path now handles both local dirs and zips in a single path.
  • S3 download helper enforces .zip for remote bundles.
  • data_path resolution is strict and normalized.

metta:// Scheme Resolution

metta://policy/<name> or metta://policy/<uuid> now resolves to:

  1. scripted aliases (baseline, ladybug, thinky, racecar, starter)
  2. local checkpoint bundles (if present in data dir)
  3. S3 submission zip via policy version

The old .mpt fallback is removed.
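
The three-step order above can be sketched as a simple fallthrough. Everything beyond the alias names is an assumption: the function name, the convention that a local bundle lives at <data_dir>/<identifier>/, and the lookup_submission_zip callback standing in for the policy-version lookup.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple

SCRIPTED_ALIASES = frozenset({"baseline", "ladybug", "thinky", "racecar", "starter"})


def resolve_metta_policy(
    identifier: str,
    data_dir: Path,
    lookup_submission_zip: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Resolve a metta://policy/<identifier> in the documented order (sketch)."""
    # 1. Scripted aliases win outright.
    if identifier in SCRIPTED_ALIASES:
        return ("scripted", identifier)

    # 2. A local bundle, assumed here to be <data_dir>/<identifier>/.
    local_bundle = data_dir / identifier
    if (local_bundle / "policy_spec.json").is_file():
        return ("local_bundle", str(local_bundle))

    # 3. Remote submission zip; the callback returns an s3:// URI or None.
    remote_uri = lookup_submission_zip(identifier)
    if remote_uri is not None:
        return ("s3_zip", remote_uri)

    raise LookupError(f"could not resolve metta://policy/{identifier}")
```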

Removed Legacy Code

  • Removed mpt_artifact.py, mpt_policy.py, and .mpt tests.
  • Removed .mpt parsing and backwards‑compat logic from loaders and schemes.

Recipes and Tests

  • Recipes updated to point at bundle URIs (zip/dir), not .mpt.
  • Integration tests updated for bundle format.
  • Tests asserting .mpt behavior removed.

Behavior Changes / Migration Notes

  • .mpt checkpoints are no longer supported.
  • Any saved policy should now be stored as a checkpoint bundle (dir or zip).
  • If you have legacy .mpt paths, convert them to bundles or re‑run training with the new checkpointing system.
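
As a first migration step, it can help to inventory any leftover .mpt files. The sketch below only finds them; it does not convert anything, and find_legacy_mpt is a hypothetical helper, not part of this PR.

```python
from pathlib import Path
from typing import List


def find_legacy_mpt(root: Path) -> List[Path]:
    """Recursively list .mpt files under root, sorted for stable output."""
    return sorted(path for path in root.rglob("*.mpt") if path.is_file())


if __name__ == "__main__":
    for legacy in find_legacy_mpt(Path(".")):
        print(legacy)
```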

Examples

Load latest local checkpoint:

policy_uri=file://.../train_dir/<run>/checkpoints:latest

Load specific local checkpoint:

policy_uri=file://.../train_dir/<run>/checkpoints/<run>:v60

Load specific S3 checkpoint:

policy_uri=s3://bucket/path/checkpoints/<run>:v60.zip

Load from metta://:

policy_uri=metta://policy/<name>
policy_uri=metta://policy/<uuid>

Testing

  • metta pytest

relh commented Dec 22, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

relh changed the title from "init" to "Add CheckpointPolicy + policy_spec checkpoint bundles" on Dec 22, 2025
relh changed the title from "Add CheckpointPolicy + policy_spec checkpoint bundles" to "MptPolicy -> CheckpointPolicy (+misc)" on Dec 22, 2025
relh force-pushed the checkpoint-policy-core-v3 branch from 42ddbb1 to 7c95ff5 on December 22, 2025 22:29
relh marked this pull request as ready for review on December 22, 2025 22:31
datadog-official bot commented Dec 22, 2025

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 12c51f3

relh changed the title from "MptPolicy -> CheckpointPolicy (+misc)" to "CheckpointPolicy bundles for checkpoint save/load" on Dec 23, 2025
relh force-pushed the checkpoint-policy-core-v3 branch from 85edb50 to feeac87 on December 23, 2025 01:16
relh changed the title from "CheckpointPolicy bundles for checkpoint save/load" to "MptPolicy -> CheckpointPolicy (+ checkpoint bundles)" on Dec 23, 2025
relh force-pushed the checkpoint-policy-core-v3 branch from af9fbc7 to c1a3731 on December 23, 2025 03:37
relh changed the title from "MptPolicy -> CheckpointPolicy (+ checkpoint bundles)" to "CheckpointPolicy core + checkpoint bundles" on Dec 23, 2025
relh changed the title from "CheckpointPolicy core + checkpoint bundles" to "n" on Dec 23, 2025
relh changed the title from "n" to "CheckpointPolicy core + checkpoint bundles" on Dec 23, 2025
relh changed the title from "CheckpointPolicy core + checkpoint bundles" to "MptPolicy -> CheckpointPolicy (core + checkpoint bundles)" on Dec 23, 2025
relh changed the title from "MptPolicy -> CheckpointPolicy (core + checkpoint bundles)" to "CheckpointPolicy core: bundle IO + loading" on Dec 23, 2025
relh changed the title from "CheckpointPolicy core: bundle IO + loading" to "MptPolicy -> CheckpointPolicy (core: bundle IO + loading)" on Dec 23, 2025
relh changed the title from "MptPolicy -> CheckpointPolicy (core: bundle IO + loading)" to "CheckpointPolicy core: checkpoint bundles in training" on Dec 23, 2025
relh changed the title from "CheckpointPolicy core: checkpoint bundles in training" to "MptPolicy -> CheckpointPolicy, Checkpointer/Checkpoint Manager use it" on Dec 23, 2025
relh enabled auto-merge on December 23, 2025 21:49
relh assigned relh and unassigned rhysh and relh on Dec 27, 2025
relh added the "review wanted: stamp" label (This PR needs a review from any available team member) on Dec 28, 2025
relh changed the title from "MptPolicy -> CheckpointPolicy, Checkpointer/Checkpoint Manager use it" to "MptPolicy -> CheckpointPolicy, Checkpointer/Checkpoint Manager/everywhere uses it" on Dec 29, 2025
relh changed the title from "MptPolicy -> CheckpointPolicy, Checkpointer/Checkpoint Manager/everywhere uses it" to "CheckpointPolicy bundles everywhere (replace .mpt save/load)" on Dec 29, 2025
4 participants