CheckpointPolicy bundles everywhere (replace .mpt save/load) #4502
Open
relh wants to merge 106 commits into main from checkpoint-policy-core-v3
+698
−1,238
This was referenced Dec 22, 2025

Summary
This PR completes the migration from legacy `.mpt` checkpoint artifacts to CheckpointPolicy bundles across Metta. Checkpoints are now represented as a directory (locally) or a zip (remotely) that contains:
- `policy_spec.json`
- `weights.safetensors`

The save/load pipeline, URI resolution, tooling (train/play/eval), and tests are updated to treat these bundles as the single canonical format. The result is a simpler, more consistent interface for saving, loading, and resolving policies across local runs, S3, and `metta://`.

Motivation
The old `.mpt` artifact format required custom parsing paths, had special cases sprinkled across CLI tools/tests, and didn’t align with our submission zip pipeline. This PR unifies all policy checkpoints under the submission bundle format so that:
- `:latest`, `metta://`, and direct `file://` / `s3://` paths all resolve to the same format
- […] `.mpt` fallback)

New Checkpoint Bundle Format (the new source of truth)
Checkpoint bundle layout (local dir or zip):
Local save path: `.../checkpoints/<run>:v<epoch>/`
Remote save path: `s3://.../<run>:v<epoch>.zip`

This mirrors the submission policy format and is now used by training, evaluation, and policy loaders everywhere.
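As a rough illustration of the layout described above, the sketch below writes a bundle directory holding `policy_spec.json` and `weights.safetensors`, then zips it the way a remote copy might be packaged. The helper names (`write_bundle`, `zip_bundle`), the spec fields and values, and the placeholder weight bytes are assumptions made for the example; the actual writer is `CheckpointManager.save_policy_checkpoint`, and a real bundle stores safetensors-encoded weights.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def write_bundle(checkpoint_dir: Path, run: str, epoch: int, spec: dict, weights: bytes) -> Path:
    """Write <checkpoint_dir>/<run>:v<epoch>/ with the two bundle files (hypothetical helper)."""
    bundle_dir = checkpoint_dir / f"{run}:v{epoch}"
    bundle_dir.mkdir(parents=True, exist_ok=True)
    (bundle_dir / "policy_spec.json").write_text(json.dumps(spec, indent=2))
    # Placeholder bytes: a real bundle stores safetensors-encoded weights here.
    (bundle_dir / "weights.safetensors").write_bytes(weights)
    return bundle_dir


def zip_bundle(bundle_dir: Path) -> Path:
    """Create <run>:v<epoch>.zip next to the directory, with files at the archive root."""
    zip_path = bundle_dir.parent / f"{bundle_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(bundle_dir.iterdir()):
            archive.write(file, arcname=file.name)
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    # Field names and values below are illustrative assumptions, not the real schema.
    spec = {
        "class_path": "example.policies.MyPolicy",
        "init_kwargs": {},
        "data_path": "weights.safetensors",
    }
    bundle = write_bundle(Path(tmp) / "checkpoints", "my_run", 3, spec, b"\x00")
    archive_path = zip_bundle(bundle)
    bundle_name = bundle.name
    archive_name = archive_path.name
    with zipfile.ZipFile(archive_path) as archive:
        archive_members = sorted(archive.namelist())
```

Keeping both files at the archive root means the unzipped archive and the local directory look identical to a loader, which is the property the bundle format relies on.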
Core Save/Load Changes
CheckpointManager / Checkpointer
- `CheckpointManager.save_policy_checkpoint` now writes a checkpoint bundle locally and (if remote storage is enabled) writes a zip of that bundle to S3.
- `CheckpointManager.get_latest_checkpoint` now looks for the latest bundle dir or zip (not `.mpt`).
- `Checkpointer.load_or_create_policy` now loads policy bundles via `policy_spec_from_uri` and safetensors.
- Distributed resume paths load weights, then call `initialize_to_environment` for correct env-dependent buffers.

DDP State Dict Compatibility
`DistributedPolicy.state_dict()` and `load_state_dict()` now delegate to the underlying module so keys match the non-DDP state dict layout. This keeps strict loading consistent without callsite logic.

URI Resolution & Schemes (new behavior)
`resolve_uri` / `policy_spec_from_uri`
- `policy_spec_from_uri` loads from:
  - `mock://` class path
- `.mpt` is no longer accepted.

`:latest` resolution
- `:latest` resolves to the highest epoch bundle from:
  - […] `policy_spec.json`
  - `.zip` checkpoints on S3

`checkpoint_uri_for_epoch`
- New helper to compute “same run” checkpoint URI for a target epoch (works for `file://` and `s3://`).

Tooling & CLI Updates
`cogames` policy CLI (`cogames/cli/policy.py`)
- `list_checkpoints()` scans for `policy_spec.json` instead of `.mpt`.

`play`, `train`, and evaluation scripts
- `metta/tools/play.py`: loads via `policy_spec_from_uri` (bundle format), removes `.mpt` path support.
- `metta/tools/train.py`: simplified URI/path detection.
- `packages/cogames/scripts/run_evaluation.py`: loads bundle policies; action-space detection now reads safetensors via `policy_spec`.

Policy Loading Internals
`mettagrid.policy.loader.initialize_or_load_policy`
- When `policy_spec.init_kwargs` includes `architecture_spec`, we now:
  - […] `PolicyArchitecture.from_spec`
  - […] `.mpt` policy class or `.mpt` logic.

`prepare_policy_spec`
- `load_policy_spec_from_path` now handles both local dirs and zips in a single path.
- […] `.zip` for remote bundles.
- `data_path` resolution is strict and normalized.

`metta://` Scheme Resolution
`metta://policy/<name>` or `metta://policy/<uuid>` now resolves to: […]

The old `.mpt` fallback is removed.

Removed Legacy Code
- `mpt_artifact.py`, `mpt_policy.py`, and `.mpt` tests.
- `.mpt` parsing and backwards-compat logic from loaders and schemes.

Recipes and Tests
- […] `.mpt`.
- […] `.mpt` behavior removed.

Behavior Changes / Migration Notes
- `.mpt` checkpoints are no longer supported.
- […] `.mpt` paths, convert them to bundles or re-run training with the new checkpointing system.

Examples
Load latest local checkpoint:
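One way `:latest` resolution over a local checkpoints directory could work, sketched with the standard library only. It assumes the `<run>:v<epoch>` naming from the save paths, treats a directory as a valid bundle only if it contains `policy_spec.json`, and also accepts `<run>:v<epoch>.zip` files. The function name `find_latest_bundle` is hypothetical and is not the project's resolver.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Assumed naming scheme, taken from the save paths: <run>:v<epoch> or <run>:v<epoch>.zip
_BUNDLE_NAME = re.compile(r"^(?P<run>.+):v(?P<epoch>\d+)(?P<zip>\.zip)?$")


def find_latest_bundle(checkpoint_dir: Path, run: str) -> Optional[Path]:
    """Return the highest-epoch bundle (dir or zip) for `run`, or None if there is none."""
    best: Optional[Path] = None
    best_epoch = -1
    for entry in checkpoint_dir.iterdir():
        match = _BUNDLE_NAME.match(entry.name)
        if match is None or match["run"] != run:
            continue
        is_zip_bundle = match["zip"] is not None and entry.is_file()
        is_dir_bundle = match["zip"] is None and (entry / "policy_spec.json").is_file()
        if not (is_zip_bundle or is_dir_bundle):
            continue
        epoch = int(match["epoch"])  # compare numerically so v10 beats v2
        if epoch > best_epoch:
            best, best_epoch = entry, epoch
    return best


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("my_run:v2", "my_run:v10", "other_run:v99"):
        (root / name).mkdir()
        (root / name / "policy_spec.json").write_text("{}")
    (root / "my_run:v11").mkdir()  # no policy_spec.json, so it is skipped
    latest = find_latest_bundle(root, "my_run")
    latest_name = latest.name if latest else None
```

Parsing the epoch as an integer matters here: a plain string sort would rank `v2` above `v10`.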
Load specific local checkpoint:
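A specific checkpoint may be either a bundle directory or a zip. The sketch below shows one way to read `policy_spec.json` from both through a single code path, in the spirit of what the description says about `load_policy_spec_from_path`. The function name `read_policy_spec` and the spec contents are assumptions for illustration; the real loader also resolves `data_path` and builds the policy, which is not shown.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def read_policy_spec(bundle_path: Path) -> dict:
    """Return the parsed policy_spec.json from a bundle directory or a bundle zip."""
    if bundle_path.is_dir():
        return json.loads((bundle_path / "policy_spec.json").read_text())
    if zipfile.is_zipfile(bundle_path):
        with zipfile.ZipFile(bundle_path) as archive:
            return json.loads(archive.read("policy_spec.json"))
    raise ValueError(f"Not a checkpoint bundle: {bundle_path}")


with tempfile.TemporaryDirectory() as tmp:
    spec = {"class_path": "example.policies.MyPolicy"}  # illustrative content only

    bundle_dir = Path(tmp) / "my_run:v3"
    bundle_dir.mkdir()
    (bundle_dir / "policy_spec.json").write_text(json.dumps(spec))

    bundle_zip = Path(tmp) / "my_run:v3.zip"
    with zipfile.ZipFile(bundle_zip, "w") as archive:
        archive.write(bundle_dir / "policy_spec.json", arcname="policy_spec.json")

    spec_from_dir = read_policy_spec(bundle_dir)
    spec_from_zip = read_policy_spec(bundle_zip)
```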
Load specific S3 checkpoint:
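Because a checkpoint is addressed by its `<run>:v<epoch>` name, the URI of another epoch in the same run can be derived by rewriting the version suffix. The sketch below is a guess at the kind of logic a helper like `checkpoint_uri_for_epoch` might contain, not its actual implementation; the function name `uri_for_epoch`, the bucket, and the paths are hypothetical.

```python
import re

# Matches a trailing ":v<epoch>", optionally followed by ".zip" or a trailing slash.
_VERSION_SUFFIX = re.compile(r":v\d+(?P<zip>\.zip)?/?$")


def uri_for_epoch(checkpoint_uri: str, epoch: int) -> str:
    """Rewrite the epoch in a file:// or s3:// checkpoint URI, keeping any .zip suffix."""
    match = _VERSION_SUFFIX.search(checkpoint_uri)
    if match is None:
        raise ValueError(f"No ':v<epoch>' suffix in {checkpoint_uri!r}")
    suffix = match["zip"] or ""
    return f"{checkpoint_uri[:match.start()]}:v{epoch}{suffix}"


# Hypothetical URIs, for illustration only.
s3_uri = uri_for_epoch("s3://example-bucket/checkpoints/my_run:v5.zip", 12)
file_uri = uri_for_epoch("file:///tmp/checkpoints/my_run:v5", 12)
```

Here `s3_uri` keeps its `.zip` suffix and `file_uri` stays a directory-style path, so each scheme keeps the form it started with.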
Load from metta://
Testing
`metta pytest`

Asana Task