Tighten checkpoint URI resolution + docs #4496

relh · 2025-12-22T21:36:11Z

What

tighten metta:// resolution for checkpoint directory URIs
adjust run_evaluation to use updated resolver semantics
update URI resolver documentation for checkpoint directory behavior
normalize checkpoint URI handling in kickstarter flows
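
For illustration only, here is a minimal sketch of what requiring directory-style checkpoint URIs could look like. The function name `require_checkpoint_dir_uri`, the set of allowed schemes, the rejected file suffixes, and the example URIs are all assumptions made for this sketch; none of it is taken from the resolver code in this PR.

```python
from urllib.parse import urlparse

# Hypothetical policy for this sketch: which schemes are accepted and which
# suffixes mark a URI as a single file rather than a checkpoint directory.
ALLOWED_SCHEMES = {"metta", "s3", "file"}
FILE_SUFFIXES = (".mpt", ".pt", ".zip")


def require_checkpoint_dir_uri(uri: str) -> str:
    """Return the URI without a trailing slash, or raise ValueError if it looks like a file."""
    parsed = urlparse(uri)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme in {uri!r}")
    # Join host/bucket and path so s3://bucket/key and file:///abs/path are handled alike.
    path = (parsed.netloc + parsed.path).rstrip("/")
    if not path:
        raise ValueError(f"missing checkpoint path in {uri!r}")
    if path.lower().endswith(FILE_SUFFIXES):
        raise ValueError(f"expected a checkpoint directory, got a file: {uri!r}")
    return f"{parsed.scheme}://{path}"
```

Under these assumptions, `s3://bucket/run/checkpoints/` normalizes to `s3://bucket/run/checkpoints`, while `file:///tmp/ckpt/model.mpt` is rejected with a `ValueError` instead of being loaded by accident.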

Why

eliminate ambiguous URI forms and accidental file loads
make :latest and directory resolution predictable
keep docs and tooling aligned with resolver behavior
reduce confusion when switching between local and S3 checkpoints
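
As a rough illustration of predictable `:latest` handling, the sketch below picks the highest numeric version from a directory listing. The `<run>:v<N>` naming scheme and the helper `resolve_latest` are assumptions for this example; the actual resolver may behave differently.

```python
import re

# Assumed naming convention for this sketch: "<run_name>:v<epoch>".
_VERSIONED = re.compile(r"^(?P<run>.+):v(?P<epoch>\d+)$")


def resolve_latest(name: str, available: list[str]) -> str:
    """Resolve "<run>:latest" against a listing of checkpoint directory names."""
    if not name.endswith(":latest"):
        return name  # explicit versions pass through unchanged
    run = name[: -len(":latest")]
    epochs = []
    for entry in available:
        match = _VERSIONED.match(entry)
        if match and match["run"] == run:
            epochs.append(int(match["epoch"]))
    if not epochs:
        raise FileNotFoundError(f"no checkpoints found for run {run!r}")
    # Compare epochs numerically so v10 sorts after v2.
    return f"{run}:v{max(epochs)}"
```

The numeric comparison is the point of the sketch: a plain string sort would rank `v2` above `v10`, which is one way `:latest` can become unpredictable.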

relh · 2025-12-22T21:36:27Z

Tighten checkpoint URI resolution + docs #4496 👈
Use CheckpointPolicy broadly #4508: 1 other dependent PR (#4497)
CheckpointPolicy bundles everywhere (replace .mpt save/load) #4502
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


packages/mettagrid/python/src/mettagrid/policy/prepare_policy_spec.py

no more .mpt
Merge remote-tracking branch 'origin/main' into richard-unifympt
slim policy spec handler more concise cleanup simplify
Merge remote-tracking branch 'origin/main' into richard-unifympt
re-add fix policy spex
Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Merge remote-tracking branch 'origin/main' into richard-unifympt
bundles
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
bundle
Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
simplify? ugh compat cleanup
Merge remote-tracking branch 'origin/main' into richard-unifympt
cleanup tests
Merge remote-tracking branch 'origin/main' into richard-unifympt
more tests
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
simplify?
Merge branch 'main' into richard-unifympt
cleanup
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint
Merge remote-tracking branch 'origin/main' into richard-unifympt
more cleanup
Merge remote-tracking branch 'origin/main' into richard-unifympt
diff cleanup ftt lint fix error
Merge remote-tracking branch 'origin/main' into richard-unifympt
more tests lint
Merge remote-tracking branch 'origin/main' into richard-unifympt
Merge remote-tracking branch 'origin/main' into richard-unifympt
checkpoint policy does save/load lint checkpoint moving catcus lint
Merge branch 'main' into richard-unifympt
fold-in

[pyright 4] Get pyright to pass on app_backend (#4478)

Merge remote-tracking branch 'origin/main' into richard-unifympt

Fix command, add space (#4456)
added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs

Rename HyperUpdateRule to ScheduleRule (#4483)
- rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path
- update recipes and teacher scheduling to use ScheduleRule
- report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests
- not run (not requested)
---------
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into richard-unifympt

Fix supervisor teacher behavior and legacy BC mode (#4484)
- gate PPO actor during supervisor teacher phase
- fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume)
- require supervisor policy URI for sliced_cloner_no_ppo
- not run (not requested)
---------
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com>

Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492)

cleanup
Merge remote-tracking branch 'origin/main' into richard-unifympt
fold in

training environments and eval environments mismatched (#4487)
I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck.
Findings:
- Compass is ON in both training and eval (global_obs.compass=True).
- Vibe count and change-vibe settings match (152 vibes; change_vibe.number_of_vibes=152).
- But the mission parameters differ between training and eval for the same mission name:
  - game.objects.carbon_extractor.max_uses: train 25 vs eval 100
  - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20
  - game.objects.germanium_extractor.max_uses: train 5 vs eval 20
  - game.objects.silicon_extractor.max_uses: train 120 vs eval 480
So the mismatch isn’t compass; it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings.
Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely.
Relevant files:
- Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py
- Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py
- Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py
If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite).
I can patch this if you want; tell me whether you prefer:
1. Eval suite uses the same mission templates as training (from missions.py), or
2. Training uses the integrated eval mission definitions, or
3. Add machina_1.open_world to the eval suite.

ripping out
Merge remote-tracking branch 'origin/main' into richard-unifympt
simplify fix and lint choke
simplify submission zip creation
use policy_spec for submission zips
tighten checkpoint io helpers
shorten checkpoint arg help
inline checkpoint policy helpers
restore policy spec docstring
validate checkpoint data_path before download
require checkpoint directory URIs
expand policy spec s3 docstring

relh · 2025-12-23T21:27:14Z

Folded into #4508 as part of stack consolidation.

relh changed the title ~~init~~ Refactor checkpoint IO to CheckpointPolicy + policy_spec dirs Dec 22, 2025

This was referenced Dec 22, 2025

Remove .mpt paths, expand metta:// policy resolution #4497

Closed

Recipes: switch to checkpoint directory URIs #4498

Closed

relh marked this pull request as ready for review December 22, 2025 21:39

github-actions bot assigned relh Dec 22, 2025

chatgpt-codex-connector bot reviewed Dec 22, 2025


packages/mettagrid/python/src/mettagrid/policy/prepare_policy_spec.py (resolved)


relh changed the title ~~Refactor checkpoint IO to CheckpointPolicy + policy_spec dirs~~ Add CheckpointPolicy + policy_spec checkpoint bundles Dec 22, 2025

relh changed the title ~~Add CheckpointPolicy + policy_spec checkpoint bundles~~ Eval submission + docs for policy_spec checkpoints Dec 22, 2025

relh changed the base branch from main to graphite-base/4496 December 22, 2025 22:29

relh force-pushed the checkpoint-policy-refactor-v2 branch from c090a30 to e843647 Compare December 22, 2025 22:29

relh changed the base branch from graphite-base/4496 to checkpoint-policy-core-v3 December 22, 2025 22:29

relh mentioned this pull request Dec 22, 2025

CheckpointPolicy bundles everywhere (replace .mpt save/load) #4502

Open

relh force-pushed the checkpoint-policy-refactor-v2 branch from e843647 to b0d29b6 Compare December 22, 2025 22:42

relh changed the base branch from checkpoint-policy-core-v3 to graphite-base/4496 December 22, 2025 22:51

relh force-pushed the checkpoint-policy-refactor-v2 branch from b0d29b6 to a83eb23 Compare December 22, 2025 22:52

relh changed the base branch from graphite-base/4496 to checkpoint-policy-core-v3 December 22, 2025 22:52

relh changed the base branch from checkpoint-policy-core-v3 to graphite-base/4496 December 22, 2025 22:55

relh force-pushed the checkpoint-policy-refactor-v2 branch from a83eb23 to 74c593a Compare December 22, 2025 22:56

relh changed the base branch from graphite-base/4496 to checkpoint-policy-core-v3 December 22, 2025 22:56

relh force-pushed the checkpoint-policy-refactor-v2 branch from 74c593a to d740a8b Compare December 22, 2025 23:00

relh changed the base branch from checkpoint-policy-core-v3 to graphite-base/4496 December 22, 2025 23:15

relh force-pushed the checkpoint-policy-refactor-v2 branch from d740a8b to ae9e401 Compare December 22, 2025 23:15

relh changed the base branch from graphite-base/4496 to checkpoint-policy-core-v3 December 22, 2025 23:15

relh changed the base branch from checkpoint-policy-core-v3 to graphite-base/4496 December 22, 2025 23:19

relh force-pushed the checkpoint-policy-refactor-v2 branch from ae9e401 to 6287321 Compare December 22, 2025 23:20

relh changed the base branch from graphite-base/4496 to checkpoint-policy-core-v3 December 22, 2025 23:20

relh changed the base branch from checkpoint-policy-core-v3 to graphite-base/4496 December 22, 2025 23:28

relh force-pushed the checkpoint-policy-refactor-v2 branch from 6287321 to 26cf583 Compare December 22, 2025 23:28

relh force-pushed the checkpoint-policy-refactor-v2 branch 2 times, most recently from 5cb5b72 to 16c7376 Compare December 23, 2025 18:57

relh force-pushed the checkpoint-policy-integration-v1 branch 2 times, most recently from af52d5a to f475d0f Compare December 23, 2025 19:00

relh force-pushed the checkpoint-policy-refactor-v2 branch 2 times, most recently from 7a99c61 to e892624 Compare December 23, 2025 19:05

relh force-pushed the checkpoint-policy-integration-v1 branch from f475d0f to 77390d8 Compare December 23, 2025 19:05

relh force-pushed the checkpoint-policy-refactor-v2 branch from e892624 to e27fe0a Compare December 23, 2025 19:06

relh force-pushed the checkpoint-policy-integration-v1 branch from 77390d8 to fd9aef2 Compare December 23, 2025 19:06

relh force-pushed the checkpoint-policy-refactor-v2 branch from e27fe0a to caceb1f Compare December 23, 2025 19:07

relh force-pushed the checkpoint-policy-integration-v1 branch from fd9aef2 to e72da62 Compare December 23, 2025 19:07

relh force-pushed the checkpoint-policy-refactor-v2 branch from caceb1f to ad47dcf Compare December 23, 2025 19:10

relh force-pushed the checkpoint-policy-integration-v1 branch 2 times, most recently from 51a88e0 to 68bd039 Compare December 23, 2025 19:12

relh force-pushed the checkpoint-policy-refactor-v2 branch from ad47dcf to 1e36374 Compare December 23, 2025 19:12

relh force-pushed the checkpoint-policy-integration-v1 branch from 68bd039 to 7ed4f29 Compare December 23, 2025 19:15

relh force-pushed the checkpoint-policy-refactor-v2 branch from 1e36374 to a8d6ad8 Compare December 23, 2025 19:15

relh changed the title ~~Refine checkpoint URI evaluation flows~~ Refine checkpoint URI resolution and docs Dec 23, 2025

relh force-pushed the checkpoint-policy-integration-v1 branch from 7ed4f29 to d6ec1b1 Compare December 23, 2025 20:20

relh force-pushed the checkpoint-policy-refactor-v2 branch 2 times, most recently from 34a8387 to f313656 Compare December 23, 2025 20:28

relh force-pushed the checkpoint-policy-integration-v1 branch 2 times, most recently from e466a27 to 7971455 Compare December 23, 2025 20:53

relh force-pushed the checkpoint-policy-refactor-v2 branch from f313656 to 9acd982 Compare December 23, 2025 20:53

relh force-pushed the checkpoint-policy-integration-v1 branch from 7971455 to 19fa3a3 Compare December 23, 2025 20:56

relh force-pushed the checkpoint-policy-refactor-v2 branch from 9acd982 to 670339d Compare December 23, 2025 20:56

relh changed the title ~~Refine checkpoint URI resolution and docs~~ Tighten checkpoint URI resolution + docs Dec 23, 2025

relh closed this Dec 23, 2025
