-
Notifications
You must be signed in to change notification settings - Fork 48
Rename HyperUpdateRule to ScheduleRule #4483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
7464590 to
ce97bc2
Compare
ce97bc2 to
888fe15
Compare
888fe15 to
b4e3638
Compare
b4e3638 to
69e695f
Compare
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke simplify submission zip creation use policy_spec for submission zips tighten checkpoint io helpers shorten checkpoint arg help inline checkpoint policy helpers restore policy spec docstring validate checkpoint data_path before download require checkpoint directory URIs expand policy spec s3 docstring
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke simplify submission zip creation use policy_spec for submission zips tighten checkpoint io helpers shorten checkpoint arg help inline checkpoint policy helpers restore policy spec docstring validate checkpoint data_path before download require checkpoint directory URIs expand policy spec s3 docstring
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke simplify submission zip creation use policy_spec for submission zips tighten checkpoint io helpers shorten checkpoint arg help inline checkpoint policy helpers restore policy spec docstring validate checkpoint data_path before download require checkpoint directory URIs expand policy spec s3 docstring
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke simplify submission zip creation use policy_spec for submission zips tighten checkpoint io helpers shorten checkpoint arg help inline checkpoint policy helpers restore policy spec docstring validate checkpoint data_path before download require checkpoint directory URIs expand policy spec s3 docstring
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke simplify submission zip creation use policy_spec for submission zips tighten checkpoint io helpers shorten checkpoint arg help inline checkpoint policy helpers restore policy spec docstring validate checkpoint data_path before download require checkpoint directory URIs expand policy spec s3 docstring
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) ## Summary - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) ## Summary - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo ## Testing - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke
no more .mpt Merge remote-tracking branch 'origin/main' into richard-unifympt slim policy spec handler more concise cleanup simplify Merge remote-tracking branch 'origin/main' into richard-unifympt re-add fix policy spex Update packages/mettagrid/python/src/mettagrid/util/uri_resolvers/schemes.py Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt bundles Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt bundle Merge remote-tracking branch 'origin/richard-unifympt' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? ugh compat cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt cleanup tests Merge remote-tracking branch 'origin/main' into richard-unifympt more tests Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt simplify? Merge branch 'main' into richard-unifympt cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt no more .mpt remove all .mpt and lint cleanup local data path fixes mpt re-add re-add artifact lint Merge remote-tracking branch 'origin/main' into richard-unifympt more cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt diff cleanup ftt lint fix error Merge remote-tracking branch 'origin/main' into richard-unifympt more tests lint Merge remote-tracking branch 'origin/main' into richard-unifympt Merge remote-tracking branch 'origin/main' into richard-unifympt checkpoint policy does save/load lint checkpoint moving catcus lint Merge branch 'main' into richard-unifympt fold-in [pyright 4] Get pyright to pass on app_backend (#4478) Merge remote-tracking branch 'origin/main' into richard-unifympt Fix command, add space (#4456) added space to --app:lib--tlsEmulation:off which makes it --app:lib --tlsEmulation:off now it runs Rename HyperUpdateRule to ScheduleRule (#4483) - rename HyperUpdateRule to ScheduleRule and apply to TrainerConfig via target_path - update recipes and teacher scheduling to use ScheduleRule - report PPO stats using ppo_actor/ppo_critic hyperparam keys and update tests - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Merge remote-tracking branch 'origin/main' into richard-unifympt Fix supervisor teacher behavior and legacy BC mode (#4484) - gate PPO actor during supervisor teacher phase - fix supervisor/no-teacher behavior and add legacy BC (no gating, no PPO resume) - require supervisor policy URI for sliced_cloner_no_ppo - not run (not requested) --------- Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com> Co-authored-by: Adam S <134907338+gustofied@users.noreply.github.com> Minor fixes to the slstm triton kernel, causing failures for certain kernel sizes (#4492) cleanup Merge remote-tracking branch 'origin/main' into richard-unifympt fold in training environments and eval environments mismatched (#4487) I ran a direct config comparison using the training entrypoint (recipes/experiment/cogs_v_clips.train) with variants=["heart_chorus"] and compared the eval suite config it builds (difficulty standard + heart_chorus) for an overlapping mission: hello_world.oxygen_bottleneck. Findings: - Compass is ON in both training and eval (global_obs.compass=True). - Vibe count and change‑vibe settings match (152 vibes; change_vibe.number_of_vibes=152). - But the mission parameters differ between training and eval for the same mission name: - game.objects.carbon_extractor.max_uses: train 25 vs eval 100 - game.objects.oxygen_extractor.max_uses: train 5 vs eval 20 - game.objects.germanium_extractor.max_uses: train 5 vs eval 20 - game.objects.silicon_extractor.max_uses: train 120 vs eval 480 So the mismatch isn’t compass — it’s the mission definitions used by training vs eval. Training uses base missions (cogames.cogs_vs_clips.missions), while eval uses integrated eval missions (cogames.cogs_vs_clips.evals.integrated_evals) that have different extractor settings. Also: the eval suite used by recipes/experiment/cogs_v_clips.train does not include machina_1.open_world at all (it only evaluates the 7 HELLO_WORLD integrated evals). So training can be creating hearts on easier missions while your eval runs on machina_1 are a different environment entirely. Relevant files: - Compass default: packages/cogames/src/cogames/cogs_vs_clips/mission.py - Training entrypoint + eval suite wiring: recipes/experiment/cogs_v_clips.py - Eval mission definitions: packages/cogames/src/cogames/cogs_vs_clips/evals/integrated_evals.py If you want true parity, we should align which mission templates eval uses (and/or include machina_1.open_world in the eval suite). I can patch this if you want — tell me whether you prefer: 1. Eval suite uses the same mission templates as training (from missions.py), or 2. Training uses the integrated eval mission definitions, or 3. Add machina_1.open_world to the eval suite. ripping out Merge remote-tracking branch 'origin/main' into richard-unifympt simplify fix and lint choke

Summary
Testing