Add environment state snapshotting for RL research #53

sarvanithin wants to merge 2 commits into withmartian:main
Conversation
Implements snapshot/restore functionality to save and replay episodes from specific checkpoints. Useful for debugging, trajectory analysis, and mechanistic interpretability.

- Add EnvironmentSnapshot dataclass for serializing env state
- Implement export_state() and load_from_state() methods on CodeBaseEnv
- Support both SWEBench and Harbor environments
- Save container filesystem as tarball with JSON metadata
- Snapshots only work at episode boundaries (can't snapshot mid-episode)
- Add comprehensive test coverage

Closes withmartian#39
Thanks for taking a crack at this @sarvanithin! This is definitely going to be a complex feature that will take a bit more work, and because of that we were delaying until after the planned repo launch on Thursday to focus on it. If you want to continue working on it in the meantime, there are a couple issues with the current approach here:
Moved load_from_state to SweBenchEnv and HarborEnv since they need different constructor arguments (tasks list). The base class now provides a _restore_from_snapshot helper that subclasses call after init.

- Add load_from_state implementation to SweBenchEnv
- Add load_from_state implementation to HarborEnv
- Refactor base class to use _restore_from_snapshot helper
- Add test for load_from_state functionality
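The split described above can be sketched as follows. The class and method bodies here are hypothetical stand-ins (including the minimal `Snapshot` dataclass), intended only to show the pattern of subclass-specific construction followed by a shared base-class restore helper:

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Hypothetical stand-in for the PR's EnvironmentSnapshot."""
    state: dict = field(default_factory=dict)


class CodeBaseEnv:
    def _restore_from_snapshot(self, snapshot: Snapshot) -> None:
        # Shared restore logic lives in the base class; subclasses call
        # it after running their own __init__.
        self.state = dict(snapshot.state)


class SweBenchEnv(CodeBaseEnv):
    def __init__(self, tasks: list):
        self.tasks = tasks       # subclass-specific constructor argument
        self.state: dict = {}

    @classmethod
    def load_from_state(cls, snapshot: Snapshot, tasks: list) -> "SweBenchEnv":
        env = cls(tasks)                      # normal construction first
        env._restore_from_snapshot(snapshot)  # then apply the saved state
        return env
```

Keeping `load_from_state` on each subclass avoids forcing a single constructor signature onto the base class while still sharing the restore logic.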
```python
fs_path = snap.snapshot_dir / "container_fs.tar.gz"
if fs_path.exists():
    await container.upload_dir(fs_path, "/")
```
[Logic] This will fail at runtime. `download_dir("/", fs_path)` writes a tarball to `fs_path`, but both our Docker and Daytona container implementations expect `upload_dir(local_path, remote_path)` to be called with `local_path` pointing to a directory so they can walk it and stream a new tar archive (see the existing usage in `HarborEnv._compute_reward`, where we upload an actual directory). When you hand them a `.tar.gz` file here they hit `os.walk`/`tar.add` on a file and raise `NotADirectoryError`, so restoration aborts before the filesystem is restored. Please unpack the archive to a temporary directory (or stream it directly via the container API) and pass that directory to `upload_dir` instead of the tarball path.
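The suggested fix (unpack first, then upload the directory) could look roughly like this. `restore_container_fs` and the exact `upload_dir` call shape are illustrative assumptions, not the repo's actual helper:

```python
import tarfile
import tempfile
from pathlib import Path


async def restore_container_fs(container, snapshot_dir: Path) -> bool:
    """Unpack the snapshot tarball and upload the resulting directory.

    Assumes container.upload_dir(local_dir, remote_path) walks a local
    directory, as the Docker/Daytona implementations described above do.
    Returns False if no filesystem snapshot exists.
    """
    fs_path = snapshot_dir / "container_fs.tar.gz"
    if not fs_path.exists():
        return False
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack the archive into a real directory first...
        with tarfile.open(fs_path, "r:gz") as tar:
            tar.extractall(tmp)
        # ...then hand upload_dir a directory, not the tarball path.
        await container.upload_dir(tmp, "/")
    return True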
File: src/ares/environments/base.py
Line: 496
Summary
Implements snapshot/restore functionality to save and replay episodes from specific checkpoints. Addresses issue #39.
Changes
- Add EnvironmentSnapshot dataclass for serializing environment state
- Add export_state() and load_from_state() methods on CodeBaseEnv
- Add examples/03_state_snapshotting.py

Key Features
Limitations
Test Plan
Usage
Closes #39