feat(server): add VenvStackExecutor for stateful, sandboxed code execution by bxff · Pull Request #74 · tilesprivacy/tiles

bxff · 2026-01-19T17:50:59Z

Features

VenvStackExecutor class for session-based code execution
Shared Python environments via venvstacks with automatic fallback to per-session venvs
Persistent state across multiple tool calls within a session
Kernel-level sandboxing via Apple's sandbox-exec (macOS)
Automatic cleanup of session environments
Network access denial and file system restrictions
Thread-safe executor cache for concurrent sessions

Security Hardening

Path injection prevention with strict regex validation
Path traversal prevention via session ID sanitization
RestrictedUnpickler for safe deserialization of subprocess output
Session-isolated temp directories

VenvStacks Architecture

When venvstacks are built and exported, sessions share pre-built Python environments:

Runtime layer: Python 3.13
Framework layers: datascience (numpy, pandas, matplotlib, scikit-learn, scipy, seaborn, requests, pillow, sympy) or minimal (numpy, requests)
Application layers: interpreter and interpreter-minimal

This reduces session startup time and disk usage. Falls back to per-session venvs when stacks aren't available.

Components

engine.py: VenvStackExecutor class, security helpers, execute_sandboxed_venvstack function
venvstacks_manager.py: VenvStacksManager for building/exporting portable environments
venvstacks.toml: Stack configuration with datascience and minimal frameworks
sandbox.sb: Template sandbox profile for macOS isolation
test_venvstack_sandbox.py: Comprehensive test suite (22 tests)

This provides the foundation for safe, stateful code interpreter functionality.

…ution Introduces a secure code execution engine using venvstacks with macOS sandbox-exec isolation. Features: - VenvStackExecutor class for session-based code execution - Persistent state across multiple tool calls within a session - Isolated virtual environments per session via venvstacks - Kernel-level sandboxing via Apple's sandbox-exec (macOS) - Automatic cleanup of session environments - Network access denial and file system restrictions Components: - engine.py: VenvStackExecutor class, execute_sandboxed_venvstack function - sandbox.sb: Template sandbox profile for macOS isolation - test_venvstack_sandbox.py: Comprehensive test suite This provides the foundation for safe, stateful code interpreter functionality.

- Remove unused 'types' import - Add cross-platform support (Windows Scripts vs Unix bin) - Use base64+pickle for state serialization instead of repr() - Fix possibly unbound variable 'profile_path' issue - Replace bare except clauses with specific exception handling - Add cleanup_executor() function to manage session cache - Align use_sandbox default to False consistently - Improve docstrings with clearer parameter documentation

coderabbitai · 2026-01-19T17:51:08Z

📝 Walkthrough

Walkthrough

Adds a stateful, per-session sandboxed execution system centered on a new VenvStackExecutor and supporting utilities. Implements session ID sanitization, safe pickle deserialization, sandbox profile generation, per-session temp directories, package installation logic, session state persistence, and optional macOS sandbox-exec integration. Introduces a VenvStacksManager to manage venvstacks lifecycles and resolve interpreters. Exposes get_or_create_executor, cleanup_executor, and execute_sandboxed_venvstack, and integrates the executor into the existing execute_sandboxed_code flow. Adds sandbox profile, venvstacks config, tests, and a new dependency on venvstacks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as MemAgentAPI
    participant Manager as VenvStacksManager
    participant Executor as VenvStackExecutor
    participant Sandbox as sandbox-exec / subprocess
    participant FS as Workspace/Temp FS

    Client->>API: execute_sandboxed_venvstack(code, session_id, workspace, timeout, use_sandbox)
    API->>Executor: get_or_create_executor(session_id, workspace)
    Executor->>Manager: get_interpreter_python(framework) / ensure_stacks_ready (if using venvstacks)
    Manager-->>Executor: interpreter_path or fallback info
    Executor->>FS: create session temp dirs, validate paths, load session state (safe pickle)
    Executor->>FS: prepare sandbox profile (if use_sandbox)
    Executor->>Sandbox: launch python subprocess (optionally via sandbox-exec) with venv/venvstack
    Sandbox->>FS: read/write workspace, venv, session state files
    Sandbox-->>Executor: stdout/stderr, exit code, serialized session state
    Executor->>FS: persist new session state (safe pickle), cleanup per policy
    Executor-->>API: (result dict, combined output)
    API-->>Client: return execution result

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately summarizes the main change: adding VenvStackExecutor for stateful, sandboxed code execution, which is the primary focus of all modifications.
Description check	✅ Passed	The PR description is comprehensive and directly related to the changeset, detailing features, security hardening, architecture, and components that all match the file modifications.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Fix all issues with AI agents

In `@server/mem_agent/engine.py`:
- Around line 417-426: The sandbox profile substitution in _get_sandbox_profile
uses raw self.venv_path and self.workspace_path which can contain special
characters and enable sandbox bypass; before doing the replacements (and before
reading/using the profile), validate or escape these paths: in
_get_sandbox_profile validate self.venv_path and self.workspace_path against a
strict safe-path pattern (e.g. allow only alphanumerics, hyphen, underscore,
dot, slash) or explicitly reject/control newlines, quotes, parentheses and other
metacharacters; if a path fails validation, log a warning via logger.warning and
return "" (or raise), otherwise perform the profile.replace using the validated
values; reference SANDBOX_PROFILE_PATH, _get_sandbox_profile, self.venv_path and
self.workspace_path when making the change.
- Around line 606-621: The executor cache (_executor_cache) and accessors
get_or_create_executor and cleanup_executor are not thread-safe; protect
accesses by adding a module-level lock (e.g., _executor_cache_lock =
threading.Lock()) and use it to guard the check-then-create path in
get_or_create_executor (acquire lock before checking session_id and creating
VenvStackExecutor, then release), and also acquire the same lock inside
cleanup_executor when calling cleanup() and deleting the entry to avoid race
conditions and duplicate executor creation for the same session_id; ensure you
import threading and use the same lock for both functions.
- Around line 375-380: The code constructs venv_path using the raw session_id
which permits path traversal; change the initialization in the constructor that
sets self.session_id and self.venv_path to validate or sanitize session_id
(e.g., allow only a safe whitelist like [A-Za-z0-9_-] or replace/encode unsafe
chars) and then compute venv_path; additionally, after joining base_venv_path
and the sanitized id, verify the resulting path is inside base_venv_path using a
secure check (os.path.normpath + os.path.commonpath) and raise an error if
validation fails; update references to self.session_id/self.venv_path
accordingly.
- Around line 562-576: The pickle.loads call that deserializes subprocess output
(the pickle.loads(result.stdout) in server/mem_agent/engine.py which then calls
self._session_state.update) is an accepted-but-documented risk because the
subprocess runs user code; update the repository threat model/docs to explicitly
state this deserialization escalation vector and why it is accepted, and then
either replace pickle.loads with a safer alternative (e.g., JSON via json.loads
for simple state types and updating state accordingly) or implement a
RestrictedUnpickler that only permits safe types before calling
self._session_state.update so that if the sandbox is bypassed the parent process
isn’t trivially compromised.

In `@server/mem_agent/sandbox.sb`:
- Around line 40-44: The sandbox rule currently grants broad file-read* and
file-write* access to subpath "/private/var/folders" and "/tmp" via the allow
expression, which risks cross-session data exposure; change this to only allow
the specific session/workspace temp subdirectory (e.g., the same pattern used
for ${WORKSPACE_PATH}) by replacing the generic subpath entries with a subpath
that references the session-specific temp directory (or construct one and use
that symbol) so only files under that workspace/session are readable and
writable by the process.
- Around line 31-38: The sandbox profile is vulnerable because unescaped
${VENV_PATH} and ${WORKSPACE_PATH} are directly substituted from
StartRequest.memory_path into the profile (engine.py), allowing injection of
sandbox rules; fix it by validating and sanitizing any path before substitution:
either reject paths containing sandbox-significant characters (quotes,
parentheses, semicolons, newlines, backslashes) or escape them safely (escape
quotes, backslashes and remove/control newlines), and centralize this logic in
the function that performs the substitution (the code that replaces
"${VENV_PATH}" / "${WORKSPACE_PATH}" in engine.py) so all inputs are
normalized/validated before being interpolated into the sandbox profile.

🧹 Nitpick comments (5)

server/tests/test_venvstack_sandbox.py (2)
145-154: Unused error variable should be prefixed with underscore.

The error variable is unpacked but never used in this test. Per the static analysis hint, prefix it with an underscore to indicate it's intentionally unused.
💡 Suggested fix
-        result, error = self.executor.execute(code, use_sandbox=True)
+        result, _error = self.executor.execute(code, use_sandbox=True)
165-170: Same unused variable issue here.
💡 Suggested fix
-        result, error = self.executor.execute(code, use_sandbox=True)
+        result, _error = self.executor.execute(code, use_sandbox=True)
server/mem_agent/sandbox.sb (1)

80-81: Remove (trace deny) before production deployment.

As the comment notes, this is for debugging. In production, tracing denied operations could leak information about sandbox behavior and may have performance overhead. Consider gating this with a build-time flag or removing it entirely.
server/mem_agent/engine.py (2)
541-541: Use spread operator for cleaner list construction.

Per static analysis hint, prefer unpacking over concatenation.
💡 Suggested fix
-                    cmd = ["sandbox-exec", "-f", profile_path] + cmd
+                    cmd = ["sandbox-exec", "-f", profile_path, *cmd]
578-583: Consider using logger.exception for better stack traces.

In exception handlers, logger.exception automatically includes the stack trace. Line 582 already uses it but redundantly passes e. Lines 579 loses the traceback.
💡 Suggested fix
         except subprocess.TimeoutExpired:
-            logger.error(f"[SANDBOX] Timeout after {timeout}s")
+            logger.exception(f"[SANDBOX] Timeout after {timeout}s")
             return None, f"Execution timeout ({timeout}s)"
         except Exception as e:
-            logger.exception(f"[SANDBOX] Exception during execution: {e}")
+            logger.exception("[SANDBOX] Exception during execution")
             return None, f"Execution error: {e}"

server/mem_agent/engine.py

server/mem_agent/sandbox.sb

…tacks Security improvements: 1. **Sandbox profile path injection prevention** - Add _validate_sandbox_path() to reject paths with metacharacters - Validate venv_path, workspace_path, and session_temp_path before substituting into sandbox profile - Reject paths containing quotes, parentheses, semicolons, newlines 2. **Thread-safe executor cache** - Add _executor_cache_lock (threading.Lock) to protect cache operations - Guard get_or_create_executor() and cleanup_executor() with lock - Prevent race conditions and duplicate executor creation 3. **Path traversal prevention** - Add _sanitize_session_id() to whitelist alphanumerics, hyphen, underscore - Add _validate_path_containment() using os.commonpath for secure check - Verify venv_path stays inside base_venv_path 4. **Safer pickle deserialization** - Implement RestrictedUnpickler with whitelist of safe types - Add _safe_pickle_loads() to use restricted unpickler - Limits damage if sandbox is bypassed and malicious pickle returned 5. **Session-specific temp directory isolation** - Add session_temp_path for per-session temp files - Update sandbox.sb to use ${SESSION_TEMP_PATH} instead of broad /tmp access - Prevents cross-session data exposure via temp files - Clean up session temp directory on cleanup()

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@server/mem_agent/engine.py`:
- Around line 576-586: The venv creation subprocess.run call in engine.py (the
block that checks python_path and runs [sys.executable, "-m", "venv",
self.venv_path]) has no timeout and can hang; add a reasonable timeout parameter
to subprocess.run (e.g., timeout=30 or configurable via self or class constant),
wrap the call in a try/except that catches subprocess.TimeoutExpired, log the
timeout (including self.venv_path and the timeout value) and raise or handle
cleanup appropriately (ensure self._initialized is not set on failure); keep the
change localized to the venv creation block where subprocess.run is invoked and
reference self.venv_path, python_path, and self._initialized.

🧹 Nitpick comments (4)

server/mem_agent/engine.py (4)

432-458: Consider removing ("builtins", "type") from the safe types whitelist.

The whitelist includes ("builtins", "type") which allows deserializing type objects. While unlikely to be exploited directly, type objects are not typical session state data and including them slightly expands the attack surface.

If type objects aren't needed in session state, consider removing this entry:

     ("builtins", "bool"),
-    ("builtins", "type"),
     ("builtins", "NoneType"),

539-541: Remove unnecessary f-string prefix.

The error message string has an f prefix but contains no placeholders.

-            raise ValueError(
-                f"Invalid session_id: resulting venv_path escapes base directory"
-            )
+            raise ValueError(
+                "Invalid session_id: resulting venv_path escapes base directory"
+            )

760-777: Use logger.exception for better error logging.

When catching exceptions, use logger.exception instead of logger.error to automatically include the stack trace. This aids debugging.

🔧 Suggested fix

             except pickle.UnpicklingError as e:
                 error_msg = f"Unsafe pickle data rejected: {e}"
-                logger.error(f"[SANDBOX] Security: {error_msg}")
+                logger.exception("[SANDBOX] Security: Unsafe pickle data rejected")
                 return None, error_msg
             except Exception as e:
                 error_msg = (
                     f"Failed to parse output: {e}\nStderr: {result.stderr.decode()}"
                 )
-                logger.error(f"[SANDBOX] Parse error: {error_msg}")
+                logger.exception("[SANDBOX] Parse error")
                 return None, error_msg

         except subprocess.TimeoutExpired:
-            logger.error(f"[SANDBOX] Timeout after {timeout}s")
+            logger.exception(f"[SANDBOX] Timeout after {timeout}s")
             return None, f"Execution timeout ({timeout}s)"

733-736: Use spread operator for list concatenation.

Minor style improvement per Ruff RUF005.

-                    cmd = ["sandbox-exec", "-f", profile_path] + cmd
+                    cmd = ["sandbox-exec", "-f", profile_path, *cmd]

coderabbitai · 2026-01-19T18:13:06Z

server/mem_agent/engine.py

+        if not os.path.exists(python_path):
+            # Create venv using system Python
+            subprocess.run(
+                [sys.executable, "-m", "venv", self.venv_path],
+                check=True,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+            )
+            logger.info(f"Created venv at {self.venv_path}")
+
+        self._initialized = True


⚠️ Potential issue | 🟡 Minor

Add timeout to venv creation subprocess.

The subprocess.run call for venv creation has no timeout. If the venv creation hangs (e.g., due to disk issues or filesystem locks), the calling thread will block indefinitely.

🔧 Suggested fix

if not os.path.exists(python_path): # Create venv using system Python subprocess.run( [sys.executable, "-m", "venv", self.venv_path], check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, + timeout=60, # Reasonable timeout for venv creation ) logger.info(f"Created venv at {self.venv_path}")

🧰 Tools

🪛 Ruff (0.14.13)

578-578: subprocess call: check for execution of untrusted input

(S603)

🤖 Prompt for AI Agents

In `@server/mem_agent/engine.py` around lines 576 - 586, The venv creation subprocess.run call in engine.py (the block that checks python_path and runs [sys.executable, "-m", "venv", self.venv_path]) has no timeout and can hang; add a reasonable timeout parameter to subprocess.run (e.g., timeout=30 or configurable via self or class constant), wrap the call in a try/except that catches subprocess.TimeoutExpired, log the timeout (including self.venv_path and the timeout value) and raise or handle cleanup appropriately (ensure self._initialized is not set on failure); keep the change localized to the venv creation block where subprocess.run is invoked and reference self.venv_path, python_path, and self._initialized.

Add proper venvstacks integration to VenvStackExecutor: - VenvStacksManager for building/exporting portable venvstacks - venvstacks.toml config with datascience and minimal frameworks - Automatic fallback to per-session venv when venvstacks not available - Shared Python environment reduces session startup time - Package installation blocked in shared mode (by design) The executor now tries venvstacks first, falling back to creating a per-session venv if stacks aren't built. This enables: - Fast session startup with pre-built environments - Consistent package versions across all sessions - Lower disk usage (shared layers) Added tests for security helpers and venvstacks integration.

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

server/mem_agent/engine.py (1)
322-326: Use RestrictedUnpickler for legacy sandbox output parsing.

execute_sandboxed_code still deserializes user-controlled subprocess output with pickle.loads, reintroducing the escalation vector you just mitigated elsewhere.
🔧 Suggested fix
-        local_vars, error_msg = pickle.loads(result.stdout)
+        local_vars, error_msg = _safe_pickle_loads(result.stdout)
+    except pickle.UnpicklingError as e:
+        return None, f"Unsafe pickle data rejected: {e}"

🤖 Fix all issues with AI agents

In `@server/mem_agent/engine.py`:
- Around line 365-381: The whitelist currently rejects spaces and callers like
_get_sandbox_profile silently disable sandboxing when validation fails; update
_validate_sandbox_path to permit space characters in allowed paths (adjust the
_SAFE_PATH_PATTERN to include literal space) and change _get_sandbox_profile so
that when use_sandbox is True and _validate_sandbox_path(path) returns False it
fails closed (raise a clear exception such as ValueError or RuntimeError)
instead of falling back to no sandbox; apply the same fail-closed behavior to
the other related validation usage around the _get_sandbox_profile logic.
- Line 775: The execute() function currently logs raw user code with logger.info
which can leak secrets; change that call to either logger.debug or remove the
code payload and log only safe metadata (e.g., code length, language, or a
redacted preview). Update the call to logger.debug(f"[SANDBOX] execute() called
with code preview: {redacted_preview}") or log metadata like
logger.info(f"[SANDBOX] execute() called, code_length=%d", len(code)) and ensure
any preview is redacted and only emitted when logger.isEnabledFor(logging.DEBUG)
to avoid PII leakage; modify the existing logger.info usage accordingly.
- Around line 685-691: The sandbox whitelist currently always uses
venvstacks_manager.export_dir which fails for built-but-not-exported stacks;
change the selection logic in _get_sandbox_profile()/where python_env_path is
set so that when self._using_venvstacks and self._venvstacks_manager is true you
prefer export_dir only if the stack was exported (or export_dir is
present/valid), otherwise use self._venvstacks_manager.build_dir; if no
venvstacks manager or not using venvstacks, keep using self.venv_path. Update
references to python_env_path, _using_venvstacks, _venvstacks_manager,
export_dir and build_dir accordingly.
- Around line 785-817: The generated multi-line string assigned to wrapped_code
is indented inside the method, causing leading spaces in the produced script and
an IndentationError; import textwrap at the top of the file and wrap the
f-string template with textwrap.dedent(...) (i.e., set wrapped_code =
textwrap.dedent(f'''...''')) so the script has no unintended leading indentation
while keeping references like wrapped_code and self.workspace_path unchanged.
- Around line 383-411: The current _sanitize_session_id function can collapse
distinct session IDs into the same sanitized string; instead, detect when
sanitization changes the input (use _SAFE_SESSION_ID_PATTERN to check) and if it
does either raise ValueError or (preferred) append a stable hash suffix to
preserve uniqueness: compute a short hex digest (e.g. first 8 chars of
hashlib.sha256(session_id.encode()).hexdigest()) and append it (preceded by an
underscore) to the sanitized value; also add import hashlib near the top and
update tests to expect the hash-suffixed sanitized ID when invalid characters
are present.

In `@server/mem_agent/venvstacks_manager.py`:
- Around line 371-389: The function ensure_stacks_ready currently returns
success if manager.is_exported() is true even when the interpreter from
manager.get_interpreter_python(framework) is missing or invalid; change the
logic to always validate the interpreter before returning success: call
manager.get_interpreter_python(framework) and verify python and python.exists()
both when short-circuiting on manager.is_exported() and after any build/export
steps, and if the interpreter is missing attempt a rebuild/export (via
manager.build_stacks() and manager.export_stacks()) or return False with an
explicit error message referencing manager.export_dir and the missing
interpreter; ensure you reference the existing symbols manager.is_exported(),
manager.get_interpreter_python(framework), python.exists(),
manager.build_stacks(), manager.export_stacks(), and manager.export_dir when
locating and updating the code.
- Around line 126-147: The build_stacks method invokes the venvstacks CLI which
always writes output into a "__venvstacks__" subfolder, but is_built() currently
checks self.build_dir directly causing false negatives and infinite rebuilds in
ensure_stacks_ready(); update is_built() to check for (self.build_dir /
"__venvstacks__") existence (or, alternatively, switch build_stacks to use the
venvstacks Python API and pass the build_dir so outputs land in self.build_dir),
and keep references to the existing methods build_stacks, is_built, and
ensure_stacks_ready when making the change so the readiness check aligns with
where the CLI actually writes artifacts.

In `@server/tests/test_venvstack_sandbox.py`:
- Around line 141-176: The tests test_sandbox_network_denied and
test_sandbox_file_access_restricted capture two-return values from
self.executor.execute but bind the second to an unused name `error`; rename that
binding to `_` or `_error` (e.g., `result, _error = self.executor.execute(...)`
or `result, _ = ...`) in both functions to satisfy the linter while preserving
the returned result handling.

♻️ Duplicate comments (1)

server/mem_agent/engine.py (1)

615-621: Add timeout to venv creation subprocess.

This can hang indefinitely if venv creation stalls. (Duplicate of prior review note.)

🔧 Suggested fix

         if not os.path.exists(python_path):
             # Create venv using system Python
             subprocess.run(
                 [sys.executable, "-m", "venv", self.venv_path],
                 check=True,
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL,
+                timeout=60,
             )

server/mem_agent/engine.py

coderabbitai · 2026-01-19T18:52:48Z

server/mem_agent/engine.py

+        # Prepare code with state restoration/saving
+        wrapped_code = f'''
+import pickle
+import base64
+import sys
+import os
+
+# Restore session state safely via pickle
+_state_b64 = "{state_b64}"
+_session_state = pickle.loads(base64.b64decode(_state_b64))
+
+# Make session state available as globals
+globals().update(_session_state)
+
+# Change to workspace
+os.chdir({repr(self.workspace_path)})
+
+# Execute user code
+{code}
+
+# Capture updated state (only picklable values)
+_new_state = {{}}
+for k, v in dict(locals()).items():
+    if not k.startswith("_"):
+        try:
+            pickle.dumps(v)
+            _new_state[k] = v
+        except Exception:
+            pass
+
+# Output state
+sys.stdout.buffer.write(pickle.dumps((_new_state, None)))
+'''


⚠️ Potential issue | 🔴 Critical

Wrapped script likely raises IndentationError.

The triple-quoted string is indented within the method, so the generated script starts with leading spaces and fails at top-level. Dedent the template before writing.

🔧 Suggested fix

- wrapped_code = f''' + wrapped_code = textwrap.dedent(f""" import pickle import base64 import sys import os @@ sys.stdout.buffer.write(pickle.dumps((_new_state, None))) -''' +""")

Add import textwrap near the top of the file.

🧰 Tools

🪛 Ruff (0.14.13)

800-800: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents

In `@server/mem_agent/engine.py` around lines 785 - 817, The generated multi-line string assigned to wrapped_code is indented inside the method, causing leading spaces in the produced script and an IndentationError; import textwrap at the top of the file and wrap the f-string template with textwrap.dedent(...) (i.e., set wrapped_code = textwrap.dedent(f'''...''')) so the script has no unintended leading indentation while keeping references like wrapped_code and self.workspace_path unchanged.

server/mem_agent/venvstacks_manager.py

server/tests/test_venvstack_sandbox.py

Security improvements: - _get_sandbox_profile now fails closed when use_sandbox=True and validation fails (raises RuntimeError/ValueError) instead of silently disabling the sandbox - _sanitize_session_id appends a hash suffix when sanitization changes the input, preventing collisions like 'a/b' and 'a_b' - _SAFE_PATH_PATTERN now allows space characters for paths like '/Users/test/My Documents' - execute() now logs only safe metadata (code_length) instead of raw user code which could leak secrets Bug fixes: - _get_sandbox_profile now prefers export_dir if exported, otherwise falls back to build_dir for built-but-not-exported stacks - is_built() now checks for __venvstacks__ marker directory to align with where venvstacks CLI actually writes artifacts - ensure_stacks_ready() now always validates the interpreter exists before returning success, attempting rebuild if missing - Removed leading indentation from wrapped_code to prevent IndentationError in generated scripts Test updates: - Updated _sanitize_session_id tests to expect hash suffix - Added test for spaces in paths - Renamed unused 'error' variables to '_error' in sandbox tests

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@server/mem_agent/venvstacks_manager.py`:
- Around line 95-106: The subprocess.run call that executes venvstacks (the call
assigning to result using full_cmd) needs a timeout to avoid hanging; add a
timeout parameter (preferably configurable via a function/class option or
derived from the venvstacks subcommand like shorter for "show" and longer for
"build"/"lock"), pass it into subprocess.run, and catch
subprocess.TimeoutExpired to log via logger.error and raise VenvStacksError
(preserving context) when check is True; ensure existing behavior for non-zero
returncode (result.returncode, result.stderr) remains unchanged.

♻️ Duplicate comments (1)

server/mem_agent/engine.py (1)
624-629: Add timeout to venv creation subprocess.

The subprocess.run for venv creation lacks a timeout. If venv creation hangs (e.g., due to disk issues or filesystem locks), the thread blocks indefinitely.
🔧 Suggested fix
             subprocess.run(
                 [sys.executable, "-m", "venv", self.venv_path],
                 check=True,
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL,
+                timeout=120,  # Reasonable timeout for venv creation
             )

🧹 Nitpick comments (5)

server/tests/test_venvstack_sandbox.py (1)

207-212: Test classes defined after if __name__ == "__main__" block.

The TestSecurityHelpers and subsequent test classes are defined after the if __name__ == "__main__": block (lines 208-209). While this is valid Python, it's unconventional and can cause confusion. Consider moving the if __name__ == "__main__" block to the end of the file.

♻️ Suggested reordering

Move lines 208-209 to the end of the file (after line 387) so all test classes are defined before the main entry point.
server/mem_agent/engine.py (3)
559-562: Remove extraneous f prefix from f-string without placeholders.

Line 561 uses an f-string but contains no interpolation placeholders.
🔧 Suggested fix
         if not _validate_path_containment(self.venv_path, self.base_venv_path):
             raise ValueError(
-                f"Invalid session_id: resulting venv_path escapes base directory"
+                "Invalid session_id: resulting venv_path escapes base directory"
             )
917-926: Use logging.exception in exception handlers for automatic traceback.

In exception handlers, logging.exception automatically includes the traceback. Also, line 932's logger.exception(f"... {e}") includes the exception redundantly.
♻️ Suggested fix
             except pickle.UnpicklingError as e:
                 error_msg = f"Unsafe pickle data rejected: {e}"
-                logger.error(f"[SANDBOX] Security: {error_msg}")
+                logger.exception("[SANDBOX] Security: %s", error_msg)
                 return None, error_msg
             except Exception as e:
                 error_msg = (
                     f"Failed to parse output: {e}\nStderr: {result.stderr.decode()}"
                 )
-                logger.error(f"[SANDBOX] Parse error: {error_msg}")
+                logger.exception("[SANDBOX] Parse error: %s", error_msg)
                 return None, error_msg

         except subprocess.TimeoutExpired:
-            logger.error(f"[SANDBOX] Timeout after {timeout}s")
+            logger.exception("[SANDBOX] Timeout after %ds", timeout)
             return None, f"Execution timeout ({timeout}s)"
         except Exception as e:
-            logger.exception(f"[SANDBOX] Exception during execution: {e}")
+            logger.exception("[SANDBOX] Exception during execution")
             return None, f"Execution error: {e}"
884-884: Consider using list unpacking for command construction.

Ruff suggests using ["sandbox-exec", "-f", profile_path, *cmd] instead of concatenation for cleaner code.
♻️ Suggested fix
-                cmd = ["sandbox-exec", "-f", profile_path] + cmd
+                cmd = ["sandbox-exec", "-f", profile_path, *cmd]
server/mem_agent/venvstacks_manager.py (1)
344-353: Consider thread-safe initialization for global manager.

The get_venvstacks_manager() function has a potential race condition if called from multiple threads simultaneously. While the manager is mostly stateless after initialization (just paths), consider using a lock for consistency with the thread-safe patterns used in engine.py.
♻️ Suggested fix
+import threading
+
 # Global manager instance (lazy initialization)
 _global_manager: Optional[VenvStacksManager] = None
+_global_manager_lock = threading.Lock()


 def get_venvstacks_manager() -> VenvStacksManager:
     """Get or create the global VenvStacksManager instance."""
     global _global_manager
-    if _global_manager is None:
-        _global_manager = VenvStacksManager()
+    if _global_manager is None:
+        with _global_manager_lock:
+            if _global_manager is None:
+                _global_manager = VenvStacksManager()
     return _global_manager

coderabbitai · 2026-01-19T19:14:24Z

server/mem_agent/venvstacks_manager.py

+        result = subprocess.run(
+            full_cmd,
+            capture_output=True,
+            text=True,
+        )
+
+        if result.returncode != 0:
+            logger.error(f"venvstacks failed: {result.stderr}")
+            if check:
+                raise VenvStacksError(f"venvstacks command failed: {result.stderr}")
+
+        return result


⚠️ Potential issue | 🟡 Minor

Add timeout to venvstacks subprocess calls.

The subprocess.run call lacks a timeout. Venvstacks operations (especially build and lock) can take significant time or hang, blocking the caller indefinitely.

🔧 Suggested fix

+ # Long timeout for build operations which can take several minutes + timeout = 600 # 10 minutes + result = subprocess.run( full_cmd, capture_output=True, text=True, + timeout=timeout, )

Consider making the timeout configurable or varying it based on the command type (e.g., shorter for show, longer for build).

🧰 Tools

🪛 Ruff (0.14.13)

95-95: subprocess call: check for execution of untrusted input

(S603)

104-104: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In `@server/mem_agent/venvstacks_manager.py` around lines 95 - 106, The subprocess.run call that executes venvstacks (the call assigning to result using full_cmd) needs a timeout to avoid hanging; add a timeout parameter (preferably configurable via a function/class option or derived from the venvstacks subcommand like shorter for "show" and longer for "build"/"lock"), pass it into subprocess.run, and catch subprocess.TimeoutExpired to log via logger.error and raise VenvStacksError (preserving context) when check is True; ensure existing behavior for non-zero returncode (result.returncode, result.stderr) remains unchanged.

bxff added 2 commits January 19, 2026 23:05

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

madclaws linked an issue Jan 20, 2026 that may be closed by this pull request

Use venvstack for portable python runtime #21

Closed

Comments

Conversation

bxff commented Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Features

Security Hardening

VenvStacks Architecture

Components

Uh oh!

coderabbitai bot commented Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bxff commented Jan 19, 2026 •

edited

Loading

coderabbitai bot commented Jan 19, 2026 •

edited

Loading