Feature - ENV Selective Inheritance #2
Open
ExactDoug wants to merge 23 commits into standardbeagle:main from ExactDoug:main
Conversation
- Add MergeEnvironment() to merge parent env with overrides
- Update Connect() to use inherited environment
- Handle Windows case-insensitive env vars (PATH vs Path)
- Add comprehensive unit tests with platform-specific cases
- Maintain 100% backward compatibility (no schema changes)

Fixes: Environment variables now inherit from the parent process and apply per-server overrides, ensuring servers receive critical system variables (PATH, HOME, SSL certs, proxies) while allowing custom configuration.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
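A minimal sketch of what such a merge could look like. The function name `mergeEnvironment` and its signature are illustrative assumptions, not the actual API; the Windows case-insensitivity handling follows the commit's "PATH vs Path" note.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// mergeEnvironment (hypothetical): start from the parent process environment
// and apply per-server overrides. On Windows, variable names are
// case-insensitive, so "Path" from the parent and "PATH" from an override
// must collapse into a single entry.
func mergeEnvironment(parent []string, overrides map[string]string) []string {
	caseInsensitive := runtime.GOOS == "windows"
	normalize := func(k string) string {
		if caseInsensitive {
			return strings.ToUpper(k)
		}
		return k
	}

	merged := map[string]string{} // normalized key -> value
	names := map[string]string{}  // normalized key -> display name
	for _, kv := range parent {
		if i := strings.IndexByte(kv, '='); i > 0 {
			k, v := kv[:i], kv[i+1:]
			merged[normalize(k)] = v
			names[normalize(k)] = k
		}
	}
	for k, v := range overrides { // per-server overrides win
		merged[normalize(k)] = v
		names[normalize(k)] = k
	}

	out := make([]string, 0, len(merged))
	for nk, v := range merged {
		out = append(out, fmt.Sprintf("%s=%s", names[nk], v))
	}
	return out
}

func main() {
	env := mergeEnvironment(
		[]string{"PATH=/usr/bin", "HOME=/home/u"},
		map[string]string{"MY_VAR": "x"},
	)
	fmt.Println(len(env)) // 3 entries: PATH, HOME, MY_VAR
}
```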
- Add tier-based inheritance modes (none, tier1, tier1+tier2, all)
- Tier1: Baseline vars (PATH, HOME, USER, SHELL, LANG, LC_ALL, TZ, TMPDIR, TEMP, TMP)
- Tier2: Network/TLS vars (SSL_CERT_FILE, SSL_CERT_DIR, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, NODE_EXTRA_CA_CERTS)
- Support extras, prefixes, denylist, allow_denied_if_explicit
- Block HTTP_PROXY/HTTPS_PROXY/http_proxy/https_proxy/NO_PROXY/no_proxy by default (httpoxy mitigation)
- Add comprehensive unit tests with 12 test cases
- Update all 4 integration points (ProxyServer, Discoverer, DynamicProxyServer, DynamicWrapper)
- Add validation for inheritance config
- Default mode: tier1 (security-first)

Security improvements:
- Prevents accidental secret leakage to upstream MCP servers
- Controlled inheritance with auditable configuration
- Per-server and proxy-level defaults
- Explicit override support

Test coverage:
- 12 new unit tests for envbuilder (all passing)
- 9 existing tests maintained (zero regressions)
- Manual testing verified tier1, tier1+tier2, and mode=all scenarios
- HTTP_PROXY blocking and allow_denied_if_explicit override tested

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
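The tier filter described above might look roughly like the sketch below. The variable lists and mode names come from the commit message; `buildEnv`, its signature, and the omission of `allow_denied_if_explicit` are illustrative simplifications, not the real envbuilder API.

```go
package main

import (
	"fmt"
	"strings"
)

// Tier and denylist contents are quoted from the commit message.
var tier1 = []string{"PATH", "HOME", "USER", "SHELL", "LANG", "LC_ALL", "TZ", "TMPDIR", "TEMP", "TMP"}
var tier2 = []string{"SSL_CERT_FILE", "SSL_CERT_DIR", "REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "NODE_EXTRA_CA_CERTS"}
var denyByDefault = []string{"HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy", "NO_PROXY", "no_proxy"}

// buildEnv (hypothetical): select which parent variables a server may see.
// allow_denied_if_explicit is omitted here for brevity, so denylisted vars
// stay blocked even when listed in extras.
func buildEnv(parent map[string]string, mode string, extras, prefixes []string) map[string]string {
	allowed := map[string]bool{}
	switch mode {
	case "none":
		// nothing inherited; only extras below
	case "tier1":
		for _, k := range tier1 {
			allowed[k] = true
		}
	case "tier1+tier2", "all": // "all" still means tier1+tier2, not everything
		for _, k := range tier1 {
			allowed[k] = true
		}
		for _, k := range tier2 {
			allowed[k] = true
		}
	}
	for _, k := range extras {
		allowed[k] = true
	}

	denied := map[string]bool{}
	for _, k := range denyByDefault {
		denied[k] = true
	}

	out := map[string]string{}
	for k, v := range parent {
		if denied[k] {
			continue // httpoxy mitigation: proxy vars blocked by default
		}
		if allowed[k] {
			out[k] = v
			continue
		}
		for _, p := range prefixes {
			if strings.HasPrefix(k, p) {
				out[k] = v
				break
			}
		}
	}
	return out
}

func main() {
	parent := map[string]string{"PATH": "/usr/bin", "AWS_SECRET_ACCESS_KEY": "s", "DATTO_URL": "u", "HTTP_PROXY": "p"}
	env := buildEnv(parent, "tier1", nil, []string{"DATTO_"})
	fmt.Println(env["PATH"], env["DATTO_URL"], env["AWS_SECRET_ACCESS_KEY"] == "", env["HTTP_PROXY"] == "")
}
```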
…ance

When DynamicWrapper creates ProxyServer, it pre-assigns the baseServer instance (which has management tools registered) to proxyServer.mcpServer before calling Initialize(). Previously, ProxyServer.Initialize() unconditionally created a new mcpServer instance, causing static server tools to be registered to an orphaned server that never handled requests. This resulted in only management tools being exposed (5 tools) while static config server tools (26 drmm_* tools) were invisible.

Fix: Only create a new mcpServer if one doesn't already exist. This ensures that static tools are registered to the same server instance that handles incoming requests.

Result: All 31 tools now properly exposed (5 management + 26 static server).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove trailing whitespace in main.go
- Add clarifying comment about mark3labs/mcp-go stdio compatibility
- Remove mcp-debug binary from root (builds to ./bin/mcp-debug instead)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add ENV_INHERITANCE.md with complete guide (31 KB)
- Update README.md with environment inheritance section
- Add 4 example configs (basic, tier2, advanced, none)
- Add 4 test fixtures for inheritance testing
- Document tier system, modes, security rationale, troubleshooting

Documentation marked as DRAFT pending broader real-world testing.

Successfully tested with the Datto RMM MCP server:
- Tier1 vars inherited (PATH, HOME, USER, SHELL, locale)
- DATTO_* prefix matching working
- Secrets properly blocked (AWS keys, etc.)
- All 31 tools exposed and functional

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add CLAUDE.md with project context and development guidelines for AI assistants working on the mcp-debug codebase. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changed json.MarshalIndent() to json.Marshal() in EnableRecording()
to fix critical playback incompatibility. The playback parser expects
line-delimited JSON (JSONL), but MarshalIndent creates multi-line
pretty-printed JSON that breaks parsing.
The header was being written as:
{
"start_time": "...",
"server_info": "..."
}
But the parser reads line-by-line, so each line failed JSON parsing
and the entire header was silently ignored.
Now writes single-line JSON as expected:
{"start_time":"...","server_info":"...","messages":[]}
Verified:
- Recording file creates valid JSONL format
- Playback parser successfully loads sessions
- Header fields are correctly parsed
Fixes: Recording → playback pipeline now functional
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Static servers from config.yaml were using proxy.CreateProxyHandler, which had no recording capability. Only dynamically added servers (via server_add) used DynamicWrapper.createDynamicProxyHandler with recording.

Changes:
- Add RecorderFunc type and optional recorder parameter to proxy.CreateProxyHandler
- Add recording calls for requests, responses, and errors in CreateProxyHandler
- Add recorderFunc field to ProxyServer
- Inject DynamicWrapper.recordMessage into ProxyServer when recording enabled
- Update CreateHandlerForTool to accept optional recorder

This fixes the critical bug where recording files were created but captured zero messages when using static servers.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added detailed documentation for the session recording feature in proxy mode.

New file:
- docs/RECORDING.md - Complete guide covering:
  - Recording format (JSONL structure)
  - What gets recorded (static/dynamic servers, management tools, errors)
  - Playback modes (client/server)
  - Common workflows (debugging, regression testing, documentation)
  - Tips & best practices (filtering, sensitive data, file management)
  - Troubleshooting guide
  - Current limitations and future enhancements

Updated:
- README.md - Added link to recording documentation
- README.md - Clarified what gets recorded (all servers, management tools)

The recording feature is now fully functional and documented for:
- Static servers (from config.yaml)
- Dynamic servers (via server_add)
- All management operations
- Request/response pairs with timestamps
- Error responses

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix: Enable Recording for Proxy Mode (Static & Dynamic Servers)
Corrected documentation to match the more secure implementation:
- Clarified that mode "none" means no inheritance (only explicit env: values)
- Clarified that mode "all" only inherits Tier 1 + Tier 2 (not all parent vars)
- Emphasized Tier 1 as the guaranteed baseline (always inherited unless explicitly denied)
- Updated mode descriptions and examples throughout
- Corrected troubleshooting section to reflect actual behavior
- Added security benefits of the actual implementation

Documentation now accurately represents the implemented security model.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
docs: correct ENV_INHERITANCE documentation to match implementation
…ation
Allows static servers (from config.yaml) to be managed dynamically using
server_disconnect/server_reconnect while preserving all configuration:
- Environment variables (including sensitive credentials)
- Inheritance settings (tier1/tier2 modes)
- Timeout configuration
- All other config.yaml settings
Key changes:
- populateStaticServers() adds static servers to dynamicServers map on init
- server_reconnect command parameter now optional
- When command omitted, uses stored ServerConfig (preserves env vars)
- When command provided, creates new config (existing behavior)
Use case: Hot-swap server binaries during development without exposing
secrets to MCP client. Particularly useful for servers with API keys
and credentials in config.yaml.
Example workflow:
server_disconnect: {name: "my-server"}
server_reconnect: {name: "my-server"} # Uses stored config with secrets
⚠️ NEEDS TESTING: Implementation complete but requires validation with
real-world MCP servers (e.g., Datto RMM with API credentials).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…nagement

Enable dynamic management of static servers with config preservation
When recording is enabled, all tool responses now include metadata showing the recording file path and purpose. This helps users understand that their interactions are being logged.

Metadata format:
  📹 Recording: session.jsonl
  Full path: /path/to/session.jsonl
  Purpose: JSON-RPC message log for debugging and playback testing

Changes:
- Add recordFilename field to DynamicWrapper to store recording path
- Create addRecordingMetadata() helper to inject metadata into responses
- Update all tool response injection points (static, dynamic, management)
- Add metadataFunc parameter to CreateProxyHandler for static tools
- Update documentation in RECORDING.md and README.md

The metadata is:
- MCP protocol compliant (uses standard ContentItem array)
- Only active when the --record flag is used
- Thread-safe with the existing recording mutex
- Includes both relative and absolute paths

Tested with the datto-rmm MCP server: all 8 tool types work correctly, with metadata appearing in responses and properly recorded to JSONL.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…cement

Add recording metadata to tool responses
This commit fixes three critical bugs introduced by recent feature merges:

1. Static Server Registration Failure
- populateStaticServers() now tracks failed servers instead of silently skipping them
- Failed servers are added to dynamicServers with IsConnected=false
- Enables server_reconnect to work for never-connected servers
- Stores discovery error messages for debugging
Files: integration/proxy_server.go, integration/dynamic_wrapper.go

2. Stdio Communication Race Conditions
- Added requestMu mutex to serialize all I/O operations in StdioClient
- Removed async goroutine in sendRequest() to prevent concurrent pipe access
- Implemented copy-on-use pattern in createDynamicProxyHandler to prevent use-after-free
- Fixes "write |1: broken pipe" errors during concurrent tool calls
Files: client/stdio_client.go, integration/dynamic_wrapper.go

3. Recording Metadata Content Mutation
- Implemented copy-on-write in addRecordingMetadata() to avoid mutating input
- Creates a new CallToolResult instead of appending to the existing Content slice
- Prevents protocol violations from shared slice mutations
File: integration/dynamic_wrapper.go

All fixes are backward compatible and maintain existing functionality. Tests pass with the race detector enabled. Fixes issues reported in the user's Datto RMM server workflow.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
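The copy-on-write fix in item 3 can be sketched generically. The types below are simplified stand-ins (the real code operates on mcp-go's CallToolResult); the point is that appending to a shared slice can mutate the caller's backing array, while copying first cannot.

```go
package main

import "fmt"

// Simplified stand-ins for the mcp-go result types.
type ContentItem struct{ Text string }
type CallToolResult struct{ Content []ContentItem }

// addRecordingMetadata (sketch): return a new result rather than
// appending to the input's Content slice, so callers holding the
// original never observe a mutation.
func addRecordingMetadata(in *CallToolResult, note string) *CallToolResult {
	out := &CallToolResult{
		// fresh backing array: the input's slice is never touched
		Content: make([]ContentItem, len(in.Content), len(in.Content)+1),
	}
	copy(out.Content, in.Content)
	out.Content = append(out.Content, ContentItem{Text: note})
	return out
}

func main() {
	orig := &CallToolResult{Content: []ContentItem{{Text: "result"}}}
	annotated := addRecordingMetadata(orig, "📹 Recording: session.jsonl")
	fmt.Println(len(orig.Content), len(annotated.Content)) // 1 2: input unchanged
}
```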
The sendRequest() method was checking c.connected while holding requestMu, but the connected field is protected by c.mu (used in Connect() and Close()). This caused a data race where sendRequest could read stale values of the connected field.

Fix: Read the connected field while holding c.mu, then proceed with requestMu for I/O operations.

This resolves the issue where reconnected servers would fail with 'client not connected' errors despite successful reconnection.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
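A minimal sketch of the two-mutex rule described above (the type and method bodies are illustrative, not the actual StdioClient): the connection flag is read under its own mutex, and only then is the I/O mutex taken.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// stdioClient (illustrative): mu guards the connected flag,
// requestMu serializes pipe I/O. Reading connected while holding
// only requestMu would be a data race.
type stdioClient struct {
	mu        sync.Mutex
	requestMu sync.Mutex
	connected bool
}

func (c *stdioClient) Connect() {
	c.mu.Lock()
	c.connected = true
	c.mu.Unlock()
}

func (c *stdioClient) sendRequest() error {
	// Correct order: read the flag under its own mutex first...
	c.mu.Lock()
	connected := c.connected
	c.mu.Unlock()
	if !connected {
		return errors.New("client not connected")
	}
	// ...then serialize the I/O separately, without holding c.mu.
	c.requestMu.Lock()
	defer c.requestMu.Unlock()
	// (write request / read response would happen here)
	return nil
}

func main() {
	c := &stdioClient{}
	fmt.Println(c.sendRequest()) // client not connected
	c.Connect()
	fmt.Println(c.sendRequest()) // <nil>
}
```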
…ferences

This commit fixes three critical bugs in the server_disconnect/server_reconnect workflow:

1. Stale Client Removal (CRITICAL)
- server_disconnect now removes the closed client from the w.proxyServer.clients list
- Previously, dead clients remained in the list and could be found by populateStaticServers()
- This caused "client not connected" errors after reconnect
File: integration/dynamic_wrapper.go (lines 550-561)

2. Mutex Protection for Client List Access (CRITICAL)
- server_reconnect now acquires w.proxyServer.mu before accessing w.proxyServer.clients
- Handles the case where the client was removed by disconnect (appends if not found)
- Prevents a data race on slice access between reconnect and other operations
File: integration/dynamic_wrapper.go (lines 703-722)

3. Atomic State Transition (HIGH)
- Deferred setting serverInfo.IsConnected=true until AFTER all state updates complete
- Client list, registry, and all internal state updated before marking as connected
- Eliminates the race window where IsConnected=true but the proxy doesn't know about the new client
File: integration/dynamic_wrapper.go (lines 700, 747-748)

Root cause: The disconnect handler was not cleaning up the proxy server's client list, leaving stale references that would be found and used after reconnect, causing tool calls to fail with "client not connected" errors.

Testing:
- Build successful
- All tests pass with the race detector (go test -race ./...)
- No data races detected

This resolves the issue where server_reconnect succeeds but subsequent tool calls fail because they use the old dead client instead of the new connected client.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…esses

This commit fixes remaining race conditions in StdioClient where the connected flag was being read without proper mutex protection.

Bugs fixed:
1. CallTool() was checking c.connected without holding c.mu
2. Initialize() was checking c.connected without holding c.mu
3. ListTools() was checking c.connected without holding c.mu

These created race conditions where methods could read stale values of the connected flag, causing "client not connected" errors even when the client was actually connected.

Changes:
- Added mutex-protected reads of the connected flag in CallTool, Initialize, ListTools
- Added debug logging to track connected flag state transitions:
  * Connect() logs when connected=true is set
  * CallTool() logs the connected state when called
  * Close() logs when connected=false is set
- Added "log" import to the client package

Pattern used: all methods now follow the same pattern as sendRequest():

```go
c.mu.Lock()
connected := c.connected
c.mu.Unlock()
```

This ensures atomic reads of the connected flag without holding the lock during expensive operations.

Testing:
- Build successful
- Race detector clean (go test -race ./...)
- Debug logging added for troubleshooting

Created docs/debugging-plan.md documenting the investigation and fixes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
After implementing mutex protection and debug logging (commit 81c9f04), user testing revealed the issue persists. Debug logs provided critical evidence:

04:33:54 - Connect() SUCCESS: datto-rmm - connected=true (NEW CLIENT)
04:34:57 - CallTool(): connected=false (OLD CLIENT)

This proves tool calls use the OLD client, not the NEW one, after reconnect.

Root cause identified:
- Static servers use CreateProxyHandler(), which captures the client in a closure
- Closures are immutable: reconnect creates a new client, but handlers keep using the old one
- Dynamic servers use createDynamicProxyHandler(), which looks up the current client
- The dynamic pattern works because it doesn't capture a client reference

Solution: make static servers use the dynamic handler pattern to eliminate closure capture. The next commit implements this fix.

Updated debugging-plan.md to document this investigation and its findings.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This fixes the root cause of tool calls failing after server_reconnect.

Problem: Static servers from config.yaml used CreateProxyHandler(), which captured the client reference in a closure at initialization time. When server_reconnect created a new client instance, the old handler closures still referenced the disconnected client, causing "client not connected" errors. Debug evidence showed tool calls using the OLD client (connected=false) instead of the NEW client (connected=true) after reconnect.

Solution: Make all servers (static and dynamic) use the dynamic handler pattern that looks up the current client at call time instead of capturing it in a closure.

Changes:
1. integration/proxy_server.go (lines 102-115):
- Removed static handler creation with CreateProxyHandler()
- Tools still registered in the registry (needed for discovery)
- Note added that handlers will be created by DynamicWrapper
2. integration/dynamic_wrapper.go:
- Added createHandlersForAllTools() method, called after populateStaticServers() in Initialize()
- Creates dynamic handlers for ALL tools using createDynamicProxyHandler()
- Dynamic handlers look up the current client from the dynamicServers map at call time

Pattern:

```go
// OLD (broken): client captured in closure at init time
handler := proxy.CreateProxyHandler(mcpClient, ...)

// NEW (working): client looked up at call time
handler := w.createDynamicProxyHandler(serverName, toolName)
// -> looks up w.dynamicServers[serverName].Client on every call
```

Benefits:
- Static servers can now be hot-swapped like dynamic servers
- server_reconnect updates the dynamicServers map with the new client
- All handlers automatically use the new client on the next call
- No need to re-register handlers or update closures

Testing:
- Build successful
- Race detector clean (go test -race ./...)
- Ready for disconnect/reconnect testing

This completes the fix designed in plan mode.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
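The closure-capture bug generalizes beyond this codebase. A tiny standalone demonstration (all names illustrative): after the registry entry is swapped, the closure-capturing handler still answers with the old instance, while the lookup-at-call-time handler sees the new one.

```go
package main

import "fmt"

// client is a stand-in for a connected MCP client.
type client struct{ id string }

func main() {
	registry := map[string]*client{"srv": {id: "old"}}

	// Broken pattern: the client is captured when the handler is built.
	captured := registry["srv"]
	closureHandler := func() string { return captured.id }

	// Working pattern: look up the current client on every call.
	lookupHandler := func() string { return registry["srv"].id }

	registry["srv"] = &client{id: "new"} // simulate server_reconnect

	fmt.Println(closureHandler(), lookupHandler()) // old new
}
```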
…tration

Fix: Enable hot-swapping for all servers by unifying handler pattern
Selective Environment Variable Inheritance
📋 Summary
Implements a security-first tier-based environment variable inheritance system that ensures MCP servers receive necessary system variables while preventing accidental credential leakage.
Problem Solved
Previously, MCP servers started by mcp-debug did not inherit any environment variables from the parent process. This caused servers to fail because they lacked critical system variables such as PATH, HOME, and SSL certificate paths.
Simply inheriting all variables would fix functionality but create security risks by exposing sensitive credentials (AWS_ACCESS_KEY_ID, GITHUB_TOKEN, SSH_AUTH_SOCK, etc.) to untrusted or experimental MCP servers.
Solution: Tier-Based Inheritance
Tier 1 (Baseline) - Always inherited by default: PATH, HOME, USER, SHELL, LANG, LC_ALL, TZ, TMPDIR, TEMP, TMP
Tier 2 (Network/TLS) - Opt-in via mode: SSL_CERT_FILE, SSL_CERT_DIR, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, NODE_EXTRA_CA_CERTS
Security Model
Files Changed
- Environment inheritance implementation in proxy core
- Configuration schema updates
- DRAFT_ENV_INHERITANCE.md - Comprehensive documentation
Inheritance modes:
- none: no inheritance; servers receive only explicitly configured env: entries
- tier1: baseline variables (the default)
- tier1+tier2: baseline plus network/TLS variables
- all: Tier 1 + Tier 2 only (not all parent variables)

Key security principle: Variables beyond the defined tiers must be explicitly requested via extra or prefix. There is no mode that inherits all parent environment variables.

Additional Controls
- extras, prefixes, denylist, and allow_denied_if_explicit refine what is inherited beyond explicit env: entries

Configuration Examples
Commits