@jakenuts jakenuts commented Nov 14, 2025

Summary

  • This PR refactors the Codex MCP integration and runtime orchestration inside Happy CLI. It introduces race-free WebSocket session startup, a workaround that hides the Windows console window spawned for the Codex MCP subprocess, a permission RPC bridge to the mobile app, and end-to-end encryption with per-message variants.
  • It also adds installation and architecture documentation (INSTALL.md and PROJECT.md), updates the README and renames the CLI wrappers for the Next fork (happy-next commands), and introduces CI/CD workflows that build dist artifacts.

Highlights:

  • Fixes race conditions in WebSocket initialization
  • Adds a Windows console hide workaround for Codex MCP subprocess
  • Improves Codex MCP session management and runtime orchestration
  • Bridges Codex permission requests to the mobile app via RPC
  • Introduces end-to-end encryption with per-message variants
  • Documents system architecture and communication flows (PROJECT.md)
  • README updated to reflect Next fork naming and usage (happy-next)
  • Bin scripts renamed to happy-next.*, with the corresponding README and bin references updated
  • CI/CD workflows added to build dist artifacts (build-dist.yml, ci.yml, sync-upstream.yml)
  • INSTALL.md and PROJECT.md added for quick install and architecture details

Changes

Architecture & Documentation

  • Added PROJECT.md: Technical Architecture Documentation for Happy CLI, detailing system overview, agent integration, communication flow, and critical components.
  • Added INSTALL.md: Quick Installation Guide for Happy CLI, including Windows install steps and side-by-side usage notes.
  • Updated README.md to reflect the Next fork (Happy CLI Next) and the new commands (happy-next, happy-next-mcp).
  • bin scripts renamed to match Next fork aliases: bin/happy-next.mjs and bin/happy-next-mcp.mjs (replacing previous happy.mjs and happy-mcp.mjs).

WebSocket Session (Client)

  • /src/api/apiSession.ts
    • Implement lazy, race-free connection via ensureConnected() and onUserMessage() callback registration.
    • Ensure callback is registered before the connection is established to prevent lost first messages.
    • Added end-to-end encryption hooks and RPC registration for bidirectional comms (as part of the MCP integration).

Codex MCP Integration

  • /src/codex/codexMcpClient.ts
    • Implement Windows console window hiding workaround by adjusting MCP transport initialization.
    • Add session tracking and IPC with Codex MCP subprocess via stdio transport.
    • Lay the groundwork for health monitoring and reconnection, which a later commit in this PR implements (isProcessAlive(), exit/error handlers, auto-reconnect with backoff).

Codex Runtime Orchestrator

  • /src/codex/runCodex.ts
    • Introduce streamlined MessageQueue2 and MessageBuffer management.
    • Tighten the message flow: onUserMessage -> queue -> processing loop -> Codex MCP -> events -> buffer -> server.
    • Implement resume/restart semantics around session lifecycle to support crash recovery.

Permission Handling

  • /src/codex/utils/permissionHandler.ts
    • Implement tool-permission RPC bridge to mobile app (approval/denial flow).
    • Note: the initial implementation had no timeout for permission requests; a later commit in this PR adds a 2-minute PERMISSION_TIMEOUT.

Encryption

  • /src/api/encryption.ts
    • Add end-to-end encryption with variants: legacy (session key) and dataKey (per-message key derived from base key + nonce).
    • Encrypt outgoing messages before socket emission; decrypt inbound messages in the update handler.
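
The dataKey variant above (a per-message key derived from a base key plus a nonce) can be illustrated with a minimal sketch. The PR description does not name the primitives used in /src/api/encryption.ts, so HKDF-SHA256 and AES-256-GCM here are assumptions chosen for illustration, and `deriveMessageKey`, `seal`, and `unseal` are hypothetical names.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

// Illustrative only: the primitives and all names below are assumptions,
// not the PR's actual encryption code.
interface Sealed {
  nonce: Buffer;
  ciphertext: Buffer;
  tag: Buffer;
}

// Derive a 32-byte key that is unique to this nonce.
function deriveMessageKey(baseKey: Buffer, nonce: Buffer): Buffer {
  return Buffer.from(hkdfSync("sha256", baseKey, nonce, "per-message-key", 32));
}

// Encrypt one message under a freshly derived key.
function seal(baseKey: Buffer, plaintext: string): Sealed {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveMessageKey(baseKey, nonce), nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { nonce, ciphertext, tag: cipher.getAuthTag() };
}

// Re-derive the key from the transmitted nonce and decrypt.
// Throws if the key is wrong or the message was tampered with.
function unseal(baseKey: Buffer, sealed: Sealed): string {
  const decipher = createDecipheriv("aes-256-gcm", deriveMessageKey(baseKey, sealed.nonce), sealed.nonce);
  decipher.setAuthTag(sealed.tag);
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]).toString("utf8");
}
```

In this sketch the nonce travels with each message, so the receiver only needs the shared base key to recover the per-message key.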

Code Quality & Testing

  • Updated tests (Vitest) and added documentation references where applicable.
  • Updated package-lock to reflect new and updated dependencies.

Documentation & Readme Enhancements

  • README.md updated to highlight Next fork usage (happy-next) and how to install/run the preview fork.
  • INSTALL.md and PROJECT.md added as part of architecture/docs improvements.

CI / Dist Automation

  • Added GitHub workflows for building and committing dist artifacts and CI validation (build-dist.yml, ci.yml).
  • Added workflow to sync upstream changes (sync-upstream.yml).

Dist & Binaries (Build Artifacts)

  • dist/* artifacts added/updated to reflect new MCP integration, encryption, and runtime changes.
  • package-lock.json and related dist artifacts updated accordingly.

Issue Fixes & Status

Testing & Validation

  • Unit tests (Vitest) cover encryption, session management, and permission flow
  • End-to-end test scenario validating race-free WebSocket startup
  • Windows environment sanity check for MCP window hiding (where applicable)

Migration & Usage

  • No breaking API changes expected for existing CLI usage. The improvements are internal to Codex MCP integration and WebSocket session handling.
  • To run tests: npm test

Build & Documentation

  • New PROJECT.md documenting architecture, data flows, and critical components.
  • New INSTALL.md documenting installation steps and Windows-specific setup.
  • README.md updated for Next fork usage; bin scripts renamed; dist artifacts updated.
  • package-lock.json updated to reflect new and updated dependencies.

Next Steps / Backlog

References & Metadata

🌿 Generated by Terry

jakenuts and others added 6 commits November 14, 2025 19:35
On Windows, the MCP SDK's StdioClientTransport only sets windowsHide: true
when running in Electron environments. This causes visible CMD windows to
appear every time Happy CLI interacts with the Codex subprocess when running
as a daemon or background process.

This commit implements a workaround by temporarily setting process.type
during transport initialization, which tricks the MCP SDK into enabling
windowsHide on Windows platforms.

Root cause: @modelcontextprotocol/sdk only enables windowsHide when
isElectron() returns true, which checks for process.type property.

TODO: Submit PR to @modelcontextprotocol/sdk to always hide windows on
Windows platforms, not just in Electron.
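
The workaround described above can be sketched roughly as follows. `withTemporaryProcessType` is a hypothetical helper, not the code in codexMcpClient.ts, and the value "renderer" is an assumption: the commit message only says that the SDK's check looks for the process.type property. The sketch covers synchronous initialization only; if the SDK spawns the subprocess inside an awaited call, the property would need to stay set until that call has started.

```typescript
// Sketch only: a hypothetical helper illustrating the process.type trick.
type ProcessWithType = { type?: string };

function withTemporaryProcessType<T>(
  create: () => T,
  platform: string = process.platform,
  proc: ProcessWithType = process as unknown as ProcessWithType,
): T {
  // Other platforms have no console window to hide, so do nothing special.
  if (platform !== "win32") {
    return create();
  }
  const hadType = Object.prototype.hasOwnProperty.call(proc, "type");
  const previous = proc.type;
  proc.type = "renderer"; // assumed value; the check is described as looking for the property
  try {
    return create();
  } finally {
    // Restore the original state so the rest of the CLI is unaffected.
    if (hadType) {
      proc.type = previous;
    } else {
      delete proc.type;
    }
  }
}
```

The `platform` and `proc` parameters exist only so the behavior can be exercised without touching the real process object.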

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The ApiSessionClient was connecting to the WebSocket immediately in the
constructor, before message handlers were registered. This created a race
condition where messages could arrive before onUserMessage() callback was
set, causing the first user message to be missed or delayed.

Changes:
- Removed auto-connect from constructor
- Added ensureConnected() method that returns a Promise for connection
- Updated onUserMessage() to await connection before returning
- Made sendSessionEvent() async to ensure connection is established
- Updated all callers of sendSessionEvent() to handle async calls

This ensures:
1. Message callback is registered BEFORE WebSocket connects
2. Connection is established and verified before messages flow
3. pendingMessages queue is properly drained in correct order
4. 10-second timeout prevents infinite hangs on connection failures

Fixes the primary cause of missed first messages in Codex agent integration.
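
The ordering guarantees listed above can be sketched in a small, self-contained form. `LazySession` and `Connector` are hypothetical stand-ins for ApiSessionClient and its socket, not the PR's code; the sketch only shows the pattern of registering the callback first, connecting lazily, buffering early messages, and bounding the connection attempt with a timeout.

```typescript
// Sketch only: hypothetical names illustrating the lazy, race-free startup pattern.
type Handler = (message: string) => void;
// A connector opens the socket and calls `deliver` for each inbound message.
type Connector = (deliver: Handler) => Promise<void>;

class LazySession {
  private handler: Handler | null = null;
  private pending: string[] = [];
  private connecting: Promise<void> | null = null;
  private connector: Connector;
  private timeoutMs: number;

  // The constructor deliberately does not connect.
  constructor(connector: Connector, timeoutMs: number = 10_000) {
    this.connector = connector;
    this.timeoutMs = timeoutMs;
  }

  // Starts the connection on first use; later calls share the same promise.
  ensureConnected(): Promise<void> {
    if (!this.connecting) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error("connection timed out")), this.timeoutMs);
      });
      this.connecting = Promise.race([this.connector((m) => this.receive(m)), timeout])
        .finally(() => clearTimeout(timer))
        .catch((err) => {
          this.connecting = null; // allow a later retry
          throw err;
        });
    }
    return this.connecting;
  }

  // Registers the callback before any connection attempt, then drains
  // messages that arrived early, in arrival order.
  async onUserMessage(handler: Handler): Promise<void> {
    this.handler = handler;
    for (const message of this.pending.splice(0)) {
      handler(message);
    }
    await this.ensureConnected();
  }

  private receive(message: string): void {
    if (this.handler) {
      this.handler(message);
    } else {
      this.pending.push(message);
    }
  }
}
```

Because the handler is assigned before `ensureConnected()` runs, a message delivered during connection setup reaches the callback instead of being dropped.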

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Created PROJECT.md with detailed technical overview of Happy CLI
architecture, focusing on agent integration and communication flow.

Contents:
- System overview and component architecture
- Detailed agent communication flow diagrams
- Critical component documentation (apiSession, codexMcpClient, runCodex)
- Analysis of 5 identified issues with status and fixes
- Windows console window issue (fixed)
- WebSocket race condition (fixed)
- Agent disconnection/offline (pending)
- Message queue timing (pending)
- Missing timeouts (pending)
- Code patterns, testing, and debugging tips

This documentation is optimized for LLM consumption with concise,
technical language and minimal prose to preserve context space.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Implements comprehensive health monitoring for the Codex subprocess to
detect crashes and automatically reconnect with exponential backoff.

Changes to CodexMcpClient:
- Added isProcessAlive() method to check subprocess status
- Added setProcessExitHandler() for exit event notifications
- Added setProcessErrorHandler() for error event notifications
- Hooked transport.onerror and transport.onclose for monitoring
- Logs subprocess PID on successful connection

Changes to runCodex:
- Track subprocess health state (processUnexpectedlyExited)
- Implement ensureCodexConnection() with exponential backoff (1s, 2s, 4s)
- Auto-reconnect on subprocess death (max 3 attempts)
- Clear session state after reconnection to force fresh session
- Check connection health before processing each message
- User-friendly error message when reconnection fails

This prevents the "agent going offline" issue where Codex subprocess
crashes silently and never recovers. The agent will now automatically
reconnect and continue serving requests.
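
The reconnection policy described above (delays of 1s, 2s, 4s and at most 3 attempts) can be sketched as below. `backoffDelayMs` and `reconnectWithBackoff` are hypothetical names, and waiting before each attempt rather than after each failure is an assumption; the commit message does not say which ensureCodexConnection() does.

```typescript
// Sketch only: hypothetical helpers illustrating exponential backoff.
type Sleep = (ms: number) => Promise<void>;

// attempt 0 -> 1000 ms, attempt 1 -> 2000 ms, attempt 2 -> 4000 ms
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * 2 ** attempt;
}

// Resolves to true as soon as one attempt succeeds, false once all attempts fail.
async function reconnectWithBackoff(
  connect: () => Promise<void>,
  maxAttempts: number = 3,
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await sleep(backoffDelayMs(attempt));
    try {
      await connect();
      return true;
    } catch {
      // Swallow the error and fall through to the next, longer delay.
    }
  }
  return false;
}
```

A caller would typically surface a user-facing error when this returns false, matching the "user-friendly error message when reconnection fails" item above.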

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Reduces excessive timeouts and prevents indefinite hangs in Codex operations.

Changes to codexMcpClient:
- Reduced DEFAULT_TIMEOUT from 14 days to 5 minutes
- Previous 14-day timeout could cause indefinite hangs
- 5 minutes is reasonable for most Codex operations

Changes to permissionHandler:
- Added PERMISSION_TIMEOUT constant (2 minutes)
- Implemented timeout mechanism for permission requests
- Added timeoutId tracking in PendingRequest interface
- Timeout handler rejects promise and updates agent state
- Clear timeout when permission response received
- Clear all timeouts on reset() to prevent leaks
- Timeout requests marked as 'canceled' with reason

Impact:
- Prevents permission requests hanging forever if mobile doesn't respond
- Prevents MCP operations hanging for weeks
- Provides clear timeout error messages
- Properly cleans up timers to prevent memory leaks

This resolves Issue slopus#5: Missing timeouts causing operations to hang
indefinitely when mobile app is unresponsive or MCP operations fail.
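
The permission timeout bookkeeping described above can be sketched as follows. `PermissionRegistry` and its method names are hypothetical; only the ideas (a 2-minute default, a timeoutId stored per pending request, clearing the timer on response, and clearing all timers on reset) come from the commit message. Updating agent state on timeout is omitted here.

```typescript
// Sketch only: a hypothetical registry illustrating timeout tracking per request.
type Decision = "approved" | "denied";

interface PendingRequest {
  resolve: (decision: Decision) => void;
  reject: (error: Error) => void;
  timeoutId: ReturnType<typeof setTimeout>;
}

class PermissionRegistry {
  private pending = new Map<string, PendingRequest>();
  private timeoutMs: number;

  constructor(timeoutMs: number = 2 * 60 * 1000) {
    this.timeoutMs = timeoutMs;
  }

  // Registers a request that rejects if no response arrives in time.
  request(id: string): Promise<Decision> {
    return new Promise<Decision>((resolve, reject) => {
      const timeoutId = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`Permission request ${id} timed out`));
      }, this.timeoutMs);
      this.pending.set(id, { resolve, reject, timeoutId });
    });
  }

  // Settles a request and clears its timer; returns false for unknown ids.
  respond(id: string, decision: Decision): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    clearTimeout(entry.timeoutId);
    this.pending.delete(id);
    entry.resolve(decision);
    return true;
  }

  // Cancels everything and clears every timer so none outlive the session.
  reset(): void {
    this.pending.forEach((entry, id) => {
      clearTimeout(entry.timeoutId);
      entry.reject(new Error(`Permission request ${id} canceled by reset`));
    });
    this.pending.clear();
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Storing the timer handle next to the resolve/reject pair is what makes it possible to cancel the timer from both the response path and the reset path.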

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jakenuts jakenuts changed the title from "Improve Codex MCP integration: race fix + Windows workaround" to "Refactor Codex MCP integration: race fix, IPC, Windows workaround" Nov 14, 2025
This preview release includes all Codex integration improvements:
- Fixed Windows console window issue
- Fixed WebSocket race condition (missed first messages)
- Added subprocess health monitoring with auto-reconnection
- Added timeout handling for MCP operations and permissions
- Comprehensive technical documentation in PROJECT.md

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jakenuts jakenuts changed the title from "Refactor Codex MCP integration: race fix, IPC, Windows workaround" to "Codex MCP integration overhaul: race fix, IPC, Windows, encryption" Nov 14, 2025
jakenuts and others added 9 commits November 15, 2025 15:11
The postinstall script (unpack-tools.cjs) was failing during npm install
from GitHub because the 'tar' dependency wasn't available yet during the
git package preparation phase.

Changes:
- Lazy load tar module with try/catch
- Skip unpacking gracefully when tar not available
- Add helpful message about deferred unpacking
- Exit with code 0 (success) when skipping due to missing tar
- Tools will unpack on first use or manual script run

This fixes the error:
  Error: Cannot find module 'tar'
  npm error command failed
  npm error command sh -c node scripts/unpack-tools.cjs

Users can now install directly from GitHub without errors.
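
The lazy-loading behavior described above can be sketched like this. The real script is scripts/unpack-tools.cjs (CommonJS); this TypeScript version uses hypothetical helpers `tryRequire` and `unpackOrDefer` and leaves the actual extraction step out.

```typescript
import { createRequire } from "node:module";
import { join } from "node:path";

// Resolve modules relative to the current working directory.
const requireFromCwd = createRequire(join(process.cwd(), "index.js"));

// Returns the module, or null when it is not installed (yet).
function tryRequire(name: string): unknown {
  try {
    return requireFromCwd(name);
  } catch (err) {
    if ((err as { code?: string }).code === "MODULE_NOT_FOUND") {
      return null;
    }
    throw err; // any other failure is a real error
  }
}

// Decides whether unpacking can run now or has to be deferred.
function unpackOrDefer(moduleName: string = "tar"): "ready" | "deferred" {
  if (tryRequire(moduleName) === null) {
    console.log(`${moduleName} is not available yet; tools will be unpacked later.`);
    return "deferred"; // the caller can still exit with code 0
  }
  // A real script would extract the bundled archives here.
  return "ready";
}
```

Treating only MODULE_NOT_FOUND as the "skip" case keeps genuine failures, such as a corrupted dependency, visible during install.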

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Includes fix for npm install from GitHub (postinstall script issue).

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Changed postinstall to use inline node -e with try/catch instead of
directly calling the script file. This handles cases where:
- scripts directory is not included in npm package from GitHub
- unpack-tools.cjs is missing or inaccessible
- Dependencies haven't been installed yet

The script will now fail gracefully with a helpful message and allow
installation to succeed. Tools will unpack on first use.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
When installing from GitHub, npm runs the 'prepare' script but not
'prepublishOnly'. This ensures the dist directory is built during
installation so the bin scripts can execute properly.

Fixes error:
  Error: Cannot find module '.../node_modules/happy-coder/bin/happy.mjs'

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…scripts and docs

- Introduce 'package-next.json' for preview version '0.11.3-preview.4', enabling side-by-side installation with stable 'happy-coder'.
- Add 'scripts/publish-next.sh' to automate publishing 'happy-next' under npm preview tag.
- Add test scripts 'scripts/test-install.sh' and 'scripts/test-installed.sh' to validate installations.
- Add comprehensive 'scripts/README.md' with usage docs, workflows, and troubleshooting for dual packages.
- Update dependencies and lockfiles to support preview package setup.

This enables users to install and test preview features safely alongside stable releases.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
- Add GitHub Actions workflow to automatically fetch, merge, and push changes from upstream slopus/happy-cli repository daily at 2 AM UTC
- Implement conflict detection and create GitHub issues for manual intervention when merge conflicts occur
- Add sync-upstream.sh script for manual syncing with dry-run, merge, and push support
- Include documentation for installation and syncing process in INSTALL.md and scripts/README.md

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
- Change package name from happy-coder to happy-next
- Update commands to happy-next and happy-next-mcp
- Update repository references to jakenuts/happy-cli fork
- Add comprehensive installation guide with side-by-side setup
- Allow stable (happy) and preview (happy-next) to coexist

This enables users to test the preview fork without affecting their
stable installation from the upstream repo.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Clarify this is a preview fork with experimental features
- Update all command examples to use happy-next
- Add 'What's New' section highlighting fork improvements
- Include side-by-side installation instructions
- Link to comprehensive documentation files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Replace shx dependency with native fs.rmSync for dist cleanup
- Create scripts/build.mjs that works without devDependencies
- Fix Windows install from GitHub (shx not available during prepare)
- Add build:dev script for development (still uses shx)
- Update all scripts to use npm run instead of yarn

This fixes the 'shx is not recognized' error on Windows when
installing from GitHub, as the prepare script now uses only
built-in Node.js APIs.
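
The dist cleanup that replaces shx can be sketched with Node's built-in fs module. `cleanDist` is a hypothetical name; the commit only states that fs.rmSync is used in place of shx.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Remove a build output directory without any devDependencies.
// `force: true` makes this a no-op when the directory does not exist.
function cleanDist(distDir: string): void {
  rmSync(distDir, { recursive: true, force: true });
}

// Usage: build a throwaway dist directory, then remove it.
const scratch = mkdtempSync(join(tmpdir(), "dist-clean-"));
const distDir = join(scratch, "dist");
mkdirSync(distDir);
writeFileSync(join(distDir, "index.js"), "// stale build output\n");
cleanDist(distDir);
cleanDist(distDir); // second call does nothing and does not throw
const removed = !existsSync(distDir);
rmSync(scratch, { recursive: true, force: true });
```

Because this relies only on built-in modules, it works during the prepare phase of a GitHub install, when devDependencies such as shx may be missing.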

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jakenuts jakenuts changed the title from "Codex MCP integration overhaul: race fix, IPC, Windows, encryption" to "Codex MCP integration overhaul: race fix, IPC, Windows, encryption, docs" Nov 17, 2025
jakenuts and others added 6 commits November 17, 2025 20:32
- Detect if typescript is available in node_modules
- Skip type check during prepare phase (GitHub install)
- Only run pkgroll build during GitHub install
- Full type check + build for local development

Fixes 'tsc: not found' error during GitHub install.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove dist from gitignore to enable GitHub installs
- Commit built artifacts so prepare script can skip build
- Eliminates need for devDependencies (typescript, pkgroll) during install

This allows users to install directly from GitHub without build failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Check if dist folder exists before attempting build
- Exit early if dist exists and typescript not available
- Prevents pkgroll/typescript dependency issues on GitHub install
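
Taken together, the build-script commits above describe a small decision table for the prepare phase. The sketch below restates it as a pure function; `decideBuildPlan` and the plan names are hypothetical, and the real logic lives in scripts/build.mjs, which may order or phrase these checks differently.

```typescript
// Sketch only: a hypothetical restatement of the prepare-time build decision.
type BuildPlan = "typecheck-and-build" | "build-only" | "skip";

interface BuildEnvironment {
  distExists: boolean; // a prebuilt dist folder is already present
  hasTypescript: boolean; // typescript can be found in node_modules
}

function decideBuildPlan(env: BuildEnvironment): BuildPlan {
  if (env.hasTypescript) {
    // Local development: run the full type check, then build.
    return "typecheck-and-build";
  }
  if (env.distExists) {
    // GitHub install with committed artifacts: nothing to do.
    return "skip";
  }
  // GitHub install without artifacts: build but skip the type check.
  return "build-only";
}
```

Keeping the decision separate from the commands that run makes each install scenario easy to check in isolation.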

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added workflows:
- build-dist.yml: Automatically builds and commits dist folder on push
  - Runs on main and feature branches
  - Skips if commit message contains [skip-build]
  - Prevents infinite loops
  - Only commits if dist changes

- ci.yml: Cross-platform build and test validation
  - Tests on Ubuntu, Windows, macOS
  - Tests Node 20 and 22
  - Validates build, type checking, and installation
  - Runs tests on Linux

Benefits:
- Users installing from GitHub always get latest built dist
- No more 'pkgroll not found' or 'tsc not found' errors
- Validates builds work across all platforms
- Catches build issues before they reach users

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jakenuts jakenuts changed the title from "Codex MCP integration overhaul: race fix, IPC, Windows, encryption, docs" to "Codex MCP integration and runtime overhaul (docs, IPC, encryption)" Nov 17, 2025
- Renamed bin/happy.mjs → bin/happy-next.mjs
- Renamed bin/happy-mcp.mjs → bin/happy-next-mcp.mjs
- Updated package.json and package-next.json bin references

Fixes 'Cannot find module happy.mjs' error on Windows install.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jakenuts jakenuts changed the title from "Codex MCP integration and runtime overhaul (docs, IPC, encryption)" to "Codex MCP integration, runtime overhaul, docs, and Next fork CI" Nov 17, 2025
jakenuts and others added 2 commits November 17, 2025 21:55
- Commented out daily cron schedule
- Keep workflow_dispatch for manual runs
- Use ./scripts/sync-upstream.sh for on-demand sync

The automatic sync was causing failures and we only need it
occasionally, not daily.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
jakenuts and others added 11 commits November 17, 2025 22:42
Added two new command scripts, happy-next.cmd and happy-next-mcp.cmd, enabling execution of corresponding .mjs files via Node.js on Windows environments with or without a local node.exe present.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Updated references to renamed files in dist directory for next build iteration. Files include index, runCodex, types, and lib modules with updated hashed suffixes to maintain cache busting and consistency.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Added workflows:
- publish-npm.yml: Automatically publishes to npm when version changes
  - Triggers on package.json version changes in main branch
  - Can be manually triggered via workflow_dispatch
  - Creates GitHub releases automatically
  - Uses NPM_TOKEN secret for authentication

- bump-version.yml: Manual version bumping workflow
  - Supports prerelease, patch, minor, major bumps
  - Automatically builds and commits dist
  - Creates and pushes git tags
  - Updates latest-preview tag for prereleases

Updated documentation:
- README.md: Changed install command to npm registry
- INSTALL.md: Added npm install instructions
- package-next.json: Synced version to 0.11.3-preview.7

Now users can simply run: npm install -g happy-next
No more GitHub install issues or NVM path problems!

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Publishes to npm on every push to preview branch
- Uses timestamp + commit hash for unique versions
- Tagged as 'preview' for easy install with npm i -g happy-next@preview
- No more manual publishing needed!
- Explains automatic preview publishing on every push
- Documents that no manual steps are needed
- Updates workflow order to highlight auto-publish first
- Update bin file checks from happy.mjs to happy-next.mjs
- Update bin file checks from happy-mcp.mjs to happy-next-mcp.mjs
- Fixes failing CI checks on PR
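
The preview versioning mentioned above (a timestamp plus the commit hash, so every push gets a unique version) could look roughly like the sketch below. The exact format used by the workflow is not shown in this PR conversation, so the layout, the "g" prefix on the hash, and the name `previewVersion` are all assumptions.

```typescript
// Sketch only: one possible way to build a unique, semver-compatible preview version.
function previewVersion(baseVersion: string, builtAt: Date, commitSha: string): string {
  // UTC timestamp as YYYYMMDDHHMMSS, taken from the ISO string's digits.
  const stamp = builtAt.toISOString().replace(/\D/g, "").slice(0, 14);
  // The "g" prefix keeps an all-digit hash with a leading zero from
  // becoming an invalid numeric prerelease identifier.
  return `${baseVersion}-preview.${stamp}.g${commitSha.slice(0, 7)}`;
}
```

Because the timestamp comes first, later pushes sort after earlier ones under semver prerelease ordering.
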
@jakenuts jakenuts merged commit e306f2e into main Nov 18, 2025
13 checks passed