-
Notifications
You must be signed in to change notification settings - Fork 0
Fix live Codex readiness detection #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ensure codex in-flight ends within 2.5s of completion, pick active JSONL per process, and make opencode message completion authoritative. Remove wheel passive listener warnings and align docs/tests.
Start the server on the next available port during dev and ignore stale OpenCode status events when determining in-flight state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3a81bc1c4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| async function waitForExit(child: ReturnType<typeof spawn>, timeoutMs: number): Promise<boolean> { | ||
| try { | ||
| await Promise.race([once(child, "exit"), delay(timeoutMs)]); | ||
| return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return false on timeout in waitForExit
waitForExit always resolves true because Promise.race([once(child,"exit"), delay(timeoutMs)]) never rejects; the delay path resolves after the timeout. When the child ignores SIGTERM, stop() will think it exited and skip the SIGKILL path, leaving the process running and potentially holding ports for later tests. Make the timeout path resolve false (or reject) so the SIGKILL fallback actually runs when the process doesn’t exit in time.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| env: { | ||
| ...process.env, | ||
| CONSENSUS_PORT: String(options.port), | ||
| CONSENSUS_HOST: "127.0.0.1", | ||
| CONSENSUS_CODEX_HOME: options.codexHome, | ||
| CONSENSUS_DEBUG_ACTIVITY: options.debugActivity ? "1" : "0", | ||
| CODEX_TEST_HOOKS: options.testHooks ? "1" : "0", | ||
| CONSENSUS_PROCESS_MATCH: options.processMatch, | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Live harness accidentally sets CONSENSUS_PROCESS_MATCH="undefined" when option not provided
startConsensusServer() always includes an env entry for CONSENSUS_PROCESS_MATCH, even when options.processMatch is undefined. In Node’s spawn, environment values are strings; passing undefined results in the child process receiving the literal string "undefined" (or otherwise a malformed env var), which then gets treated as a regex pattern.
Actual vs expected:
- Actual: the server may run with
CONSENSUS_PROCESS_MATCH=undefined, causingnew RegExp(matchEnv)to compile and unintentionally match processes containing the word “undefined”, or at minimum diverge from default behavior. - Expected: when
processMatchisn’t provided, the env var should be omitted so normal process detection runs.
Impact:
- Can break or skew live Codex readiness detection and PID/session matching during the new live harness tests.
Code
tests/integration/liveCodexHarness.ts:138-146
env: {
...process.env,
CONSENSUS_PORT: String(options.port),
CONSENSUS_HOST: "127.0.0.1",
CONSENSUS_CODEX_HOME: options.codexHome,
CONSENSUS_DEBUG_ACTIVITY: options.debugActivity ? "1" : "0",
CODEX_TEST_HOOKS: options.testHooks ? "1" : "0",
CONSENSUS_PROCESS_MATCH: options.processMatch,
},Recommendation: Only set CONSENSUS_PROCESS_MATCH when options.processMatch is a non-empty string (e.g., build env conditionally, or ...(options.processMatch ? { CONSENSUS_PROCESS_MATCH: options.processMatch } : {})).
Was this helpful? React with 👍 or 👎 to provide feedback.
| if (!state.inFlight) return; | ||
| if (!Number.isFinite(inflightTimeoutMs) || inflightTimeoutMs <= 0) return; | ||
| if (!Number.isFinite(inflightTimeoutMs) || inflightTimeoutMs <= 0) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Duplicate in-flight timeout guard in Codex tail expiry logic
expireInFlight() contains the exact same guard check twice.
Actual vs expected:
- Actual: redundant condition evaluation every time
expireInFlight()runs. - Expected: a single guard.
Impact:
- Minor: unnecessary code and a small maintainability hazard (future edits could accidentally change only one copy).
Code
src/codexLogs.ts:764-766
if (!state.inFlight) return;
if (!Number.isFinite(inflightTimeoutMs) || inflightTimeoutMs <= 0) return;
if (!Number.isFinite(inflightTimeoutMs) || inflightTimeoutMs <= 0) return;Recommendation: Remove the duplicated guard line.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
Superseded by cherry-picked Codex tail fixes in fix/activity-state. |
What
Why
How tested
User impact
Docs
Risk
Rollback