Skip to content

Comments

Startup speed, event filtering, log hygiene & reliability#272

Merged
joelteply merged 6 commits intomainfrom
feature/startup-speed-event-filtering
Feb 19, 2026
Merged

Startup speed, event filtering, log hygiene & reliability#272
joelteply merged 6 commits intomainfrom
feature/startup-speed-event-filtering

Conversation

@joelteply
Copy link
Contributor

Summary

  • LogBatcher: Batch log IPC calls (100x reduction, ~1000/sec → ~10/sec), eliminating socket contention that starved ORM
  • UserEntityCache: In-memory TTL cache for UserEntity reads, eliminating 515 ORM timeouts per startup
  • Event filtering: DOM interest-based event filtering in browser, rate limiter thresholds tuned for 18 concurrent personas
  • Health check backoff: Exponential backoff for failing providers (5min ceiling), log throttling for insufficient_funds
  • Log hygiene: Removed 45+ debug console.error/log calls, raised ORM slow thresholds, eliminated 87% of npm-start.log noise (7596 → 936 lines), cleaned browser console EntityScroller spam
  • ORM race guards: try/catch for tasks that vanish between dequeue and completion
  • Memory leak fixes: ResponseCorrelator cleanup, widget disconnect handlers

Metrics (before → after)

Metric Before After
Rust worker write failures 1,177 0
Event cascade blocked 209 0
ORM request timeouts 515 0
ORM entity not found 2 0
npm-start.log lines (startup) 7,596 936
Health check log spam 118+ lines ~5 lines
Steady-state log growth continuous 0 lines/30s

Test plan

  • TypeScript compiles clean
  • Rust compiles clean
  • npm start deploys successfully
  • ./jtag ping confirms server + browser connected
  • Cloud AIs responding (Groq, DeepSeek, Fireworks, Together)
  • Local AIs responding (Helper, Teacher, CodeReview, Local Assistant)
  • npm-start.log verified clean (zero debug spam, zero internal errors)
  • Browser console verified clean (EntityScroller spam eliminated)
  • Health check backoff verified (exponential intervals, recovery capability)

- Parallelize persona initialization: batched 6-concurrent (18-36s → 3-6s)
- Parallelize PersonaUser.initialize(): genome sync, rate limiter, corpus load run concurrently
- EventsDaemonBrowser: filter-first DOM dispatch (only events with registered interest)
- Events.subscribe(): registers DOM interest automatically in browser
- BaseWidget/BaseContentWidget: register DOM interest for event dispatchers
- WidgetEventServiceBrowser: track and cleanup document listeners on disconnect
- ResponseCorrelator: periodic cleanup timer (60s sweep) + destroy() on disconnect
- JTAGClient.disconnect(): calls correlator.destroy() to stop orphaned timers
- UserEntityCache: in-memory TTL cache for UserEntity reads, 60s expiry
- CallerDetector: use cached reads instead of raw ORM.read per command
- SessionDaemonServer: getUserById uses cached UserEntity reads
- UserDaemonServer: monitoring loop 5s→30s, populates cache on query
- Combined with Round 1 (LogBatcher, EventRateLimiter, ORM race guards),
  total errors eliminated: ~2,200 → 0 per session
AdapterHealthMonitor: per-adapter backoff (base × 2^failures, 5min ceiling)
so unhealthy providers don't spam checks every 30s but still recover.
BaseOpenAICompatibleAdapter: throttle insufficient_funds/rate_limited logs
to once per 5 minutes per status instead of every API call.
- Remove 45+ debug console.error calls left from session identity investigation
- Raise ORM SLOW thresholds 50ms/100ms → 1000ms (5689 → 192 lines)
- Exclude fire-and-forget log/write-batch from TimingHarness slow logging
- Remove per-init console.logs: DatabaseHandleRegistry, ResourceManager,
  AIAudioInjector TTS subscribe, AIGenerateServerCommand
- Steady-state log growth: continuous → 0 lines/30s
Every entity update event was logging two console.log calls per entity,
flooding the browser console with update notifications.
Copilot AI review requested due to automatic review settings February 19, 2026 04:52
@joelteply joelteply merged commit 570ef34 into main Feb 19, 2026
4 of 7 checks passed
@joelteply joelteply deleted the feature/startup-speed-event-filtering branch February 19, 2026 04:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets faster startup and improved runtime reliability by reducing IPC/log overhead, caching high-frequency ORM reads, and filtering/cleaning noisy browser/server events and logs.

Changes:

  • Added Rust/TS log batching (log/write-batch + LogBatcher) to reduce logger IPC load and contention.
  • Introduced an in-memory TTL UserEntityCache and pre-warming to reduce redundant ORM reads/timeouts during startup.
  • Implemented DOM “interest-based” event dispatch filtering in the browser daemon, plus assorted log hygiene, backoff/throttling, and lifecycle cleanup fixes.

Reviewed changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/workers/continuum-core/src/modules/logger.rs Adds log/write-batch payload/result types and handler in Rust logger worker.
src/widgets/shared/EntityScroller.ts Removes noisy console logging for entity updates.
src/widgets/shared/BaseWidget.ts Registers DOM interest for widget events before attaching document listeners.
src/widgets/shared/BaseContentWidget.ts Registers DOM interest for content events and attaches document listeners.
src/widgets/browser/services/WidgetEventServiceBrowser.ts Tracks/removes DOM listeners; registers/unregisters DOM interest for events.
src/system/voice/server/AIAudioInjector.ts Removes prototype console logging.
src/system/user/server/modules/PersonaTaskExecutor.ts Adds ORM update race guard for task deletion during execution.
src/system/user/server/modules/PersonaAutonomousLoop.ts Adds ORM update race guard for task deletion before execution.
src/system/user/server/UserEntityCache.ts New TTL cache for read-through UserEntity reads + bulk prewarm API.
src/system/user/server/PersonaUser.ts Parallelizes Rust bridge syncing and corpus load during persona init.
src/system/user/server/CallerDetector.ts Switches user reads to UserEntityCache.
src/system/resources/shared/ResourceManager.ts Removes non-actionable console logs.
src/system/core/shared/TimingHarness.ts Suppresses slow-operation console logs for log/write-batch.
src/system/core/shared/ResponseCorrelator.ts Adds periodic cleanup timer + destroy() for lifecycle shutdown.
src/system/core/shared/Events.ts Tracks DOM interest for browser subscriptions (used for browser-side dispatch filtering).
src/system/core/logging/Logger.ts Instantiates/destroys LogBatcher; hides direct worker client behind batcher.
src/system/core/logging/LogBatcher.ts New batching layer for sending many log writes in one IPC request.
src/system/core/logging/ComponentLogger.ts Routes Rust logging through LogBatcher instead of per-message IPC.
src/system/core/client/shared/JTAGClient.ts Ensures ResponseCorrelator.destroy() is called on shutdown.
src/system/core/client/server/JTAGClientServer.ts Removes noisy console logs; clarifies “no local system auto-create” behavior.
src/shared/version.ts Version bump.
src/shared/ipc/logger/LoggerWorkerClient.ts Adds writeLogsBatch() that sends log/write-batch.
src/shared/ipc/logger/LoggerMessageTypes.ts Adds batch payload/result TS types.
src/package.json Version bump.
src/package-lock.json Version bump.
src/generated-command-schemas.json Regeneration timestamp update.
src/daemons/user-daemon/server/UserDaemonServer.ts Batches persona initialization; slows monitoring interval; prewarms UserEntityCache.
src/daemons/session-daemon/server/SessionDaemonServer.ts Switches user reads to UserEntityCache; removes debug spam.
src/daemons/events-daemon/shared/EventsDaemon.ts Adjusts event rate limiter thresholds + log formatting.
src/daemons/events-daemon/browser/EventsDaemonBrowser.ts Adds DOM interest-based filtering before dispatching DOM CustomEvents.
src/daemons/data-daemon/shared/ORMLogger.ts Raises “slow ORM” warning threshold to reduce noise.
src/daemons/data-daemon/server/ORMRustClient.ts Raises “slow IPC” warning threshold to reduce noise.
src/daemons/data-daemon/server/DatabaseHandleRegistry.ts Removes console logs during handle registry initialization/registration.
src/daemons/ai-provider-daemon/shared/adapters/BaseOpenAICompatibleAdapter.ts Throttles repeated per-status error logs (e.g., insufficient funds).
src/daemons/ai-provider-daemon/server/AdapterHealthMonitor.ts Adds exponential backoff and milestone logging for failing adapter health checks.
src/commands/ai/generate/server/AIGenerateServerCommand.ts Removes debug console logs; returns result directly.
Files not reviewed (1)
  • src/package-lock.json: Language not supported
Comments suppressed due to low confidence (2)

src/widgets/shared/BaseContentWidget.ts:303

  • setupContentEventListeners() registers DOM interest and attaches three document-level listeners, but there is no corresponding removal/unregister in onWidgetCleanup(). Since BaseContentWidget instances can be connected/disconnected, this will leak listeners and keep DOM interest permanently enabled for these events. Track listener references + unregister functions and dispose them during cleanup.
    src/widgets/shared/BaseWidget.ts:801
  • setupEventDispatcher() registers DOM interest but drops the returned unregister function, and adds a document.addEventListener with an inline callback that is never removed in disconnectedCallback(). This defeats the "interest-based dispatch" goal over time (interest leaks) and can leak listeners across widget lifecycles. Store both the unregister closure and the event listener function (or use an AbortController) and clean them up in disconnectedCallback()/onWidgetCleanup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +154 to +156
} catch {
// Task was deleted between dequeue and completion — work was still done, just can't record it
this.log(`⚠️ ${this.displayName}: Task ${task.taskId.slice(0, 8)} vanished during execution (deleted externally?)`);
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catch swallows all ORM.update failures and assumes the task was deleted. ORM.update can throw for other reasons (IPC timeout, Rust error, read-after-update failure), so this can hide real persistence failures and make debugging harder. Capture the error and only suppress known "not found"/deleted-task cases; otherwise log (or rethrow) unexpected errors.

Suggested change
} catch {
// Task was deleted between dequeue and completion — work was still done, just can't record it
this.log(`⚠️ ${this.displayName}: Task ${task.taskId.slice(0, 8)} vanished during execution (deleted externally?)`);
} catch (error) {
const errorMessage =
error && typeof error === 'object' && 'message' in error && typeof (error as any).message === 'string'
? (error as any).message
: String(error);
const errorCode =
error && typeof error === 'object' && 'code' in error ? (error as any).code : undefined;
const looksLikeNotFound =
(typeof errorMessage === 'string' &&
/not\s*found/i.test(errorMessage)) ||
(typeof errorMessage === 'string' &&
/No document with id/i.test(errorMessage)) ||
errorCode === 'NOT_FOUND';
if (looksLikeNotFound) {
// Task was deleted between dequeue and completion — work was still done, just can't record it
this.log(
`⚠️ ${this.displayName}: Task ${task.taskId.slice(
0,
8
)} vanished during execution (deleted externally? ORM.update not-found: ${errorMessage})`
);
} else {
// Unexpected persistence failure — log full details to aid debugging
this.log(
`❌ ${this.displayName}: Failed to update task ${task.taskId.slice(
0,
8
)} in ORM.update: ${errorMessage}`
);
}

Copilot uses AI. Check for mistakes.
Comment on lines +141 to +144
} catch {
// Task was deleted between dequeue and execution — skip it
this.log(`⚠️ ${this.personaUser.displayName}: Task ${item.taskId.slice(0, 8)} vanished before execution`);
return;
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catch treats any ORM.update failure as "task vanished" and returns. ORM.update can fail for reasons other than deletion, so this could cause tasks to be silently skipped during transient DB/IPC issues. Capture and inspect/log the error; only skip when the error indicates a missing/deleted task, and consider retry/backoff for transient failures.

Suggested change
} catch {
// Task was deleted between dequeue and execution — skip it
this.log(`⚠️ ${this.personaUser.displayName}: Task ${item.taskId.slice(0, 8)} vanished before execution`);
return;
} catch (error: any) {
const message = error?.message ?? String(error);
const isNotFoundError =
error?.code === 'NOT_FOUND' ||
error?.name === 'NotFoundError' ||
/not\s*found|missing|deleted/i.test(message);
if (isNotFoundError) {
// Task was deleted between dequeue and execution — skip it
this.log(
`⚠️ ${this.personaUser.displayName}: Task ${item.taskId.slice(0, 8)} vanished before execution (${message})`
);
return;
}
// Non-deletion failure — log, retry once, and propagate on persistent failure
this.log(
`⚠️ ${this.personaUser.displayName}: Failed to mark task ${item.taskId.slice(
0,
8
)} as in_progress: ${message}`
);
try {
await ORM.update<TaskEntity>(
COLLECTIONS.TASKS,
item.taskId,
{ status: 'in_progress', startedAt: new Date() }
);
} catch (retryError: any) {
const retryMessage = retryError?.message ?? String(retryError);
this.log(
`❌ ${this.personaUser.displayName}: Retry to mark task ${item.taskId.slice(
0,
8
)} as in_progress failed: ${retryMessage}`
);
throw retryError;
}

Copilot uses AI. Check for mistakes.
constructor(defaultTimeoutMs: number = 30000) {
this.defaultTimeoutMs = defaultTimeoutMs;
// Periodic sweep for orphaned requests (every 60s)
this.cleanupTimer = setInterval(() => this.cleanup(), 60_000);
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResponseCorrelator starts a setInterval in the constructor. In Node, this timer can keep the process alive if shutdown() is not called (e.g., short-lived scripts/tests). Consider calling unref() on the interval handle when available, or lazily creating the interval only when there are pending requests and clearing it when the map becomes empty.

Suggested change
this.cleanupTimer = setInterval(() => this.cleanup(), 60_000);
this.cleanupTimer = setInterval(() => this.cleanup(), 60_000);
// In Node.js, prevent this interval from keeping the process alive on its own
(this.cleanupTimer as any)?.unref?.();

Copilot uses AI. Check for mistakes.
Comment on lines +614 to +618
let count = batch.entries.len();
for entry in batch.entries {
// Queue each entry through the existing channel (writer thread handles actual I/O)
let _ = self.log_tx.try_send(entry);
}
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle_write_batch() returns entriesQueued as the batch length, but the loop uses try_send() and ignores send failures. If the channel is full, some entries will be dropped while the response still reports them as queued. Track try_send() successes (and optionally dropped count) so the result reflects reality and you can observe backpressure/drops.

Copilot uses AI. Check for mistakes.
Comment on lines +603 to +625
fn handle_write_batch(&self, params: Value) -> Result<CommandResult, String> {
// Extract payload (WorkerClient nests under "payload", support both patterns)
let payload_value = if let Some(nested) = params.get("payload") {
nested.clone()
} else {
params.clone()
};

let batch: WriteLogBatchPayload =
serde_json::from_value(payload_value).map_err(|e| format!("Invalid batch payload: {e}"))?;

let count = batch.entries.len();
for entry in batch.entries {
// Queue each entry through the existing channel (writer thread handles actual I/O)
let _ = self.log_tx.try_send(entry);
}

self.requests_processed.fetch_add(1, Ordering::Relaxed);

CommandResult::json(&WriteLogBatchResult {
entries_queued: count,
})
}
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New log/write-batch behavior is not covered by tests. Since this file already has unit tests for log/ping and log/write, add a test that calls handle_command("log/write-batch", ...) and asserts it succeeds and returns a payload with entriesQueued matching the number actually accepted.

Copilot uses AI. Check for mistakes.
Comment on lines 374 to +378
document.addEventListener(patternOrEventName, eventHandler);

// Register DOM interest so EventsDaemonBrowser dispatches this event to DOM
this.domInterest.add(patternOrEventName);

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Events.domInterest is a Set, and the unsubscribe closure always deletes the event name. If multiple callers subscribe to the same DOM event, the first unsubscribe will delete the interest and can stop EventsDaemonBrowser from dispatching that event to the DOM even though other listeners are still registered. Use a reference count (e.g., Map<string, number>) or another mechanism that only removes interest when the last listener unsubscribes.

Copilot uses AI. Check for mistakes.
Comment on lines +64 to +68
public static registerDOMInterest(eventName: string): () => void {
EventsDaemonBrowser._domInterest.add(eventName);
return () => {
EventsDaemonBrowser._domInterest.delete(eventName);
};
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registerDOMInterest() uses a Set and the returned unregister function unconditionally deletes the event name. If multiple widgets/services register the same event, the first cleanup will delete interest for all and can prevent DOM dispatch for remaining listeners. Consider ref-counting interests (Map<string, number>) so unregister only removes when the last registrant cleans up.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant