Rust Sentinel Module + Provider-Scoped ModelRegistry + ModelCapabilities Type System #269

Merged
joelteply merged 35 commits into main from feature/rust-sentinel-module on Feb 17, 2026

Conversation

joelteply (Contributor) commented on Feb 14, 2026

Summary

  • Declarative pipeline execution engine — Sentinels execute step-based pipelines entirely in Rust with 9 step types (Shell, LLM, Command, Condition, Loop, Parallel, Emit, Watch, Sentinel)
  • Universal CommandExecutor — Routes commands to either Rust modules (direct) or TypeScript server (via Unix socket), with agentic loop support via execute_ts_json
  • Provider-scoped ModelRegistry — Fixes cross-provider context window collisions (e.g., the same Llama model registered at 1400 tokens on Candle vs 131072 on Together). Internal keys are now ${provider}:${modelId}, with a secondary index for fast unscoped lookups
  • ModelCapabilities type system — Comprehensive enums and interfaces for quantization formats (FP32→Q2_K, GPTQ, AWQ), PEFT methods (LoRA, QLoRA, DoRA, IA3), inference runtimes, hardware profiles, and adapter stacking — the "knowing" layer for algorithmic model selection

Key Components

| Component | Purpose |
| --- | --- |
| SentinelRunner | Rust-native pipeline step executor (9 step types, 103 Rust tests) |
| CommandExecutor | Universal Rust↔TypeScript command bridge via Unix socket |
| ModelRegistry | Provider-scoped model metadata cache with normalization chain |
| ModelCapabilities | Type system for LoRA/PEFT/quantization/hardware profiles |
| ModelContextWindows | 7 functions now provider-aware; inference speed bug fixed |

Provider-Scoped Registry Architecture

```
register(metadata)     →  key = "${provider}:${modelId}"
                           secondary: modelId → Set<provider>

get(modelId)           →  single provider? return it
                           multiple? return largest context window

get(modelId, provider) →  exact scoped lookup with normalization
                           (date-suffix strip, prefix match)
```
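
For illustration, the scoped lookup chain above reduces to roughly this shape in TypeScript. This is a minimal sketch; the real ModelRegistry adds the normalization chain (date-suffix strip, prefix match) and metadata caching:

```typescript
// Minimal sketch of the provider-scoped registry described above;
// class and field names are illustrative, not the actual API.
interface ModelMetadata {
  provider: string;
  modelId: string;
  contextWindow: number;
}

class ScopedRegistry {
  private byKey = new Map<string, ModelMetadata>();          // "${provider}:${modelId}"
  private providersByModel = new Map<string, Set<string>>(); // secondary index

  register(meta: ModelMetadata): void {
    this.byKey.set(`${meta.provider}:${meta.modelId}`, meta);
    const providers = this.providersByModel.get(meta.modelId) ?? new Set<string>();
    providers.add(meta.provider);
    this.providersByModel.set(meta.modelId, providers);
  }

  get(modelId: string, provider?: string): ModelMetadata | undefined {
    if (provider) return this.byKey.get(`${provider}:${modelId}`); // exact scoped lookup
    const providers = this.providersByModel.get(modelId);
    if (!providers) return undefined;
    // Unscoped and ambiguous: return the entry with the largest context window.
    let best: ModelMetadata | undefined;
    for (const p of providers) {
      const meta = this.byKey.get(`${p}:${modelId}`);
      if (meta && (!best || meta.contextWindow > best.contextWindow)) best = meta;
    }
    return best;
  }
}
```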

Bug fixed: getInferenceSpeed() now correctly returns 40 TPS for local providers (candle/ollama/sentinel) instead of assuming 1000 TPS (cloud) for any registry hit.

Consumer cleanup: Deleted 3 duplicated isLocal workaround patterns from ChatRAGBuilder, passed provider through 8 consumer files.

ModelCapabilities — The "Knowing" Layer

Defines everything needed for algorithmic model selection and LoRA genome paging:

  • QuantFormat — 14 quantization levels (FP32 → Q2_K, GPTQ, AWQ)
  • AdapterMethod — 8 PEFT techniques (LoRA, QLoRA, DoRA, IA3, prefix/prompt tuning)
  • AdapterTarget — 9 targetable transformer layers (attention Q/K/V/O, MLP, embeddings)
  • InferenceRuntime — 9 runtimes (Candle, llama.cpp, MLX, Ollama, vLLM, cloud)
  • ModelAdapterProfile — composite type combining quantization + fine-tuning + hardware
  • Query helpers (sketched below): isFineTunable(), supportsLoRA(), supportsAdapterStacking(), estimateAdapterVramMB(), fitsInVram()
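
A rough TypeScript sketch of how these helpers could compose the capability types. The field names, enum subset, and the MB-per-rank constant are assumptions, not the actual ModelCapabilities API:

```typescript
// Illustrative sketch only; shapes are assumptions based on the summary above.
type AdapterMethod = 'lora' | 'qlora' | 'dora' | 'ia3';

interface HardwareProfile {
  inferenceVramMB: number;
  trainingVramMB: number;
}

interface ModelAdapterProfile {
  supportedMethods: AdapterMethod[];
  maxConcurrentAdapters: number;
  hardware: HardwareProfile;
}

function isFineTunable(p: ModelAdapterProfile): boolean {
  return p.supportedMethods.length > 0;
}

function supportsLoRA(p: ModelAdapterProfile): boolean {
  return p.supportedMethods.includes('lora') || p.supportedMethods.includes('qlora');
}

// Rough rule of thumb: adapter VRAM cost scales with LoRA rank.
function estimateAdapterVramMB(rank: number, mbPerRank = 2): number {
  return rank * mbPerRank;
}

function fitsInVram(p: ModelAdapterProfile, adapterRanks: number[], budgetMB: number): boolean {
  const total = p.hardware.inferenceVramMB +
    adapterRanks.reduce((sum, r) => sum + estimateAdapterVramMB(r), 0);
  return total <= budgetMB;
}
```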

Test Results

  • 103 Rust pipeline tests passing
  • 58 TypeScript unit tests passing (12 provider-scoped, 21 ModelCapabilities, 25 existing)
  • Deployed and verified: Candle personas produce coherent output with correct context windows

Test plan

  • Rust commands route via ModuleRegistry (health-check)
  • TypeScript commands route via Unix socket (list, file/load)
  • Pipeline sentinel executes all 9 step types
  • Provider-scoped registry: same model on two providers returns correct context window
  • Unscoped lookup returns largest context window (backward compat)
  • getInferenceSpeed returns correct TPS for local vs cloud
  • ChatRAGBuilder workarounds removed, provider passed through
  • ModelCapabilities type system: all enums, interfaces, query helpers tested
  • Full deploy + AI persona chat verification

- Add SentinelRunner.ts with full step execution loop
- Support all step types: command, llm, condition, watch, sentinel, emit
- Implement loop control: once, count, until, while, continuous, event
- Variable substitution with $variable.path[0].property syntax
- Safety limits: maxIterations, timeoutMs
- Nested sentinel spawning with await support
- PipelineSentinelDefinition type with proper discriminator
- Standalone test suite validating engine mechanics
- Rust SentinelModule for process isolation (from previous work)
Features:
- LLM tool calling: parse ```tool JSON blocks, execute via Commands
- Parallel steps: concurrent execution with failFast option
- Event triggers: SentinelTriggerManager with debounce/throttle
- ParallelStep type for concurrent nested step execution

Fixes:
- Success determination: use startsWith('Completed') not includes
  (error messages containing 'completed' were incorrectly marked success)

Tests:
- Olympics validation: build-fix-loop and PR review patterns
- Structure validation for complex multi-step sentinels
- Condition branching and loop control validation
- Add 'pipeline' sentinel type to sentinel/run command
- Add PipelineSentinelParams interface for JSON pipeline definitions
- Wire SentinelRunner into sentinel command system
- Fix WorkerClient to include 'command' field for Rust IPC compatibility
- Fix LoggerModule to extract nested 'payload' field from WorkerClient
- Replace console.log in SentinelRunner with proper Logger system

The WorkerClient was sending 'type' field but Rust IPC expected 'command'.
The LoggerModule was expecting flat params but WorkerClient nested data
under 'payload'. Both patterns are now supported.

Live tested with multi-step pipelines via ./jtag sentinel/run --type=pipeline
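
As an illustration, a pipeline passed to sentinel/run might look like the following sketch. The step field names are assumptions modeled on the step types and substitution syntax listed above, not the exact schema:

```typescript
// Hypothetical pipeline definition; field names are illustrative.
const pipeline = {
  name: 'build-and-report',
  steps: [
    { type: 'shell', command: 'npm run build', timeoutMs: 120_000 },
    {
      type: 'condition',
      if: '$build.exitCode == 0',           // $variable.path substitution on prior output
      then: [{ type: 'emit', event: 'build/ok' }],
      else: [{ type: 'llm', prompt: 'Summarize this build failure: $build.stderr' }]
    }
  ],
  maxIterations: 10                          // safety limit, as described above
};

// Invoked via the CLI shown above: ./jtag sentinel/run --type=pipeline
```
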
…ucture

Pipeline execution now runs entirely in Rust SentinelModule:
- Shell steps with /bin/sh -c for commands with spaces
- LLM steps route through AIProviderModule
- Command steps route through ModuleRegistry
- Condition/loop steps with variable interpolation
- Step results written to steps.jsonl

Logging infrastructure properly integrated:
- Module logs: .continuum/jtag/logs/system/modules/sentinel.log
- Pipeline logs: .continuum/jtag/logs/system/sentinels/{handle}/
- LoggerModule resolve_log_path handles sentinel categories

TypeScript commands now route to Rust:
- sentinel/run passes pipeline JSON to Rust
- sentinel/logs/* use RustCoreIPCClient typed methods
- sentinel/status, sentinel/save route through Rust

Deleted ~7,776 lines of TypeScript sentinel infrastructure:
- AgentSentinel, BuildSentinel, OrchestratorSentinel
- TaskSentinel, VisualSentinel, SentinelRunner
- SentinelWorkspace, SentinelTrigger, SentinelLogWriter
- Associated tests

sentinel.rs at 2096 lines - needs decomposition (ironic).
… processes

This is foundational infrastructure for sentinels and other spawned tasks
to execute ANY command (Rust or TypeScript) without knowing where it's
implemented.

New: runtime/command_executor.rs
- CommandExecutor struct with registry + WebSocket TS bridge
- execute() returns CommandResult
- execute_json() convenience method for most use cases
- Global executor initialized at startup
- TS bridge via WebSocket to JTAGSystemServer (port 9001)

Integration:
- Initialized in ipc/mod.rs after runtime is ready
- SentinelModule uses execute_json() for command steps
- Sentinel no longer needs to know if command is Rust or TS

API:
```rust
// From anywhere in continuum-core:
let result = runtime::execute_command("code/edit", params).await?;
let json = runtime::execute_command_json("any/command", params).await?;
```

This enables sentinels to call file editing, screenshot, and other
TypeScript commands that were previously unreachable from Rust.
- Replace WebSocket-based TypeScript command routing with Unix socket
- Use existing CommandRouterServer at /tmp/jtag-command-router.sock
- CommandRouterServer now uses getCommandsInterface() for proper routing
- Sentinel command steps can now execute both Rust and TypeScript commands

Architecture:
- Rust commands: Route via ModuleRegistry (direct, 0ms)
- TypeScript commands: Route via Unix socket → CommandRouterServer → CommandDaemon
- Browser commands (screenshot, etc.) require CLI routing (architectural constraint)

Tested with multi-step pipeline: health-check (Rust) + help (TypeScript) + shell
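
For context, a minimal TypeScript client for such a socket could look like this sketch. It assumes a newline-delimited JSON request/response protocol, which is an assumption rather than the documented wire format:

```typescript
import * as net from 'node:net';

// Sketch of a client for the command router socket described above.
const SOCKET_PATH = '/tmp/jtag-command-router.sock';

function executeViaSocket(command: string, params: unknown): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const conn = net.createConnection(SOCKET_PATH, () => {
      conn.write(JSON.stringify({ command, params }) + '\n');
    });
    let buf = '';
    conn.on('data', (chunk) => {
      buf += chunk.toString('utf8');
      if (buf.includes('\n')) {              // assume one response per request
        conn.end();
        resolve(JSON.parse(buf.trim()));
      }
    });
    conn.on('error', reject);
  });
}
```
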
Copilot AI review requested due to automatic review settings on February 14, 2026 23:37

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements a complete Rust-based sentinel execution engine with universal command routing. The changes represent a significant architectural shift, moving pipeline execution from TypeScript to Rust and establishing a bidirectional command bridge between the two layers.

Changes:

  • New Rust CommandExecutor module for universal command routing (Rust-to-Rust direct, Rust-to-TypeScript via Unix socket)
  • Complete removal of TypeScript sentinel implementations (BuildSentinel, OrchestratorSentinel, VisualSentinel, TaskSentinel, AgentSentinel)
  • Refactored command implementations to delegate to Rust SentinelModule
  • Updated logging paths from .sentinel-workspaces/ to .continuum/jtag/logs/system/sentinels/
  • New comprehensive LoRA mesh distribution architecture documentation

Reviewed changes

Copilot reviewed 42 out of 43 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| command_executor.rs | New universal command executor with socket-based TypeScript bridge |
| logger.rs | Updated sentinel log paths, dual payload pattern support |
| sentinel.ts | Complete TypeScript bindings for Rust SentinelModule |
| CommandRouterServer.ts | Enhanced routing through JTAGSystemServer command interface |
| SentinelRunServerCommand.ts | Simplified to fire-and-forget Rust delegation |
| sentinel/status, logs/* | Migrated to Rust SentinelModule queries |
| Multiple .ts files | Removed legacy TypeScript sentinel implementations |
| LORA-MESH-DISTRIBUTION.md | New 830-line architecture document |
Files not reviewed (1)
  • src/debug/jtag/package-lock.json: Language not supported


Comment on lines +129 to +144:

```rust
static GLOBAL_EXECUTOR: std::sync::OnceLock<Arc<CommandExecutor>> = std::sync::OnceLock::new();

/// Initialize the global command executor (called once at startup)
pub fn init_executor(registry: Arc<ModuleRegistry>) {
    let log = super::logger("command-executor");
    let _ = GLOBAL_EXECUTOR.set(Arc::new(CommandExecutor::new(registry)));
    log.info(&format!("Initialized (TS bridge: {})", TS_COMMAND_SOCKET));
}

/// Get the global command executor
/// Panics if not initialized - this is intentional, executor MUST be initialized at startup
pub fn executor() -> Arc<CommandExecutor> {
    GLOBAL_EXECUTOR.get()
        .expect("CommandExecutor not initialized - call init_executor() at startup")
        .clone()
}
```
Copilot AI commented on Feb 14, 2026:
The global executor initialization using std::sync::OnceLock doesn't handle re-initialization gracefully. If init_executor is called twice, the second call is silently ignored (line 134 uses let _ =). This could mask bugs where the executor is initialized with the wrong registry.

Consider either:

  1. Panicking if already initialized (fail-fast)
  2. Returning a Result to indicate if initialization succeeded
  3. Adding a debug log when re-initialization is attempted

Comment on lines +563 to +572:

```rust
// WorkerClient sends data nested under "payload" field, extract it
// ORMRustClient sends data at top level - support both patterns
let payload_value = if let Some(nested) = params.get("payload") {
    nested.clone()
} else {
    params.clone()
};

let payload: WriteLogPayload =
    serde_json::from_value(payload_value).map_err(|e| format!("Invalid payload: {e}"))?;
```
Copilot AI commented on Feb 14, 2026:
The logger payload extraction logic at lines 563-569 creates an inconsistency where the same data can be passed either nested under "payload" or at the top level. This dual-pattern support increases complexity and makes the API ambiguous for callers.

Consider standardizing on one pattern across the codebase, or document this behavior clearly in the command schema.

Comment on lines +122 to +128:

```typescript
// Get JTAGSystemServer instance
const { JTAGSystemServer } = await import('../../../system/core/system/server/JTAGSystemServer');
const system = JTAGSystemServer.instance;

if (!system) {
  throw new Error('JTAGSystemServer not initialized');
}
```
Copilot AI commented on Feb 14, 2026:
The CommandRouterServer now requires JTAGSystemServer to be initialized before it can route commands (line 123-128). However, there's no explicit initialization order guarantee or error handling if the system isn't ready yet. If a Rust worker tries to execute a command before JTAGSystemServer is fully initialized, it will fail with "JTAGSystemServer not initialized".

Consider adding initialization ordering documentation or a startup sequencing mechanism to ensure the server is ready before workers connect.

Comment on lines +54 to +60:

```typescript
// Use sentinel/run which spawns a task for the pipeline
const result = await rustClient.sentinelRun({
  type: 'pipeline',
  command: 'pipeline', // Internal: tells Rust this is a pipeline
  args: [],
  workingDir,
  env: { PIPELINE_JSON: JSON.stringify(pipeline) },
```
Copilot AI commented on Feb 14, 2026:
The pipeline execution in sentinel/run (lines 55-61) passes the entire pipeline definition through an environment variable (PIPELINE_JSON). Large pipeline definitions could exceed environment variable size limits (typically 128KB on Linux, 32KB on Windows).

Consider passing the pipeline via a temporary file instead, or implementing a size check with an appropriate error message.

Suggested change:

```typescript
// Serialize pipeline and enforce a conservative env var size limit
const pipelineJson = JSON.stringify(pipeline);
const maxEnvSize = process.platform === 'win32' ? 32 * 1024 : 128 * 1024;
if (pipelineJson.length > maxEnvSize) {
  throw new Error(
    `Pipeline definition is too large to pass via environment variable (size=${pipelineJson.length} bytes, limit=${maxEnvSize} bytes). ` +
    'Please reduce the pipeline size or use a different execution mechanism that does not rely on environment variables.'
  );
}
// Use sentinel/run which spawns a task for the pipeline
const result = await rustClient.sentinelRun({
  type: 'pipeline',
  command: 'pipeline', // Internal: tells Rust this is a pipeline
  args: [],
  workingDir,
  env: { PIPELINE_JSON: pipelineJson },
```

Comment on lines +238 to +247:

```typescript
// NOTE: Rust IPC expects 'command' field, not 'type'
// The JTAGRequest interface uses 'type' but ORMRustClient uses 'command'
// We need to include both for compatibility
const request: WorkerRequest<TReq> & { command: string } = {
  id: generateUUID(),
  type,
  command: type, // Rust IPC looks for 'command' field
  timestamp: new Date().toISOString(),
  payload,
  userId: userId ?? this.defaultUserId
```
Copilot AI commented on Feb 14, 2026:
WorkerClient now duplicates the command in both 'type' and 'command' fields (lines 238-244) for Rust IPC compatibility. This creates redundancy and increases the chance of inconsistency if one field is updated but not the other.

Consider updating the Rust IPC layer to accept 'type' consistently, or create a mapping layer that doesn't require duplicating data in the request object.

Three root causes fixed in generate-command-schemas.ts:

1. extractDescription now finds NEAREST JSDoc block (not first),
   strips * prefixes, skips title-pattern lines like "Foo - Types"

2. readReadmeDescription reads first paragraph from command README.md
   as primary description source, falls back to cleaned JSDoc

3. deduplicateSchemas merges entries sharing a name — unions params,
   marks variant-only params optional, picks best description.
   sentinel/run: 7 entries → 1 with 26 merged params

Also: cleaned sentinel JSDoc, removed duplicate SentinelListParams
from SentinelLoadTypes.ts. Broken descriptions: 205 → 0.
Updated interface-level JSDoc across 35 command Types files that
had generic descriptions like "Parameters for X command" or missing
JSDoc entirely. Also fixed extractDescription to handle single-line
/** text */ JSDoc blocks. Result: 0/253 commands with weak descriptions.
…up stale refs

- Add persona autonomy philosophy and Three Pillars convergence (Sentinel + Genome + Academy)
- Rewrite Implementation Status for Rust-centric architecture, remove deleted TS class references
- Expand Olympics from 12 to 24 validation tasks across 12 categories
- Add TODO implementation roadmap with 6 prioritized phases (A-F)
- Clean up References section, remove dead links
…step signatures

New step types:
- Parallel: concurrent branch execution with context snapshots and failFast
- Emit: publish interpolated events on MessageBus for inter-sentinel composition
- Watch: block until matching event arrives (glob patterns, configurable timeout)
- Sentinel: execute nested pipelines inline (recursive composition)

Enhanced existing:
- Loop: 4 termination modes (count, while, until, continuous) with maxIterations safety limit
- Interpolation: named outputs ({{named.label.output}}), type-preserving JSON interpolation
- All 9 step handlers now take uniform PipelineContext for consistent registry/bus access

Pipeline engine is now ~90% complete with 9 step types covering sequential, conditional,
looping, parallel, event-driven, and nested composition patterns.
Tested all pipeline step types through unit tests and live execution:
- Shell: echo, nonzero exit, space-in-cmd passthrough, timeout, interpolation
- Condition: true/false branches, interpolated conditions, failing substeps
- Loop: count, while, until, continuous modes, iteration variable, safety limits
- Parallel: concurrent branches (timing-verified), failure propagation, multi-step
- Emit: bus publishing, event/payload interpolation, requires-bus guard
- Watch: event matching (exact/wildcard/segment), timeout, ignores non-matching
- Sentinel: nested pipelines, input inheritance/override, failure propagation
- Executor integration: linear pipelines, stop-on-failure, condition branching,
  loop+interpolation, parallel, emit+watch composition, nested sentinel,
  step output forwarding, empty pipeline, missing registry error

Also deleted 2 broken TypeScript test files referencing removed infrastructure
(SentinelExecutionLog, SentinelWorkspace).
…d, fix native tool passing

Phase 1: Extract AgentToolExecutor from PersonaToolExecutor
- Universal tool execution (corrections, loop detection, parsing, content cleaning)
- PersonaToolExecutor delegates to AgentToolExecutor, keeps persona-specific logic

Phase 2: ai/agent command — universal agentic loop
- generate → parse tool calls → execute tools → feed results → re-generate
- Model-adaptive: native JSON tools (Anthropic/OpenAI) or XML fallback (DeepSeek)
- Safety caps tiered by provider (25/10/5 iterations)
- Added to ADMIN_COMMANDS to prevent recursive persona self-invocation

Phase 3: Rust LLM step dual-mode routing
- agentMode=false (default): fast in-process Rust call to ai/generate
- agentMode=true: routes to TypeScript ai/agent via CommandExecutor IPC
- Added tools, agentMode, maxIterations fields to PipelineStep::Llm

Bug fixes discovered during verification:
- Fix NativeToolSpec serde: remove rename_all=camelCase that broke tool deserialization
  (input_schema was silently dropped because Rust expected inputSchema)
- Fix refreshToolDefinitions: pass includeDescription+includeSignature to list command
  (tool definitions had empty descriptions and no parameters)
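
In outline, the agentic loop described in Phase 2 follows this shape. This is a schematic sketch; all function names here are placeholders, not the real ai/agent internals:

```typescript
// Schematic of the generate → parse tools → execute → feed-back loop.
interface ToolCall { name: string; params: Record<string, unknown>; }

async function agentLoop(
  prompt: string,
  generate: (messages: string[]) => Promise<string>,
  parseToolCalls: (text: string) => ToolCall[],
  executeTool: (call: ToolCall) => Promise<string>,
  maxIterations = 10   // safety cap; tiered 25/10/5 by provider in the real system
): Promise<string> {
  const messages = [prompt];
  for (let i = 0; i < maxIterations; i++) {
    const response = await generate(messages);
    const calls = parseToolCalls(response);
    if (calls.length === 0) return response;      // no tools requested: done
    for (const call of calls) {
      const result = await executeTool(call);
      messages.push(`Tool ${call.name} returned: ${result}`); // feed results back
    }
  }
  throw new Error(`Agent loop exceeded ${maxIterations} iterations`);
}
```
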
…ommands

The ai_provider Rust module claims the "ai/" prefix, intercepting
TypeScript-implemented commands like ai/agent. Using the standard
command_executor::execute() caused infinite recursion (registry routes
back to ai_provider → stack overflow crashing continuum-core).

Added execute_ts/execute_ts_json to CommandExecutor that go directly
to the TypeScript Unix socket, bypassing the Rust ModuleRegistry.
Updated ai_provider.rs and sentinel llm.rs to use these methods.

Verified: 103 sentinel tests pass, sentinel agentMode LLM step
completes end-to-end (Claude calls data/list tool, returns results).
Phase 1 of god class decomposition. PersonaResponseGenerator.calculateSimilarity,
jaccardSimilarity, checkSemanticLoop and PersonaMessageEvaluator.computeTextSimilarity
replaced with Rust IPC calls through RustCognitionBridge.

New: persona/text_analysis module (similarity.rs, types.rs) with 20 unit tests.
New: cognition/text-similarity and cognition/check-semantic-loop IPC commands.
Net: -164 lines TS algorithm code, +642 lines (Rust impl + tests + bridge + IPC).
…, truncated tool, semantic)

Adds garbage_detection.rs (8 checks ported from GarbageDetector.ts) and
loop_detection.rs (per-persona DashMap state) to the Rust text_analysis module.
PersonaResponseGenerator drops ~230 lines of inline validation, replaced by
a single cognition/validate-response IPC call returning ValidationResult.

Fix UTF-8 panic: find_consecutive_repeat now uses byte-level comparison
instead of string slicing, preventing char boundary panics on emoji/multibyte.
…s all 13 Rust modules

- New utils/params.rs: Params<'a> wrapper with typed extraction methods (str, uuid, u64, f64, bool, json<T>, aliases)
- Migrated all 13 ServiceModule implementations from manual params.get().and_then() chains to Params helper
- Eliminated ~160 manual extraction patterns across code.rs, voice.rs, memory.rs, ai_provider.rs, channel.rs, search.rs, mcp.rs, agent.rs, embedding.rs, models.rs, sentinel/mod.rs, sentinel/logs.rs, cognition.rs
- Phase 3 text_analysis: mention_detection.rs, response_cleaning.rs, validation.rs moved from TS to Rust
- Net: -726 lines, zero params.get() in IPC command handlers (3 intentional non-IPC uses remain)
- Add CommandResult::json() helper to replace serde_json::to_value().unwrap() pattern
- Fix 15 to_value unwraps in data.rs, 17 lock/serde unwraps in logger.rs
- Replace all Mutex/RwLock .lock().unwrap() with poison-recovering .unwrap_or_else()
- Fix NaN-unsafe partial_cmp().unwrap() in embedding.rs, metrics.rs
- Fix child.stdout/stderr.take().unwrap() in sentinel executor
- Fix OnceLock/Array shape unwraps in voice STT/TTS modules
- Fix FFI unwraps with proper error returns
- Harden Params helper: u32 overflow protection, add bool_opt/i64_or/f64/f32
- Static LazyLock regexes for agent.rs, interpolation.rs (was per-call)
- Remove duplicate regex in garbage_detection, dead code in response_cleaning
- Compress channel.rs InboxMessage parsing from 50 lines to 14 via Params

30 files changed, 725 tests pass, zero warnings
- Delete GarbageDetector.ts (488 lines) — replaced by Rust garbage_detection.rs
- Remove dead checkResponseRedundancy method + disabled redundancy gate (103 lines)
- Fix ai/sleep: add params.userId to identity chain (tool executor sets this)
Three duplicate type systems (Rust ai/types.rs, TS AIProviderTypesV2.ts,
TS AdapterTypes.ts) caused a production bug where --provider="groq"
silently routed to DeepSeek because TS used preferredProvider while
Rust expected provider.

- Fix Rust u64 fields generating bigint: add #[ts(type = "number")]
- Rewrite AIProviderTypesV2.ts as re-export layer from generated types
- Rewrite AdapterTypes.ts to re-export from unified source
- Rename preferredProvider → provider across 15 files
- Rename responseTime → responseTimeMs across 30 files
- Rename supportsFunctions → supportsTools across 15 adapter configs
- Simplify AIProviderRustClient: remove RustAIResponse, direct passthrough
- Fix ToolResult field names: tool_use_id → toolUseId, is_error → isError
- Fix ContentPart tool_result is_error: boolean | null compatibility

Verified: clean build, npm start, ping healthy, provider routing works
for both anthropic and groq, AI personas responding in chat.
SystemDaemon: ORM returns plain objects, not class instances.
configCache.get() failed because the POJO has no prototype methods.
Fix: Object.assign(new SystemConfigEntity(), data) to hydrate.

PersonaAutonomousLoop: 10+ personas each polling tasks every 10s
= 1+ query/second hammering an empty collection, causing cascading
timeouts. Fix: increase intervals (60s/60s/120s) + stagger starts
with random 0-15s offset to prevent thundering herd.
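
The stagger fix amounts to jittering each persona's first poll, roughly as follows (a minimal sketch, not the actual PersonaAutonomousLoop code):

```typescript
// Staggered-start pattern: a random 0-15s offset before the first poll
// prevents all personas from hitting the task collection at once.
function startPolling(pollFn: () => void, intervalMs: number): void {
  const jitterMs = Math.random() * 15_000;
  setTimeout(() => {
    pollFn();
    setInterval(pollFn, intervalMs);
  }, jitterMs);
}

// e.g. startPolling(pollTasks, 60_000) for the new 60s task-poll cadence
```
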
Replace 42 TypeScript setIntervals (14 personas x 3 timers) with ONE
Rust tick loop in ChannelModule. All background scheduling now runs in
Rust: task polling, self-task generation, and training readiness checks.

Rust changes:
- Add tick_interval field to ModuleConfig (ServiceModule trait)
- Wire Runtime.start_tick_loops() to spawn tokio tasks for tick-enabled modules
- ChannelModule.tick() polls tasks, generates self-tasks, checks training
- New SelfTaskGenerator struct in persona module (451 lines, 5 tests)

TS cleanup (-937 lines):
- PersonaAutonomousLoop: 345 → 188 lines (thin signal-based service loop)
- Delete PersonaCentralNervousSystem.ts, CNSFactory.ts, SelfTaskGenerator.ts
- Remove dead CNS callback methods from PersonaUser.ts
- All cognition preserved: Rust engine handles priority, fast-path, scheduling
Expose tick loop configuration to TypeScript via ts-rs generated type.
Runtime tick loop now re-reads interval each iteration (sleep-based,
not fixed interval), allowing dynamic adjustment via channel/tick-config.

- ChannelTickConfig struct with tick_interval_ms, enable flags, threshold
- channel/tick-config command for runtime get/set (100ms floor)
- Runtime tick loop uses sleep + config re-read (supports dynamic cadence)
- Generated ChannelTickConfig.ts in shared/generated/runtime/
Consolidates response_cap, mention detection, rate limiting, sleep mode,
directed mention filter, and fast-path decision into single cognition/full-evaluate
Rust command. Removes ~250 lines of sequential async TS gating logic.

New Rust module persona/evaluator.rs with 19 tests. Dual-write pattern keeps
TS sleep/rate-limiter state in sync during migration. AI personas verified
responding through new gate path.
New persona/model_selection.rs with domain-to-trait mapping and
adapter priority chain (trait → current → any → base_model).
12 Rust tests. Replaces getEffectiveModel() + determineRelevantTrait()
in PersonaResponseGenerator (~75 lines → 10 lines).

Adapter registry synced from TS genome at init. cognition/select-model
and cognition/sync-adapters IPC commands added.
… Rust

ONE async Rust IPC call replaces 3 separate sync TS calls (parse + correct + strip).
68 new Rust tests covering Anthropic XML, function-style, bare JSON, markdown,
old-style XML formats plus parameter correction, tool name codec, and integration.
GenomePagingEngine with 22 tests: eviction scoring (age/priority*10),
critical adapter protection (priority>0.9), memory budget enforcement,
and skill activation decisions. Rust decides what to evict/load,
TypeScript executes GPU operations. PersonaGenome now delegates to
Rust when bridge is available, falls back to local logic for tests.
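
Read literally, the eviction scoring could look like this sketch; the exact expression and field names in GenomePagingEngine may differ:

```typescript
// Illustrative reading of the eviction policy: older, lower-priority
// adapters score higher; critical adapters (priority > 0.9) are protected.
interface LoadedAdapter { id: string; priority: number; lastUsedMs: number; sizeMB: number; }

function pickEvictionCandidate(adapters: LoadedAdapter[], now: number): LoadedAdapter | undefined {
  let best: LoadedAdapter | undefined;
  let bestScore = -Infinity;
  for (const a of adapters) {
    if (a.priority > 0.9) continue;               // critical adapters never evicted
    const age = now - a.lastUsedMs;
    const score = age / (a.priority * 10);        // assumed form of age/priority*10
    if (score > bestScore) { bestScore = score; best = a; }
  }
  return best;
}
```
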
Phase 5 of persona decision migration. The post-inference adequacy check
(did another AI already answer?) now runs as a single Rust IPC call instead
of N separate textSimilarity calls. Rust handles length filtering (>100 chars)
and Jaccard n-gram similarity (>0.2 threshold) internally.

All 5 phases complete:
1. Unified evaluation gate (5 TS gates → 1 Rust call)
2. Model selection (4-tier priority chain)
3. Tool call parsing (5 format adapters)
4. Genome paging (LRU eviction + memory budget)
5. Post-inference adequacy (batch similarity check)
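
A minimal sketch of the adequacy check as described; the 100-char filter and 0.2 threshold come from the commit message, while the n-gram size is an assumption:

```typescript
// Word n-gram Jaccard similarity against prior responses.
function wordNgrams(text: string, n = 3): Set<string> {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const grams = new Set<string>();
  for (let i = 0; i + n <= words.length; i++) grams.add(words.slice(i, i + n).join(' '));
  return grams;
}

function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let inter = 0;
  for (const g of a) if (b.has(g)) inter++;
  return inter / (a.size + b.size - inter);
}

function alreadyAnswered(draft: string, priorResponses: string[]): boolean {
  const draftGrams = wordNgrams(draft);           // computed once, reused per comparison
  return priorResponses
    .filter((r) => r.length > 100)                // length filter from the commit message
    .some((r) => jaccard(draftGrams, wordNgrams(r)) > 0.2);
}
```
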
…t analysis optimization

Phase A: Unified Per-Persona State
- Created PersonaCognition struct (persona/unified.rs) aggregating engine, inbox,
  rate_limiter, sleep_state, adapter_registry, genome_engine
- CognitionState now holds single DashMap<Uuid, PersonaCognition> instead of 7 separate maps
- Single lock per command, atomic cross-field access, better cache locality
- Updated all ~18 command handlers across cognition module, channel module, and IPC server

Phase B: Eliminate Dual Response Tracking
- Removed TS rateLimiter.trackResponse() from PersonaMessageEvaluator (Rust sole authority)
- Added cognition/has-evaluated and cognition/mark-evaluated IPC commands
- Added hasEvaluatedMessage/markMessageEvaluated to RustCognitionBridge
- PersonaUser chat dedup stays TS-local (separate concern from CognitionEngine pipeline dedup)

Phase C: Dead Code Removal (-250 lines net)
- Removed dead evaluated_messages HashSet from RateLimiterState in evaluator.rs
- Removed dead AIDecisionService imports from PersonaMessageEvaluator
- Gutted TS RateLimiter: removed trackResponse, isRateLimited, getResponseCount,
  hasReachedResponseCap, getRateLimitInfo, resetRoom (all now in Rust)
- Updated RateLimiter.test.ts to match (337→112 lines)
- Removed dead comment blocks and fallback code

Phase D: Text Analysis Optimization
- Added build_word_ngrams() and jaccard_from_sets() for compute-once reuse
- check_semantic_loop: response ngrams computed once, reused across N history comparisons
- check_response_adequacy: original ngrams computed once, reused across N response comparisons
Tool definitions and formatted memories were appended to the system prompt
AFTER the RAG budget was calculated, causing unbounded context growth that
crashed local models with NaN/Inf errors and emergency truncation.

- New ToolDefinitionsSource (priority 45, 10% budget): handles native JSON
  tool specs and XML tool definitions within the budget system
- SemanticMemorySource now produces systemPromptSection with formatted
  memories instead of PersonaResponseGenerator doing it as a bypass
- PersonaResponseGenerator: deleted 3 bypass blocks (~100 lines), now reads
  tool specs and memories from RAG metadata only
- RAGSourceContext/RAGBuildOptions carry provider + toolCapability so
  tool-aware sources know what format to produce
- ToolFormatAdapter: default toolCapability changed from 'none' to 'xml'
  so all models get tools (budget truncation handles tight contexts)
- Budget rebalanced across 11 sources to total 100%
- Fixed <command> → <cmd> bug in PersonaToolDefinitions example
PersonaIdentitySource: 5-line stub → full rich prompt with room context,
member list, self-awareness block, response format rules, meta-awareness.
Previously AIs didn't know who was in the room or how to format responses.

ToolDefinitionsSource: 4-tier priority ordering (critical > essential > rest)
with sub-ordering by prefix (chat > code > decision > data > ai). Budget
truncation now drops least-important tools instead of keeping alphabetically
first. Native providers get 3000 token minimum (17 tools instead of 2).
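
A sketch of that ordering as a sort key; tier membership and the prefix ranks are assumptions drawn from the commit message:

```typescript
// 4-tier priority (critical=0 first), then prefix rank, then name, so that
// budget truncation drops the least important tools from the tail.
const PREFIX_RANK = ['chat', 'code', 'decision', 'data', 'ai'];

function toolSortKey(name: string, tier: number): [number, number, string] {
  const prefix = name.split('/')[0];
  const prefixRank = PREFIX_RANK.indexOf(prefix);
  return [tier, prefixRank === -1 ? PREFIX_RANK.length : prefixRank, name];
}

function orderTools(tools: { name: string; tier: number }[]): string[] {
  return tools
    .map((t) => ({ t, key: toolSortKey(t.name, t.tier) }))
    .sort((a, b) =>
      a.key[0] - b.key[0] || a.key[1] - b.key[1] || a.key[2].localeCompare(b.key[2]))
    .map((x) => x.t.name);
}
```
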
Fix protocol mismatch: when native-capable providers (Groq, Together,
Fireworks) output tool calls in text instead of structured tool_calls,
the system now synthesizes native format instead of falling to XML path.

- PersonaResponseGenerator + AiAgentServerCommand: branching fix —
  text-parsed tool calls from native providers route through native
  protocol (tool_use + tool_result content blocks), not XML
- ToolFormatAdapter: coerceParamsToSchema() fixes type mismatches
  (string "true" → boolean) so APIs don't reject tool_use blocks
- Rust parsers.rs: improved function-style format detection
- PromptCapture integration for replay/debugging
- RAG audit dump replaced with concise structured logging
- PromptCapture: JSONL-based prompt/response capture for replay and
  debugging — captures complete LLM context per persona per inference
- PersonaIdentitySource: richer identity prompts with personality traits
- ConversationHistorySource: improved message windowing and context
- ToolDefinitionsSource: budget-aware tool injection refinements
- ChatRAGBuilder: cleaner composition pipeline
- Rust garbage_detection: expanded patterns for AI-generated noise
- CLAUDE.md: documentation updates
Candle inference:
- Remove 3 artificial limits from candle_adapter.rs (temp clamp, char/token truncation)
- Improve quantized generation with logits sanitization and NaN detection
- Set Candle context windows to 1400 tokens (empirically validated safe threshold at 1000)

RAG budget (ChatRAGBuilder):
- Small-context guard: skip non-essential system prompt injections for models with <1500 token budget
- Provider-aware context window lookups in calculateSafeMessageCount and calculateAdjustedMaxTokens
- Reduces Candle persona prompts from ~11K chars to ~3.8K chars (~860 tokens, safely under 1000 threshold)

Tool calling:
- Rust parser improvements for text-embedded tool calls (function= format from Groq/Together)
- Tool definitions overhaul with proper parameter names and descriptions
- ToolFormatAdapter improvements for cross-provider compatibility

AI provider:
- registerLocalModels() in AIProviderDaemonServer for proper Candle model registration
- Provider-aware model config propagation through RAG pipeline

RAG sources:
- CodeToolSource, ConversationHistorySource improvements
- PersonaIdentitySource budget-aware progressive inclusion
- ToolDefinitionsSource refinements

Verified: All 4 Candle personas produce coherent output, all cloud providers (Groq, Together, DeepSeek, Fireworks) responding correctly. NaN threshold test confirms 1000-token boundary.
…isions

ModelRegistry now uses provider:modelId as internal key instead of just
modelId. Same model on different providers (e.g., Llama-3.1-8B at 1400
tokens on Candle vs 131072 on Together) no longer collide via last-writer-wins.

- ModelRegistry: provider-scoped keys, secondary index, getAll(), scoped
  and unscoped resolution (unscoped returns largest context window)
- ModelContextWindows: added provider? param to all 7 exported functions
- Fixed getInferenceSpeed bug: local providers (candle/ollama/sentinel)
  no longer incorrectly return 1000 TPS when found in registry
- ChatRAGBuilder: deleted 3 isLocal workaround blocks, removed
  MODEL_CONTEXT_WINDOWS import, passes provider to all utility calls
- RAGBudgetManager: provider param on constructor + static methods
- GovernanceSource, ActivityContextSource: pass provider to lookups
- RAG budget/load commands + MediaResize: added provider to params
- 12 new tests (37 total passing) covering scoped lookups, ambiguity
  resolution, inference speed classification, and isSlowLocalModel
…ization

Defines the complete capability vocabulary for algorithmic model selection:

- QuantFormat enum: FP32 to Q2_K, GPTQ, AWQ
- WeightFormat enum: GGUF, SafeTensors, MLX, PyTorch
- AdapterMethod enum: LoRA, QLoRA, DoRA, IA3, prefix/prompt tuning, full
- AdapterTarget enum: attention Q/K/V/O, MLP gate/up/down, embedding, lm_head
- InferenceRuntime enum: Candle, llama.cpp, MLX, Ollama, vLLM, Transformers
- Accelerator enum: Metal, CUDA, ROCm, CPU, cloud

Composite types:
- QuantizationProfile: format + bits + can-train-in-quantized (QLoRA flag)
- LoRAProfile: max/recommended rank, alpha, concurrent adapters, stacking
- FineTuningProfile: supported methods + LoRA config + gradient checkpointing
- HardwareProfile: VRAM (inference + training), measured TPS, offload layers
- ModelAdapterProfile: top-level composite attached to ModelMetadata

Query helpers: isFineTunable(), supportsLoRA(), supportsAdapterStacking(),
estimateAdapterVramMB(), fitsInVram()

21 new tests (58 total) covering all helpers, enum values, and registry
integration (filtering models by fine-tunability across providers).
joelteply changed the title from "Rust Sentinel Module: Complete pipeline execution engine with universal command routing" to "Rust Sentinel Module + Provider-Scoped ModelRegistry + ModelCapabilities Type System" on Feb 17, 2026
joelteply merged commit 75a1bcb into main on Feb 17, 2026
2 of 5 checks passed
joelteply deleted the feature/rust-sentinel-module branch on February 17, 2026 00:30