❤️ (VCS/VTC) Add Lifestream Guard for failures #23

@OppaAI

⚠️ Issue: Lifestream Stagnation (Zombie Data Persistence)

The system currently assumes that as long as the PumpNode is alive, the data coming from the Jetson is fresh. That assumption breaks if the jtop daemon hangs or the hardware communication bus (I2C/SMBus) stalls: jtop.ok() may return False, or the data values may simply stop updating while the loop continues to run. In both cases the Pump keeps passing along the last known values as if they were current.

🎯 Location:

robot/vtc/pump.py -> LO flow timer (Heartbeat check)
robot/vtc/pump.py -> state machine transitions

🦠 Symptoms:

  • The Regulator continues to pulse the heart based on old "frozen" temperatures.
  • UI vitals appear "flatlined" but at a normal value (e.g., exactly 42.0°C for 10 minutes).
  • No warning is issued when the underlying hardware telemetry source is disconnected.
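
The "flatlined" symptom can occur even while the source still reports healthy, so it may be worth checking how long a reading has stayed unchanged. A minimal sketch of that idea follows. The class name `FlatlineDetector`, the `max_unchanged_s` threshold, and the injectable `clock` are all hypothetical and not part of the existing pump.py.

```python
import time
from typing import Callable, Optional


class FlatlineDetector:
    """Flags a reading that has not changed for too long (hypothetical helper)."""

    def __init__(
        self,
        max_unchanged_s: float = 30.0,  # assumed threshold, tune per sensor
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_unchanged_s = max_unchanged_s
        self.clock = clock
        self._last_value: Optional[float] = None
        self._last_change: float = clock()

    def update(self, value: Optional[float]) -> bool:
        """Record a sample; return True if the value has been frozen too long."""
        now = self.clock()
        if value != self._last_value:
            # The reading moved, so restart the "unchanged" timer.
            self._last_value = value
            self._last_change = now
        return (now - self._last_change) >= self.max_unchanged_s
```

A genuinely stable temperature would also trip this check, so the threshold needs to be generous, and the result is probably best treated as one input to the guard rather than a failure on its own.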

🩺 Diagnosis:

A Watchdog Pattern is missing. A "Lifestream Guard" is required to monitor the health of the connection to the Jetson hardware. If the source (jtop) becomes unresponsive or reports an unhealthy status, the Pump must stop pretending everything is fine and signal a system-wide warning.

💡 Proposal:

The "Lifestream Guard" Watchdog
Implement a health check within the LO (low-frequency) flow to validate the hardware connection and manage state transitions.

  • Consecutive Failure Counter: Track how many times jtop.ok() returns False.
  • Grace Period: Tolerate 1 or 2 missed ticks (to account for momentary CPU spikes), but trigger an alert after N consecutive failures, where N is a configurable threshold.
  • State Demotion: If the threshold is hit, demote self.state from RUNNING to DEGRADED.
  • Data Invalidation: When in a DEGRADED state, the Pump should inject None into the deques to ensure the Regulator and Display know the data is no longer trustworthy.
  • Related Issue: ❤️ (VCS/VTC) None vs Value Consistency #22. The two can be implemented together.
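
The proposal above could be sketched roughly as follows. This is not the actual pump.py code: `LifestreamGuard`, `PumpState`, `max_failures`, and the `is_healthy` callable (standing in for a call such as jtop.ok()) are hypothetical names chosen for illustration. The sketch also assumes the Pump is promoted back to RUNNING on the first healthy tick, which the issue does not specify.

```python
from collections import deque
from enum import Enum
from typing import Callable, Deque, Optional


class PumpState(Enum):
    RUNNING = "RUNNING"
    DEGRADED = "DEGRADED"


class LifestreamGuard:
    """Watchdog sketch: counts consecutive health-check failures."""

    def __init__(
        self,
        is_healthy: Callable[[], bool],  # stand-in for the jtop health check
        max_failures: int = 3,           # assumed value for N
        history: int = 60,               # assumed deque length
    ) -> None:
        self.is_healthy = is_healthy
        self.max_failures = max_failures
        self.failures = 0
        self.state = PumpState.RUNNING
        self.temps: Deque[Optional[float]] = deque(maxlen=history)

    def tick(self, reading: Optional[float]) -> PumpState:
        """Run once per LO tick with the latest reading."""
        if self.is_healthy():
            # Assumption: a single healthy tick is enough to recover.
            self.failures = 0
            self.state = PumpState.RUNNING
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Grace period exhausted: demote the state.
                self.state = PumpState.DEGRADED
        # Data invalidation: while DEGRADED, store None instead of the reading.
        trusted = self.state is PumpState.RUNNING
        self.temps.append(reading if trusted else None)
        return self.state
```

In this sketch, readings taken during the grace period are still stored as values. Whether those should also be invalidated is a design choice that ties into issue #22.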

Labels: enhancement (New feature or request)

Status: Todo