📝 Walkthrough
This change replaces the engine's heartbeat-based health monitoring mechanism with a dedicated health-probe mechanism. A new
Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@vllm/engine/multiprocessing/engine.py`:
- Around line 232-235: The error message in the RPC request handling branch
prints the literal "{request}" because the f-string prefix is missing; update
the raise in the else branch that currently references RPCHealthRequest and
_handle_health_request to use an f-string (or otherwise format the request) so
the actual request object is interpolated into the ValueError message (e.g.,
change "Unknown RPCRequest Type: {request}" to an f-string including request).
- Around line 282-288: In _handle_health_request, when self._errored_with is not
None the code calls self._send_unhealthy(self._errored_with) but then continues
to call self.engine.check_health() which can mask or overwrite the original
error; modify _handle_health_request so that after calling
_send_unhealthy(self._errored_with) it immediately returns (i.e., add an early
return) to avoid further health checks and preserve the original error context;
reference: function _handle_health_request, attribute _errored_with, methods
_send_unhealthy, engine.check_health, and _send_healthy.
🧹 Nitpick comments (1)
vllm/engine/multiprocessing/__init__.py (1)
46-48: Consider adding the `@dataclass` decorator for consistency.
Other RPC request types in this file (`RPCProcessRequest`, `RPCError`, `RPCAbortRequest`) use the `@dataclass` decorator. While `RPCHealthRequest` has no fields, adding the decorator would maintain consistency and provide standard `__eq__`, `__repr__`, etc.
Suggested change
+@dataclass
 class RPCHealthRequest:
     pass
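To make the consistency argument concrete, here is a minimal sketch comparing a bare class with a field-less dataclass. `PlainHealthRequest` and `DataHealthRequest` are hypothetical stand-ins used only for illustration; they are not names from the PR.

```python
from dataclasses import dataclass


class PlainHealthRequest:
    """Hypothetical stand-in: a bare class with no decorator."""


@dataclass
class DataHealthRequest:
    """Hypothetical stand-in: the same empty class with @dataclass applied."""


# A bare class falls back to identity comparison, so two instances differ.
plain_equal = PlainHealthRequest() == PlainHealthRequest()

# A field-less dataclass gets a generated __eq__ that compares (empty) field tuples.
data_equal = DataHealthRequest() == DataHealthRequest()

# The generated __repr__ prints the class name followed by its (empty) field list.
data_repr = repr(DataHealthRequest())
```

Whether value equality matters for a marker type like this depends on how the request objects are compared elsewhere, so the suggestion is mainly about uniformity.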
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tests/mq_llm_engine/test_error_handling.py
- vllm/engine/multiprocessing/__init__.py
- vllm/engine/multiprocessing/client.py
- vllm/engine/multiprocessing/engine.py
🧰 Additional context used
🧬 Code graph analysis (3)
tests/mq_llm_engine/test_error_handling.py (4)
- vllm/engine/multiprocessing/client.py (1): abort (340-345)
- vllm/engine/protocol.py (1): abort (59-64)
- vllm/engine/async_llm_engine.py (1): abort (969-985)
- vllm/engine/multiprocessing/__init__.py (1): MQEngineDeadError (20-21)
vllm/engine/multiprocessing/engine.py (6)
- vllm/engine/multiprocessing/__init__.py (1): RPCError (35-38)
- vllm/engine/multiprocessing/client.py (1): check_health (351-358)
- vllm/engine/llm_engine.py (1): check_health (1637-1640)
- vllm/engine/protocol.py (1): check_health (91-93)
- vllm/engine/async_llm_engine.py (1): check_health (1026-1034)
- vllm/executor/cpu_executor.py (1): check_health (286-290)
vllm/engine/multiprocessing/client.py (1)
- vllm/engine/multiprocessing/__init__.py (5): RPCError (35-38), RPCHealthRequest (46-47), RPCProcessRequest (25-31), RPCStartupRequest (50-51), RPCStartupResponse (55-56)
🔇 Additional comments (11)
vllm/engine/multiprocessing/__init__.py (1)
59-60: LGTM! The `RPC_REQUEST_T` union is correctly updated to include `RPCHealthRequest`, aligning with the new health-check mechanism used by the engine and client.
vllm/engine/multiprocessing/engine.py (3)
18-20: LGTM! The import statement correctly includes `RPCHealthRequest` to support the health-check mechanism.
94-96: LGTM! The health socket is correctly configured as a PUSH socket to send health status messages to the client.
296-303: LGTM! The `_send_healthy` and `_send_unhealthy` methods correctly use the `health_socket` to communicate health status back to the client.
vllm/engine/multiprocessing/client.py (4)
23-25: LGTM! The import statement correctly includes `RPCHealthRequest` to support the health-check mechanism on the client side.
98-100: LGTM! The health socket is correctly configured as a PULL socket to receive health status acknowledgments from the engine.
127-154: LGTM! The health check loop correctly implements the probe mechanism:
- Sends `RPCHealthRequest` when the poll times out (periodic probing)
- Handles unprompted health status messages from the server
- Properly catches `CancelledError` for graceful shutdown
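The three behaviors listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `client.py`: `asyncio.Queue` objects stand in for the ZeroMQ sockets, `asyncio.wait_for` stands in for polling with a timeout, and `HealthRequest`, `inbox`, `outbox`, and `errors` are hypothetical names.

```python
import asyncio


class HealthRequest:
    """Hypothetical stand-in for RPCHealthRequest."""


async def run_check_health_loop(inbox, outbox, timeout, errors):
    """Probe whenever the poll times out; record any error that arrives."""
    try:
        while True:
            try:
                # Stand-in for polling the health socket with a timeout.
                message = await asyncio.wait_for(inbox.get(), timeout)
            except asyncio.TimeoutError:
                # Nothing arrived in time: send a probe and poll again.
                await outbox.put(HealthRequest())
                continue
            # A message arrived, either a probe reply or an unprompted report.
            if isinstance(message, BaseException):
                errors.append(message)
    except asyncio.CancelledError:
        # Cancellation is treated as the normal shutdown path for this task.
        pass


async def demo():
    inbox, outbox, errors = asyncio.Queue(), asyncio.Queue(), []
    task = asyncio.create_task(
        run_check_health_loop(inbox, outbox, 0.05, errors))
    # The first probe appears once the initial poll times out.
    probe = await asyncio.wait_for(outbox.get(), 1.0)
    # Simulate an unprompted unhealthy report from the engine side.
    await inbox.put(RuntimeError("engine died"))
    await asyncio.sleep(0.02)
    task.cancel()
    await task
    return probe, errors
```

Running `asyncio.run(demo())` returns the first probe and the list of recorded errors. Swallowing `CancelledError` inside the loop is a choice made for this sketch so that `await task` returns cleanly.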
236-237: LGTM! The setup correctly starts the `run_check_health_loop` task with the appropriate timeout.
tests/mq_llm_engine/test_error_handling.py (3)
156-161: LGTM! The refactored test correctly schedules the abort to occur during generation, ensuring proper timing for error observation. Using an async task with a 2-second delay provides deterministic timing for the test scenario.
167-171: LGTM! Increasing `max_tokens` to 2000 ensures the generation runs long enough for the scheduled abort to trigger during execution, making the test more reliable.
175-176: LGTM! Awaiting `abort_task` ensures the abort completes and the error state is fully propagated before the subsequent `check_health()` assertion, preventing race conditions in the test.
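The test pattern described in these comments (schedule a delayed failure, run a long generation, then await the scheduled task before asserting) can be sketched roughly as follows. `generate`, `fail_later`, and the `state` dictionary are hypothetical stand-ins, and the delays are shortened; the real test drives the engine client rather than a shared dictionary.

```python
import asyncio


async def generate(steps, state):
    """Hypothetical stand-in for a long generation; stops once an error is set."""
    for _ in range(steps):
        if state["error"] is not None:
            raise state["error"]
        await asyncio.sleep(0.01)
    return "finished"


async def fail_later(delay, state):
    """Stand-in for the delayed abort that leaves the engine in an errored state."""
    await asyncio.sleep(delay)
    state["error"] = RuntimeError("simulated engine failure")


async def scenario():
    state = {"error": None}
    # Schedule the failure first so it lands while generation is still running.
    abort_task = asyncio.create_task(fail_later(0.05, state))
    try:
        outcome = await generate(1000, state)
    except RuntimeError as exc:
        outcome = str(exc)
    # Await the task so its side effects are complete before asserting on state.
    await abort_task
    return outcome, state["error"] is not None
```

The generation is made long (1000 steps here, `max_tokens=2000` in the PR) so that the scheduled failure arrives before the work finishes on its own.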
 elif isinstance(request, RPCHealthRequest):
     self._handle_health_request()
 else:
-    raise ValueError("Unknown RPCRequest Type: "
-                     f"{type(request)}")
+    raise ValueError("Unknown RPCRequest Type: {request}")
Missing f-string prefix causes incorrect error message.
Line 235 is missing the f prefix, so {request} will be printed literally instead of the actual request value.
Proposed fix
- raise ValueError("Unknown RPCRequest Type: {request}")
+ raise ValueError(f"Unknown RPCRequest Type: {request}")
📝 Committable suggestion
elif isinstance(request, RPCHealthRequest):
    self._handle_health_request()
else:
    raise ValueError(f"Unknown RPCRequest Type: {request}")
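The effect of the missing prefix can be seen in isolation. In this small sketch, `FakeRequest` is a hypothetical stand-in for an unrecognized request object.

```python
class FakeRequest:
    """Hypothetical stand-in for an unrecognized RPC request object."""

    def __repr__(self):
        return "FakeRequest()"


request = FakeRequest()

# Without the f prefix, the braces are ordinary characters in the string.
literal_message = "Unknown RPCRequest Type: {request}"

# With the f prefix, the expression inside the braces is evaluated and formatted.
formatted_message = f"Unknown RPCRequest Type: {request}"
```

Here `literal_message` still contains the text `{request}`, while `formatted_message` ends with `FakeRequest()`.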
 def _handle_health_request(self):
     if self._errored_with is not None:
         self._send_unhealthy(self._errored_with)
-
-    # Check for life of the main loop
-    elif time.time() - self._last_alive_time > self.last_alive_threshold:
-        self._send_unhealthy(RuntimeError("Engine loop has died"))
-
-    else:
-        # Otherwise- check health of the engine
-        # self.engine.check_health() raises on unhealthy
-        try:
-            self.engine.check_health()
-            self._send_healthy()
-        except Exception as e:
-            self._set_errored(e)
-            self._send_unhealthy(e)
+
+    # Raises error if unhealthy.
+    self.engine.check_health()
+    self._send_healthy()
Missing early return after sending unhealthy status.
If self._errored_with is not None, the method sends an unhealthy status but then continues to call self.engine.check_health(). This could either succeed (misleading) or raise an exception that overwrites the original error context. Add a return after _send_unhealthy.
Proposed fix
 def _handle_health_request(self):
     if self._errored_with is not None:
         self._send_unhealthy(self._errored_with)
+        return
     # Raises error if unhealthy.
     self.engine.check_health()
     self._send_healthy()
📝 Committable suggestion
def _handle_health_request(self):
    if self._errored_with is not None:
        self._send_unhealthy(self._errored_with)
        return
    # Raises error if unhealthy.
    self.engine.check_health()
    self._send_healthy()
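To show why the early return matters, here is a simplified sketch of the two variants. `HealthHandler` is a hypothetical stand-in: it records statuses in a list instead of pushing them on a socket, and its `_check_health` is assumed to pass, which is the "misleading success" case from the comment.

```python
class HealthHandler:
    """Hypothetical, simplified stand-in for the engine-side handler."""

    def __init__(self, errored_with=None):
        self._errored_with = errored_with
        self.sent = []  # records what would be pushed on the health socket

    def _check_health(self):
        # Stand-in for engine.check_health(); assumed to pass in this sketch.
        return None

    def handle_without_return(self):
        if self._errored_with is not None:
            self.sent.append(("unhealthy", self._errored_with))
        self._check_health()
        self.sent.append(("healthy", None))

    def handle_with_return(self):
        if self._errored_with is not None:
            self.sent.append(("unhealthy", self._errored_with))
            return  # stop here so the stored error is the only status sent
        self._check_health()
        self.sent.append(("healthy", None))


error = RuntimeError("original failure")

buggy = HealthHandler(errored_with=error)
buggy.handle_without_return()
buggy_statuses = [status for status, _ in buggy.sent]

fixed = HealthHandler(errored_with=error)
fixed.handle_with_return()
fixed_statuses = [status for status, _ in fixed.sent]
```

Without the return, an errored engine reports unhealthy and then healthy for the same probe; with it, only the stored error is reported.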
PR_002
Summary by CodeRabbit
Refactor
Tests