-
Notifications
You must be signed in to change notification settings - Fork 66
Add build timeout handling and TIMED_OUT state #204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add build timeout handling and TIMED_OUT state #204
Conversation
|
Hello @shiv-tyagi, |
Fixes ArduPilot#201. Signed-off-by: Sahil <lakhmanisahil8@gmail.com>
- Add CBS_BUILD_TIMEOUT_SEC environment variable (defaults to 900s/15min) - Use env var in both builder and progress_updater for consistency - Move state transition logic from manager to progress_updater - Fix time_started persistence by passing build_info parameter - Add subprocess timeout protection in builder - Remove common/config.py in favor of environment variable - Add example timeout value (120s) in .env.sample for testing - Update docker-compose.yml to pass timeout env var to containers
9c6c662 to
02ee09e
Compare
|
Hello @shiv-tyagi, I’ve updated the PR as per your review:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is taking a good shape now. Please address the comments I have posted. One more thing, in the web/static/js/index.js file, there is a piece of code which decides the colour of labels for the build states. Put the timed out state in red along with FAILURE and ERROR states.
| new_state (BuildState): The new state to set for the build. | ||
| """ | ||
| build_info = self.get_build_info(build_id=build_id) | ||
| if build_info is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do this. Not in scope of this PR.
| # Set time_started when transitioning to RUNNING | ||
| if current_state != BuildState.RUNNING and new_state == BuildState.RUNNING: | ||
| build_info.time_started = time.time() | ||
| self.logger.info( | ||
| f"Build {build_id} transitioned to RUNNING state at " | ||
| f"{build_info.time_started}" | ||
| ) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this from here and move to __refresh_running_build_state method.
| ) | ||
|
|
||
| # Check for timeout | ||
| if build_info.time_started is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set time_started here if not set.
if buid_info.time_started is None:
set time here
-- check timeout logic ---
| build_id: str, | ||
| new_state: BuildState) -> None: | ||
| new_state: BuildState, | ||
| build_info: BuildInfo=None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't pass this here please. Only do what is really needed in this PR. Your things will still work if you don't make this change.
Tip for future open source contributions, do not touch what is not needed. That makes reviewing harder.
Fixes #201
Summary
This PR implements a 15-minute timeout for builds to prevent them from remaining indefinitely in the
RUNNINGstate. It introduces a newTIMED_OUTbuild state and enforces timeout handling independently in the builder and progress updater, following maintainer guidance.Changes Made
1. Core Timeout Infrastructure
BuildState.TIMED_OUTto represent builds terminated due to timeoutBUILD_TIMEOUT_SECONDS = 900) incommon/config.pytime_started_runningtoBuildInfoto track when a build actually begins executionerror_messagetoBuildInfoto store timeout and failure details2. Builder (
builder/builder.py)configure,clean,build) usingtimeout=BUILD_TIMEOUT_SECONDSsubprocess.TimeoutExpired:__check_if_timed_out()to detect if the progress updater has already marked a build asTIMED_OUT3. Build Manager (
build_manager/manager.py)mark_build_timed_out()to safely transition builds toTIMED_OUTSUCCESS,FAILURE)time_started_runningwhen a build enters theRUNNINGstateBuildInfo.to_dict()usinggetattr()to maintain backward compatibility with existing Redis entries4. Progress Updater (
build_manager/progress_updater.py)__check_build_timeout()invoked from the existing periodic update looptime_started_running(nottime_created) for accuracytime_started_runningis not yet availableTIMED_OUTonce the timeout threshold is exceededTIMED_OUTin state and progress update paths5. Web API (
web/app.py)/api/builds/<build_id>/statusendpoint for lightweight pollingget_all_builds()by skipping and logging individual build errors instead of failing the entire requestremote_name/commit_ref)Implementation Details
As suggested by the maintainer, timeout handling is implemented independently in two places:
Builder subprocess timeout
Each subprocess call is bounded by
BUILD_TIMEOUT_SECONDS. If exceeded, the process is terminated and the build is aborted.Progress updater timeout detection
The existing periodic task checks whether any
RUNNINGbuild has exceeded the timeout since entering the running state and marks it asTIMED_OUT.Both mechanisms coordinate via
BuildState.TIMED_OUT:This preserves separation of responsibilities while keeping behavior consistent.
Testing
Backward Compatibility
time_started_running,error_message) are accessed viagetattr()Future Work (not included)
Supporting Images (checked for 2 minutes)