Skip to content

Conversation

@lakhmanisahil
Copy link

@lakhmanisahil lakhmanisahil commented Jan 5, 2026

Fixes #201

Summary

This PR implements a 15-minute timeout for builds to prevent them from remaining indefinitely in the RUNNING state. It introduces a new TIMED_OUT build state and enforces timeout handling independently in the builder and progress updater, following maintainer guidance.


Changes Made

1. Core Timeout Infrastructure

  • Added BuildState.TIMED_OUT to represent builds terminated due to timeout
  • Introduced a shared timeout constant (BUILD_TIMEOUT_SECONDS = 900) in common/config.py
  • Added time_started_running to BuildInfo to track when a build actually begins execution
  • Added error_message to BuildInfo to store timeout and failure details

2. Builder (builder/builder.py)

  • Enforced timeout on all build subprocess steps (configure, clean, build) using timeout=BUILD_TIMEOUT_SECONDS
  • Added handling for subprocess.TimeoutExpired:
    • Terminates the running process
    • Aborts the build cleanly
    • Records a clear timeout error message
  • Added __check_if_timed_out() to detect if the progress updater has already marked a build as TIMED_OUT
  • Builder checks timeout state between build steps for early termination
  • Skips archive generation for timed-out builds
  • Ensures cleanup runs even when a build times out or fails

3. Build Manager (build_manager/manager.py)

  • Added mark_build_timed_out() to safely transition builds to TIMED_OUT
  • Prevents overriding terminal states (SUCCESS, FAILURE)
  • Automatically records time_started_running when a build enters the RUNNING state
  • Extended BuildInfo.to_dict() using getattr() to maintain backward compatibility with existing Redis entries

4. Progress Updater (build_manager/progress_updater.py)

  • Added __check_build_timeout() invoked from the existing periodic update loop
  • Timeout is measured from time_started_running (not time_created) for accuracy
  • Handles edge cases where time_started_running is not yet available
  • Marks builds as TIMED_OUT once the timeout threshold is exceeded
  • Added handling for TIMED_OUT in state and progress update paths
  • Includes clear logging for timeout detection

5. Web API (web/app.py)

  • Added /api/builds/<build_id>/status endpoint for lightweight polling
  • Returns build state, progress, and timeout error information
  • Improved robustness of get_all_builds() by skipping and logging individual build errors instead of failing the entire request
  • Updated API usage to align with recent changes (remote_name / commit_ref)

Implementation Details

As suggested by the maintainer, timeout handling is implemented independently in two places:

  1. Builder subprocess timeout
    Each subprocess call is bounded by BUILD_TIMEOUT_SECONDS. If exceeded, the process is terminated and the build is aborted.

  2. Progress updater timeout detection
    The existing periodic task checks whether any RUNNING build has exceeded the timeout since entering the running state and marks it as TIMED_OUT.

Both mechanisms coordinate via BuildState.TIMED_OUT:

  • Either mechanism may trigger the timeout
  • Builder exits early if the progress updater has already marked the build as timed out
  • No direct coupling between the two workflows

This preserves separation of responsibilities while keeping behavior consistent.


Testing

  • Built and tested locally using Docker
  • Verified subprocess timeouts terminate long-running builds
  • Verified progress updater correctly marks timed-out builds
  • Confirmed both mechanisms operate independently
  • Verified timed-out builds do not generate archives
  • Confirmed timeout errors are logged and exposed via API
  • No circular imports or startup errors observed

Backward Compatibility

  • New fields (time_started_running, error_message) are accessed via getattr()
  • No schema or Redis migration required
  • Existing builds continue to function normally

Future Work (not included)

  • Make timeout configurable via environment variable
  • UI support for retrying timed-out builds

Supporting Images (checked for 2 minutes)

Screenshot from 2026-01-05 22-47-47 Screenshot from 2026-01-05 22-44-02 Screenshot from 2026-01-05 22-43-46

@lakhmanisahil
Copy link
Author

Hello @shiv-tyagi,
I’ve reviewed all the suggested changes. I’ll address them and push an update shortly.

@lakhmanisahil lakhmanisahil force-pushed the feat/add-build-timeout branch from 9c6c662 to 02ee09e Compare January 16, 2026 18:22
@lakhmanisahil
Copy link
Author

lakhmanisahil commented Jan 16, 2026

Hello @shiv-tyagi,

I’ve updated the PR as per your review:

  • Introduced CBS_BUILD_TIMEOUT_SEC (default 900s / 15 min) iand removed common/config.py.
  • Used the env var consistently in progress_updater.py and builder.py.
  • time_started is now set only on PENDING → RUNNING in the progress updater and persisted correctly.
  • Timeout detection lives entirely in the progress updater; the builder only executes subprocesses.
  • Updated docker-compose.yml and added CBS_BUILD_TIMEOUT_SEC=120 to .env.sample for testing.
  • Added optional build_info to update_build_progress_state() so the progress updater can persist time_started directly, avoiding Redis refetches and ensuring correct timestamp storage.

Copy link
Member

@shiv-tyagi shiv-tyagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is taking a good shape now. Please address the comments I have posted. One more thing, in the web/static/js/index.js file, there is a piece of code which decides the colour of labels for the build states. Put the timed out state in red along with FAILURE and ERROR states.

Copy link
Member

@shiv-tyagi shiv-tyagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job @lakhmanisahil! Can you confirm if you have tested this?

@shiv-tyagi
Copy link
Member

One more thing, your commits need to be organised. You should have just four commits,

  • "web: xxx"
  • "builder: yyy"
  • "build_manager: zzz"
  • "docker-compose.yml: abc"

Where xxx, yyy, zzz and abc are descriptions about what your commit does to those modules.

https://ardupilot.org/dev/docs/git-interactive-rebase.html
Use this wiki for help.

@lakhmanisahil
Copy link
Author

Good job @lakhmanisahil! Can you confirm if you have tested this?

Yes, i have tested it with different builds and timeout value.

@lakhmanisahil
Copy link
Author

One more thing, your commits need to be organised. You should have just four commits,

* "web: xxx"

* "builder: yyy"

* "build_manager: zzz"

* "docker-compose.yml: abc"

Where xxx, yyy, zzz and abc are descriptions about what your commit does to those modules.

https://ardupilot.org/dev/docs/git-interactive-rebase.html Use this wiki for help.

Thank You @shiv-tyagi , i will look into and update it soon.

@lakhmanisahil lakhmanisahil force-pushed the feat/add-build-timeout branch 2 times, most recently from 7a2e64b to d10ba58 Compare January 20, 2026 12:27
@lakhmanisahil lakhmanisahil force-pushed the feat/add-build-timeout branch from 2cbae25 to 02b8f7b Compare January 20, 2026 21:07
@lakhmanisahil
Copy link
Author

Hi @shiv-tyagi,

I've cleaned up the PR history. It now contains exactly 4 commits:

  1. web: add red color for TIMED_OUT state in UI
  2. builder: add timeout handling using CBS_BUILD_TIMEOUT_SEC
  3. build_manager: track time_started and handle timeout state
  4. docker-compose: add CBS_BUILD_TIMEOUT_SEC env var

Changes summary:

  • Added TIMED_OUT state to BuildState enum
  • Added time_started tracking in BuildInfo
  • Created update_build_time_started() method to persist start time to Redis
  • Implemented timeout detection in __refresh_running_build_state()
  • Added timeout parameter to subprocess calls in builder
  • Display TIMED_OUT builds in red (danger) in web UI
  • Added CBS_BUILD_TIMEOUT_SEC environment variable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce a Timed-Out state for builds

2 participants