feat: runner leader election — ctl-api + SDK#410

Draft
RealHarshThakur wants to merge 13 commits into main from ht/leader-sdk


@RealHarshThakur RealHarshThakur commented Feb 19, 2026

Runner Leader Election

Problem

When multiple runners exist in a RunnerGroup, all of them process jobs concurrently. There's no coordination to ensure only one runner handles work at a time, which can cause duplicate processing and unpredictable behavior.

We want leader election to provide:

  • Runner failover.
  • Manual selection of the runner, which is useful for dev and potentially other use cases.

Solution

In this PR, I've focused on the automatic, hands-off experience of picking up the local dev runner.
The control plane (ctl-api) now elects a single leader runner per group. Only the leader processes jobs; all other runners sit in quiet standby.

How it works

Data model

A single leader_runner_id column on runner_groups (nullable FK to runners.id). NULL means no healthy runner is available.
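To make the shape of this column concrete, here is a minimal sketch using an in-memory SQLite database. Only `runner_groups`, `runners`, `leader_runner_id` and `runners.id` come from this PR; the `runner_group_id` column, the `TEXT` id types and the helper names `open_db` and `leader_of` are assumptions for illustration, and the real migration in ctl-api may look different.

```python
import sqlite3
from typing import Optional


def open_db() -> sqlite3.Connection:
    """Create an in-memory schema with a nullable leader FK (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE runner_groups (
            id TEXT PRIMARY KEY,
            -- Nullable FK: NULL means no healthy runner is available.
            leader_runner_id TEXT REFERENCES runners(id)
        );
        CREATE TABLE runners (
            id TEXT PRIMARY KEY,
            -- Assumed back-reference from a runner to its group.
            runner_group_id TEXT NOT NULL REFERENCES runner_groups(id)
        );
        """
    )
    return conn


def leader_of(conn: sqlite3.Connection, group_id: str) -> Optional[str]:
    """Return the group's leader runner id, or None if there is no leader."""
    row = conn.execute(
        "SELECT leader_runner_id FROM runner_groups WHERE id = ?",
        (group_id,),
    ).fetchone()
    return row[0] if row else None
```

In this sketch a newly inserted group has no leader until an election writes `leader_runner_id`, so callers have to treat `None` as "nobody should process jobs right now".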

Taints

We also introduced the concept of "taints" for a smooth dev workflow. When we're working locally, we not only want the local runner to be picked, we also don't want jobs to be picked up by the cloud runner when the local runner restarts. Tainting a runner ensures that doesn't happen.
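The description does not spell out the exact taint semantics, so the following is only a sketch under one assumption: a tainted runner is never considered as a leader candidate (by analogy with Kubernetes taints). The `Runner` fields and the `candidates` helper are hypothetical names, not the ctl-api types.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Runner:
    id: str
    healthy: bool
    tainted: bool = False  # assumed flag: tainted runners are skipped


def candidates(runners: List[Runner]) -> List[Runner]:
    """Runners that may become leader: healthy and not tainted (assumption)."""
    return [r for r in runners if r.healthy and not r.tainted]


# While the local runner restarts it is briefly unhealthy. Because the cloud
# runner is tainted, there is no candidate, so no leader takes over the jobs.
during_restart = [
    Runner(id="cloud", healthy=True, tainted=True),
    Runner(id="local", healthy=False),
]
after_restart = [
    Runner(id="cloud", healthy=True, tainted=True),
    Runner(id="local", healthy=True),
]
```

Under this assumption, `candidates(during_restart)` is empty and `candidates(after_restart)` contains only the local runner, which matches the behaviour described above.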

Election logic

Picks the oldest active runner by created_at (deterministic, avoids churn). There's an event loop that runs every minute to check whether the leader should be re-elected.

The docs are in a separate PR, but it might help to go through them for context on this PR.
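The selection rule can be sketched as a pure function. This is an illustration rather than the ctl-api implementation: the `Runner` dataclass, the `active` flag and the tie-break on `id` are assumptions, and the one-minute loop is only described in a comment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Runner:
    id: str
    created_at: datetime
    active: bool


def elect_leader(runners: Iterable[Runner]) -> Optional[str]:
    """Return the id of the oldest active runner, or None if none is active."""
    active = [r for r in runners if r.active]
    if not active:
        return None
    # Oldest by created_at; the id tie-break is an assumption that keeps the
    # result deterministic when two runners share a timestamp.
    return min(active, key=lambda r: (r.created_at, r.id)).id


# A periodic loop (every minute in this PR) would call elect_leader and only
# write leader_runner_id when the result differs from the stored value.
t_old = datetime(2026, 1, 1, tzinfo=timezone.utc)
t_new = datetime(2026, 1, 2, tzinfo=timezone.utc)
group = [Runner("r-old", t_old, True), Runner("r-new", t_new, True)]
```

Because the oldest active runner always wins, a newly started runner never displaces the current leader, so the leader changes only when it stops being active.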

Implements runner leader election across the control API and SDK layers.

Stack:

@RealHarshThakur RealHarshThakur marked this pull request as draft February 19, 2026 16:46
@github-actions
This PR was marked as stale, and will be closed after 3 more days. Add the #keep-open label to prevent this from being closed.

@github-actions github-actions bot added the stale label Feb 21, 2026