Runs lost if fetch:credential fails #1177

@josephjclark

Seeing a case in prod where credentials fail to load for some reason.

Logs like this:

[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: fetch:credential :: ERR: [fetch:credential] timeout
[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: fetch:credential :: ERR: [fetch:credential] timeout
[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: fetch:credential :: ERR: [fetch:credential] timeout
[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: fetch:credential :: ERR: [fetch:credential] timeout
[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: fetch:credential :: ERR: [fetch:credential] timeout
[SRV] ✘ 3355001b-3ed7-4830-b6e7-42312b77f405 :: run:log :: ERR: [run:log] timeout
INFO 2025-12-15T19:40:01.328077973Z [resource.labels.containerName: global-web] [warning] Detected lost run with reason LostAfterClaim:
[SRV] ❯ Critical error in channel run:3355001b-3ed7-4830-b6e7-42312b77f405 [
INFO 2025-12-16T00:42:14.683553324Z [resource.labels.containerName: global-web] [info] REFUSED JOIN run:3355001b-3ed7-4830-b6e7-42312b77f405 in 7ms

I think the lost run is probably caused by a low grace period on the plan. But if the credential fails to load, this run should really return with an exception. So why didn't it?
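To make the expected behaviour concrete, here is a rough sketch of what I'd expect to happen: after a bounded number of `fetch:credential` timeouts, the run resolves to an exception outcome instead of going quiet until it is marked lost. Everything here is hypothetical and for illustration only (`RunOutcome`, `withTimeout`, `loadCredential`, `toRunException`, and the `CredentialLoadError` label are assumptions, not the worker's actual code or API).

```typescript
// Hypothetical sketch: none of these names come from the actual worker code.

type RunOutcome =
  | { status: 'ok'; credential: unknown }
  | { status: 'exception'; errorType: string; message: string };

// Convert any thrown value into an exception outcome the run can report.
function toRunException(err: unknown): RunOutcome {
  const message = err instanceof Error ? err.message : String(err);
  // 'CredentialLoadError' is an assumed label, chosen for this example only.
  return { status: 'exception', errorType: 'CredentialLoadError', message };
}

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`[${label}] timeout`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}

// Try the fetch a bounded number of times. This never throws: the caller
// always gets an outcome it can send back, even when every attempt times out.
async function loadCredential(
  fetchCredential: () => Promise<unknown>,
  opts: { attempts: number; timeoutMs: number }
): Promise<RunOutcome> {
  let lastError: unknown = new Error('no attempts made');
  for (let i = 0; i < opts.attempts; i++) {
    try {
      const credential = await withTimeout(
        fetchCredential(),
        opts.timeoutMs,
        'fetch:credential'
      );
      return { status: 'ok', credential };
    } catch (err) {
      lastError = err;
    }
  }
  return toRunException(lastError);
}
```

With something along these lines, a credential server that never answers would still produce a terminal outcome for the run well inside the grace period, provided `attempts * timeoutMs` is kept below it.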

Example run: GCP
