Worker: uncaught exception when logs failed #1178

@josephjclark

We shouldn't be seeing this. The log should fail and life goes on. I think the uncaught exception might be messing things up inside the worker, and may even be causing a sort of zombie process where the worker just sits there holding capacity.

This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason:

LightningTimeoutError: [run:log] timeout
    at Object.callback (file:///app/packages/ws-worker/dist/start.js:717:16)
    at file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:109:71
    at Array.forEach (<anonymous>)
    at Push.matchReceive (file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:109:54)
    at Object.callback (file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:140:12)
    at Channel.trigger (file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:460:12)
    at Push.trigger (file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:156:18)
    at Timeout.<anonymous> (file:///app/node_modules/.pnpm/phoenix@1.7.10/node_modules/phoenix/priv/static/phoenix.mjs:143:12)
    at listOnTimeout (node:internal/timers:588:17)
    at process.processTimers (node:internal/timers:523:7)
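
For illustration, here is a minimal sketch of the behaviour described above: a log push that times out is caught and reported rather than surfacing as an unhandled rejection. It is not the worker's actual code. `PushLike`, `sendEvent`, `sendLogSafely` and `fakeTimeoutPush` are hypothetical names, and the `LightningTimeoutError` class is a stand-in that only mirrors the name and message seen in the trace. The only library behaviour assumed is that a Phoenix push exposes a chainable `receive(status, callback)`.

```typescript
// Stand-in for the worker's error type; only the name and message format
// are taken from the stack trace above.
class LightningTimeoutError extends Error {
  constructor(event: string) {
    super(`[${event}] timeout`);
    this.name = 'LightningTimeoutError';
  }
}

// Hypothetical minimal shape of a Phoenix-style push.
interface PushLike {
  receive(status: string, cb: (payload: unknown) => void): PushLike;
}

// Wrap a push in a promise that rejects on "error" or "timeout".
function sendEvent(push: PushLike, event: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    push
      .receive('ok', (payload) => resolve(payload))
      .receive('error', (e) => reject(new Error(`[${event}] error: ${String(e)}`)))
      .receive('timeout', () => reject(new LightningTimeoutError(event)));
  });
}

// Send a log event without ever rejecting: failures are handed to
// onFailure and the caller carries on.
async function sendLogSafely(
  push: PushLike,
  event: string,
  onFailure: (err: Error) => void
): Promise<boolean> {
  try {
    await sendEvent(push, event);
    return true;
  } catch (e) {
    onFailure(e as Error);
    return false;
  }
}

// Test double: a push whose "timeout" callback fires on the next tick.
function fakeTimeoutPush(): PushLike {
  const handlers = new Map<string, (payload: unknown) => void>();
  const push: PushLike = {
    receive(status, cb) {
      handlers.set(status, cb);
      return push;
    },
  };
  setTimeout(() => handlers.get('timeout')?.(undefined), 0);
  return push;
}

// Usage: the timeout is reported and nothing is left unhandled.
void sendLogSafely(fakeTimeoutPush(), 'run:log', (err) => {
  console.warn(`log not delivered: ${err.message}`);
});
```

The point of the sketch is that the rejection is consumed at the call site. Whether a failed log should also be retried or counted is a separate decision that this sketch does not address.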

[Image: GCP evidence 1]

Note from the logs that this run seems to hang around for about 4 hours before it finally gives up, long after it was marked as lost.
