@josephjclark
Collaborator

This PR attempts to fix an issue where

I can't reproduce the problem at all, and I don't really have much information to go on from logs.

I've done two things:

  • Adjusted where we catch errors in event processing. This may stop the uncaught exception appearing in the logs (it also may not!)
  • Added a timeout to all messages, based on the server message timeout plus a grace period. This says: give the message N seconds to process, and if it doesn't finish in time, go and process another one.

The timeout thing would stop the zombie effect - at least it'll try to process all messages now.
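
To illustrate the idea (not the actual change in this PR), here is a minimal sketch of a per-message timeout. All names (`settleWithin`, `processAll`, `serverTimeoutMs`, `graceMs`) are hypothetical and not taken from the worker code. It assumes messages are handled one at a time by an async handler, and that a message which neither resolves nor rejects should be abandoned after the server timeout plus a grace period.

```typescript
// All names here are hypothetical; none are taken from the worker codebase.
type Outcome = "done" | "failed" | "timeout";

// Settle with "timeout" if `work` has not finished within `ms` milliseconds.
// A rejection from `work` is reported as "failed" rather than thrown, so one
// bad message cannot surface as an uncaught error in the processing loop.
function settleWithin(work: Promise<unknown>, ms: number): Promise<Outcome> {
  return new Promise<Outcome>((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), ms);
    work.then(
      () => { clearTimeout(timer); resolve("done"); },
      () => { clearTimeout(timer); resolve("failed"); }
    );
  });
}

// Process messages in order. A message that hangs is given up on once
// (server timeout + grace period) has elapsed, and the loop moves on.
async function processAll<T>(
  messages: T[],
  handle: (message: T) => Promise<void>,
  serverTimeoutMs: number,
  graceMs: number
): Promise<Outcome[]> {
  const limit = serverTimeoutMs + graceMs;
  const outcomes: Outcome[] = [];
  for (const message of messages) {
    outcomes.push(await settleWithin(handle(message), limit));
  }
  return outcomes;
}
```

Note that the timed-out handler is not cancelled in this sketch; it is simply no longer awaited, which is enough to keep the queue moving.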

Note that when/if we restore retries, we need to give each retry its own timeout (else retried events will just time themselves out).
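
As a rough sketch of what "each retry gets its own timeout" could look like if retries come back: the helper names (`rejectAfter`, `sendWithRetries`) and the retry policy are assumptions for illustration, not the worker's API. The key point is that the timer is started inside the loop, once per attempt, rather than once around the whole retry sequence.

```typescript
// Hypothetical helper names; not taken from the worker codebase.

// Reject if `work` has not settled within `ms` milliseconds.
function rejectAfter<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// Try `attempt` up to `maxAttempts` times. Each attempt is wrapped in a
// fresh timeout, so time spent on earlier attempts does not count against
// later ones.
async function sendWithRetries<T>(
  attempt: () => Promise<T>,
  maxAttempts: number,
  timeoutMs: number
): Promise<T> {
  let lastError: unknown = new Error("no attempts were made");
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await rejectAfter(attempt(), timeoutMs);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

If the timeout wrapped the whole `sendWithRetries` call instead, a sequence of slow failures could use up the budget before the final attempt had a chance to succeed.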

Fixes #1178

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

@github-project-automation github-project-automation bot moved this to New Issues in v2 Dec 18, 2025
@josephjclark josephjclark marked this pull request as ready for review December 18, 2025 16:19
@josephjclark josephjclark merged commit 8ad490f into main Dec 18, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from New Issues to Done in v2 Dec 18, 2025

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Worker: uncaught exception when logs failed
