@PeterNerlich

Linkcheck spawns a new thread to find links in new or updated content. This thread uses the same database cursor as the main thread, which can result in a deadlock (#208).
One can abuse the boolean linkcheck.listeners.tests_running to force this work to run synchronously, but that is not feasible when more than a few objects have to be checked.

As proposed in that issue, this PR implements a third path: handing the task to a Celery worker. The worker is an entirely separate process that has its own dedicated database cursor and should not suffer from such deadlocks.
It also means that we cannot simply pass Django's Python model instance for the database entry; instead we reference it by app name, model name, and ID so that it can be submitted through the broker.
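
A minimal sketch of that round trip, not the code from this PR: the helper names `instance_to_ref` and `ref_to_instance` and the injectable `get_model` parameter are assumptions made for illustration. The only Django pieces relied on are `instance._meta.app_label`, `instance._meta.model_name`, `instance.pk`, and `django.apps.apps.get_model()`.

```python
from typing import Any, Callable, Optional, Tuple

# (app_label, model_name, primary key): all values a broker can serialize.
InstanceRef = Tuple[str, str, Any]


def instance_to_ref(instance: Any) -> InstanceRef:
    """Reduce a model instance to primitives that can cross the broker."""
    meta = instance._meta
    return (meta.app_label, meta.model_name, instance.pk)


def ref_to_instance(
    app_label: str,
    model_name: str,
    pk: Any,
    get_model: Optional[Callable[[str, str], Any]] = None,
) -> Any:
    """Fetch the instance again on the worker side.

    `get_model` is a hypothetical hook so the lookup can be replaced in
    tests; by default Django's app registry is used.
    """
    if get_model is None:
        # Imported lazily so this module loads without a configured Django.
        from django.apps import apps

        get_model = apps.get_model
    model = get_model(app_label, model_name)
    return model._default_manager.get(pk=pk)
```

The sender would call `instance_to_ref(obj)` and pass the three values as task arguments; the worker would call `ref_to_instance(...)` before running the actual check.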

Proposed changes in detail:

  • Move the core routines to update data from listeners.py to worker_tasks.py
    • rename instance_pre_delete() → do_instance_pre_delete() for consistency
  • Add a setting LINKCHECK_IN_CELERY
  • Define shared tasks in celery.py accepting the model and id for the instance to be checked
  • When LINKCHECK_IN_CELERY is truthy, import celery.py in listeners.py. Otherwise nothing from Celery is imported, which keeps Celery from becoming a hard dependency of django-linkcheck
  • When LINKCHECK_IN_CELERY is truthy, don't spawn a new thread to run any linkcheck tasks, but submit them to Celery instead
    Since arguments for Celery tasks have to be primitive types (not Python objects representing database entries), we look up the app and model name of the object, send them along with the ID, and on the worker side use them to fetch the object again before handing it to the function that does the actual work
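
The dispatch logic described in the last two points could look roughly like the sketch below. It is not the PR's code: `load_celery_tasks`, `dispatch`, and the module path `linkcheck.celery` are assumed names for illustration. The only Celery API relied on is `Task.delay()`, which sends the given arguments to the broker.

```python
import importlib
import threading
from typing import Any, Callable, Optional, Tuple


def load_celery_tasks(enabled: bool, module_name: str = "linkcheck.celery") -> Optional[Any]:
    """Import the task module only when the setting is truthy.

    `module_name` is an assumed path; when `enabled` is falsy nothing is
    imported, so Celery stays an optional dependency.
    """
    if not enabled:
        return None
    return importlib.import_module(module_name)


def dispatch(
    work: Callable[..., Any],
    args: Tuple[Any, ...],
    celery_task: Optional[Any] = None,
) -> Any:
    """Submit `args` to Celery if a task is given, else run `work` in a thread."""
    if celery_task is not None:
        # The arguments must be primitives, e.g. (app_label, model_name, pk).
        return celery_task.delay(*args)
    thread = threading.Thread(target=work, args=args)
    thread.start()
    return thread
```

With the setting off, `dispatch(check, ref)` keeps the existing threaded behaviour; with it on, the same call with `celery_task=` set to the matching shared task only enqueues the reference.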

@claudep
Contributor

claudep commented Dec 12, 2025

Thanks for the proposal. That's interesting, but to be frank, I think I'd prefer implementing the newly-added task framework in Django 6.0 (https://docs.djangoproject.com/en/6.0/topics/tasks/). I don't know the current integration status between this new framework and the Celery world, but I'm sure that we will soon see working integration code in the Django ecosystem.

Meanwhile, you may keep your code in a fork, and we will surely find inspiration in it when integrating with Django tasks.
