Member

@chrisburr chrisburr commented Dec 2, 2025

This PR contains the requirements we have for DiracX tasks and is intentionally separate from the implementation. @chaen and I think this can be made to cover all of the use cases DiracX will have, but we're open to feedback in case we missed something.


read-the-docs-community bot commented Dec 2, 2025

Documentation build overview

📚 diracx | 🛠️ Build #30541404 | 📁 Comparing fb198da against latest (47c2b29)


🔍 Preview build

Show files changed (117 files in total): 📝 116 modified | ➕ 1 added | ➖ 0 deleted
File Status
404.html 📝 modified
index.html 📝 modified
REFERENCE/index.html 📝 modified
RUN_PROD/index.html 📝 modified
SECURITY/index.html 📝 modified
SSO/index.html 📝 modified
admin/index.html 📝 modified
dev/index.html 📝 modified
roadmap/index.html 📝 modified
user/index.html 📝 modified
admin/deploy_instance/index.html 📝 modified
admin/explanations/index.html 📝 modified
admin/how-to/index.html 📝 modified
admin/manage_dependencies/index.html 📝 modified
admin/manage_release/index.html 📝 modified
admin/reference/index.html 📝 modified
admin/tutorials/index.html 📝 modified
dev/explanations/index.html 📝 modified
dev/how-to/index.html 📝 modified
dev/manage_extension/index.html 📝 modified
dev/reference/index.html 📝 modified
dev/setup_environment/index.html 📝 modified
dev/tutorials/index.html 📝 modified
dev/web-arch/index.html 📝 modified
developer/contribute/index.html 📝 modified
developer/manage_extension/index.html 📝 modified
developer/setup_environment/index.html 📝 modified
user/explanations/index.html 📝 modified
user/how-to/index.html 📝 modified
user/reference/index.html 📝 modified
user/tutorials/index.html 📝 modified
user/web/index.html 📝 modified
admin/explanations/auth-with-diracx/index.html 📝 modified
admin/explanations/auth-with-external/index.html 📝 modified
admin/explanations/chart-structure/index.html 📝 modified
admin/explanations/configuration/index.html 📝 modified
admin/explanations/database-management/index.html 📝 modified
admin/explanations/opentelemetry/index.html 📝 modified
admin/explanations/sandbox-store/index.html 📝 modified
admin/explanations/user-management/index.html 📝 modified
admin/how-to/debugging/index.html 📝 modified
admin/how-to/install/index.html 📝 modified
admin/how-to/rotate-a-secret/index.html 📝 modified
admin/how-to/upgrading/index.html 📝 modified
admin/reference/env-variables/index.html 📝 modified
admin/reference/security_model/index.html 📝 modified
admin/reference/settings-and-preferences/index.html 📝 modified
admin/reference/values/index.html 📝 modified
admin/tutorials/authentication/index.html 📝 modified
admin/tutorials/run_locally/index.html 📝 modified
dev/explanations/components/index.html 📝 modified
dev/explanations/designing-functionality/index.html 📝 modified
dev/explanations/extensions/index.html 📝 modified
dev/explanations/run_demo/index.html 📝 modified
dev/explanations/tasks-architecture/index.html ➕ added
dev/explanations/testing/index.html 📝 modified
dev/how-to/add-a-cli-command/index.html 📝 modified
dev/how-to/add-a-db/index.html 📝 modified
dev/how-to/add-a-route/index.html 📝 modified
dev/how-to/add-a-setting/index.html 📝 modified
dev/how-to/add-a-task/index.html 📝 modified
dev/how-to/add-a-test/index.html 📝 modified
dev/how-to/add-functionality/index.html 📝 modified
dev/how-to/client-customization/index.html 📝 modified
dev/how-to/client-extension/index.html 📝 modified
dev/how-to/client-generation/index.html 📝 modified
dev/how-to/contribute/index.html 📝 modified
dev/how-to/create_application/index.html 📝 modified
dev/how-to/develop-legacy-dirac/index.html 📝 modified
dev/how-to/extend-diracx/index.html 📝 modified
dev/how-to/use-the-demo/index.html 📝 modified
dev/reference/application-state/index.html 📝 modified
dev/reference/client-metapathfinder/index.html 📝 modified
dev/reference/coding-conventions/index.html 📝 modified
dev/reference/configuration/index.html 📝 modified
dev/reference/db-transaction-model/index.html 📝 modified
dev/reference/dependency-injection/index.html 📝 modified
dev/reference/entrypoints/index.html 📝 modified
dev/reference/env-variables/index.html 📝 modified
dev/reference/pixi-tasks/index.html 📝 modified
dev/reference/security-policies/index.html 📝 modified
dev/reference/security-properties/index.html 📝 modified
dev/reference/test-recipes/index.html 📝 modified
dev/reference/writing-tests/index.html 📝 modified
dev/tutorials/advanced-tutorial/index.html 📝 modified
dev/tutorials/develop-web/index.html 📝 modified
dev/tutorials/getting-started/index.html 📝 modified
dev/tutorials/larger-developments/index.html 📝 modified
dev/tutorials/making-changes/index.html 📝 modified
dev/tutorials/play-with-auth/index.html 📝 modified
dev/tutorials/run-locally/index.html 📝 modified
dev/tutorials/write-docs/index.html 📝 modified
user/reference/client-configuration/index.html 📝 modified
user/reference/known-installations/index.html 📝 modified
user/reference/programmatic-usage/index.html 📝 modified
user/tutorials/getting-started/index.html 📝 modified
user/web/list_and_share_applications/index.html 📝 modified
user/web/login_out/index.html 📝 modified
user/web/monitor_jobs/index.html 📝 modified
admin/how-to/install/connect/index.html 📝 modified
admin/how-to/install/convert-cs/index.html 📝 modified
admin/how-to/install/embracing/index.html 📝 modified
admin/how-to/install/install-kubernetes/index.html 📝 modified
admin/how-to/install/installing/index.html 📝 modified
admin/how-to/install/minimal-requirements/index.html 📝 modified
admin/how-to/install/register-a-vo/index.html 📝 modified
admin/how-to/install/register-the-admin-vo/index.html 📝 modified
dev/explanations/components/api/index.html 📝 modified
dev/explanations/components/cli/index.html 📝 modified
dev/explanations/components/client/index.html 📝 modified
dev/explanations/components/db/index.html 📝 modified
dev/explanations/components/routes/index.html 📝 modified
dev/how-to/use-the-demo/swagger/index.html 📝 modified
dev/how-to/use-the-demo/web/index.html 📝 modified
user/reference/programmatic-usage/command-line-interface/index.html 📝 modified
user/reference/programmatic-usage/https-interface/index.html 📝 modified
user/reference/programmatic-usage/python-interface/index.html 📝 modified

@chrisburr force-pushed the docs/tasks-architecture branch from a0c73b0 to fb198da on December 2, 2025 20:17

### Sizing

Currently we foresee the need for three sizes of tasks:
Contributor

Do you plan a task system to be aware of these categories?

Contributor

@aldbr aldbr left a comment

🚀

- **Standalone tasks:** The task performs its work and then returns. For example, synchronising IAM to the DiracX configuration queries IAM and then updates the DiracX CS.
- **Spawning tasks:** The task creates additional tasks. For example, many cron-triggered tasks for transformations will spawn a task per transformation.
- **Batching tasks:** This task groups together the work of many smaller tasks. For example, it is more efficient to clean many jobs at once, so many individual cleaning tasks are batched together for execution.
- **Callback tasks:** These tasks run in response to several parent tasks finishing successfully. For example, removing a user from the configuration store after deleting all of the objects owned by them.
Contributor

I am not entirely sure I understand the Callback tasks concept.
IIUC, you would have 2-3 independent tasks finishing successfully and that would spawn a "child" task, is that correct?

So the main difference between Spawning tasks and Callback tasks is:

  • A Spawning task spawns a new task by itself
  • A Callback task is spawned by the task queue system itself monitoring the tasks.

Member Author

Functionally they're very similar, the difference is that the callback needs to make sure it only runs once.
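
To make the "only runs once" requirement concrete, here is a minimal sketch. `CallbackTracker` and `parent_succeeded` are hypothetical names, not part of DiracX, and the in-memory set is only a stand-in: a real system would presumably need an atomic check in the broker or database so the guarantee holds across workers.

```python
import asyncio


class CallbackTracker:
    """Hypothetical helper: fire a callback once all parents have succeeded."""

    def __init__(self, parent_ids, callback):
        self._pending = set(parent_ids)
        self._callback = callback
        self._fired = False

    async def parent_succeeded(self, parent_id):
        """Record a parent's success; return True if this call ran the callback."""
        self._pending.discard(parent_id)
        # The flag is checked and set with no await in between, so a duplicate
        # report cannot trigger a second run within one event loop.
        if self._pending or self._fired:
            return False
        self._fired = True
        await self._callback()
        return True


async def demo():
    runs = []

    async def remove_user():
        runs.append("removed")

    tracker = CallbackTracker({"delete_jobs", "delete_sandboxes"}, remove_user)
    results = [
        await tracker.parent_succeeded("delete_jobs"),
        await tracker.parent_succeeded("delete_sandboxes"),
        await tracker.parent_succeeded("delete_sandboxes"),  # duplicate report
    ]
    return results, runs
```

In `demo()`, only the second report runs the callback; the duplicate report of the same parent is ignored.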


Occasionally tasks must be scheduled to run at some point in the future, most commonly to retry a task after a delay. For example, if a resource is banned, the corresponding tasks can use an exponential backoff strategy to avoid running excessively.
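
As an illustration of the backoff idea, the sketch below computes when a retry could be scheduled. The function names, the 30 second base delay and the one hour cap are assumptions chosen for the example, not values from the requirements; a production version would probably also add jitter.

```python
from datetime import datetime, timedelta, timezone


def backoff_delay(attempt, base=timedelta(seconds=30), cap=timedelta(hours=1)):
    """Delay before retry number `attempt` (1-based), doubling up to `cap`."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    delay = min(base, cap)
    for _ in range(attempt - 1):
        if delay >= cap:
            break  # already capped, further doubling changes nothing
        delay = min(delay * 2, cap)
    return delay


def next_run_time(attempt, now=None):
    """Timestamp at which the retry should become eligible to run."""
    now = now or datetime.now(timezone.utc)
    return now + backoff_delay(attempt)
```

With these defaults the delays are 30 s, 60 s, 120 s and so on, until they reach the one hour cap.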

## What is a task?
Contributor

Minor suggestion, but you define the triggering mechanisms the same way as the types of task. For instance:

  • scheduled tasks
  • spawning tasks

I would potentially move the What is a task? section defining the type of tasks before the triggering mechanism to define:

  • standalone task
  • spawning task
  • batching task
  • callback task (I wonder if it's a type of task; it seems to be a triggering mechanism like reactive to me, but I am not sure I understand it correctly)

Then I would explain how they can be triggered:

  • cron-based triggering
  • reactive triggering (include callback?)
  • scheduled triggering

- **Medium:** Medium memory, medium CPU. Single task per thread. For example, bulk submitting jobs for a transformation.
- **Large:** Large amounts of memory, multiple CPU cores. Single task per thread. For example, generating reporting data.

Tasks should be designed to be lightweight, especially in terms of memory usage, using techniques such as streaming database responses. The vast majority of tasks should be in the **Small** category. **Large** tasks should be reserved for infrequent, time-insensitive activity (e.g. reporting).
Contributor

I imagine the size of a task is defined statically and not dynamically based on input data for instance?

Member Author

Yes that's the idea 🤞
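
One way a statically declared size could look is sketched below. The `task` decorator, the `TaskSize` enum and the `task_size` attribute are hypothetical, as is defaulting to the small category; the point is only that the size is fixed when the function is defined rather than derived from its input.

```python
import enum


class TaskSize(enum.Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def task(size=TaskSize.SMALL):
    """Hypothetical decorator: attach a fixed size to a task at definition time."""

    def decorator(func):
        func.task_size = size
        return func

    return decorator


@task(size=TaskSize.MEDIUM)
async def submit_transformation_jobs(transformation_id):
    ...  # bulk submission would go here


@task(size=TaskSize.LARGE)
async def generate_report():
    ...  # reporting work would go here
```

A worker could then read `submit_transformation_jobs.task_size` before execution to decide where the task is allowed to run.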

## What is a task?

Tasks are async Python functions which have extremely low overhead, allowing for many tasks to be spawned for even cheap operations.
Tasks can be executed in four different ways:
Contributor

Are these concepts mutually exclusive?
Conceptually:

  • it seems that a standalone task cannot be anything else.
  • A batching task could also be a spawning task (I don't see any use case but we might want to decide whether we allow such a combination).

Member Author

They're not mutually exclusive.
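
If the kinds can be combined, one possible way to model that is a flag enum, as in this sketch. `TaskKind` and `kind_names` are assumptions made for illustration; whether `STANDALONE` should be combinable with the others is left open here, as it is in the thread.

```python
import enum


class TaskKind(enum.Flag):
    STANDALONE = enum.auto()
    SPAWNING = enum.auto()
    BATCHING = enum.auto()
    CALLBACK = enum.auto()


def kind_names(kind):
    """List the individual kinds contained in a (possibly combined) value."""
    return [member.name for member in TaskKind if member in kind]


# A batching task that also spawns further tasks, the combination raised above.
batching_and_spawning = TaskKind.BATCHING | TaskKind.SPAWNING
```

Here `kind_names(batching_and_spawning)` returns `["SPAWNING", "BATCHING"]`, following the order in which the members are defined.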

Comment on lines +61 to +63
- **Realtime:** Tasks which are expected to be executed immediately. If there is ever a backlog of tasks in this queue the available worker pool is likely undersized for the installation. Examples of realtime tasks would be user input in DiracX Web or running the job optimizers.
- **Normal:** Tasks which should generally run immediately but for which there is a less strict latency requirement. If there is regularly a backlog in this queue the available worker pool is likely undersized for the installation. Examples include submitting jobs for transformations or periodically polling external services.
- **Background:** Tasks which have no specific latency requirements. Having a backlog of tasks in response to operational activity is expected and not directly indicative of an issue provided the backlog is not growing without bound. Examples include pending requests for jobs or data management activities from the transformation system.
Contributor

Well here I am sure the priorities are mutually exclusive
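
Since each task carries exactly one priority, a single enum value is enough to represent it. The sketch below assumes strict ordering between the three levels, which the requirements do not state, and uses hypothetical names (`Priority`, `PriorityTaskQueue`); with strict ordering, background work could starve under sustained load, so a real broker might choose differently.

```python
import enum
import heapq
import itertools


class Priority(enum.IntEnum):
    REALTIME = 0
    NORMAL = 1
    BACKGROUND = 2


class PriorityTaskQueue:
    """Hypothetical queue: lower priority value first, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves FIFO order

    def put(self, priority, name):
        heapq.heappush(self._heap, (Priority(priority), next(self._counter), name))

    def pop(self):
        priority, _, name = heapq.heappop(self._heap)
        return priority, name

    def backlog(self, priority):
        """Number of queued tasks at one level, e.g. for monitoring."""
        return sum(1 for p, _, _ in self._heap if p == priority)
```

The `backlog` count is the quantity the descriptions above refer to: a non-zero realtime backlog suggests an undersized worker pool, while a bounded background backlog is expected.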


Several levels of locking can be applied to support the reentrancy requirements of tasks:

- **Task level locking:** Tasks can be configured to prevent simultaneous execution. For example, most cron-style tasks are configured to ensure only a single instance is running at any given time.
Contributor

If a given task tries to acquire a lock that is already held, I guess it follows the "soft failure" mode and is retried later, right?

Member Author

Yes
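
The behaviour confirmed above, where a task that finds its lock already held fails softly and is retried later, could be sketched as follows. `LockHeld`, `TaskLocks` and `run_exclusive` are hypothetical names, and the in-process set is only a stand-in for a lock store shared between workers.

```python
import asyncio
import contextlib


class LockHeld(Exception):
    """Hypothetical soft failure raised when another instance holds the lock."""


class TaskLocks:
    """In-process stand-in for a lock store shared between workers."""

    def __init__(self):
        self._held = set()

    @contextlib.contextmanager
    def acquire(self, key):
        if key in self._held:
            raise LockHeld(key)
        self._held.add(key)
        try:
            yield
        finally:
            self._held.discard(key)  # always release, even if the task fails


async def run_exclusive(locks, key, func):
    """Run `func` under the lock, or report that it should be retried later."""
    try:
        with locks.acquire(key):
            return "done", await func()
    except LockHeld:
        return "retry-later", None


async def demo():
    locks = TaskLocks()

    async def cron_job():
        await asyncio.sleep(0.01)
        return 42

    # Two overlapping runs of the same cron-style task.
    return await asyncio.gather(
        run_exclusive(locks, "sync_iam", cron_job),
        run_exclusive(locks, "sync_iam", cron_job),
    )
```

In `demo()`, the first run completes while the second sees the lock held and is marked for a later retry instead of blocking a worker.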


## Broker

The state of the broker should be ephemeral and recreated with each update. Any persistent state should be stored in the standard MySQL databases used by DiracX. This requirement is imposed to:
Contributor

Very minor suggestion (because it could be another SQL implementation in the future, and this document is focused on definitions/requirements rather than implementation).

Suggested change
The state of the broker should be ephemeral and recreated with each update. Any persistent state should be stored in the standard MySQL databases used by DiracX. This requirement is imposed to:
The state of the broker should be ephemeral and recreated with each update. Any persistent state should be stored in the standard SQL databases used by DiracX. This requirement is imposed to:

- Reduce the complexity of reasoning about updates which may change details of the broker's internal state.
- Improve performance by removing the need to ensure every action is flushed to persistent storage.

Upon first start, the broker is populated with the cron-style tasks as well as any pending reactive tasks that have been persisted in MySQL. The tasks which were persisted are those eligible for the dead letter queue.
Contributor

Same here?

Suggested change
Upon first start, the broker is populated with the cron-style tasks as well as any pending reactive tasks that have been persisted in MySQL. The tasks which were persisted are those eligible for the dead letter queue.
Upon first start, the broker is populated with the cron-style tasks as well as any pending reactive tasks that have been persisted in SQL. The tasks which were persisted are those eligible for the dead letter queue.
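
A rough sketch of that start-up step, under the assumption that the cron-style task definitions and the persisted pending tasks are already available as plain lists (in practice the latter would come from a database query). `Broker` and `rebuild_broker` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical ephemeral broker: holds nothing that cannot be rebuilt."""

    queue: list = field(default_factory=list)


def rebuild_broker(cron_tasks, persisted_pending):
    """Create a fresh broker from cron definitions and persisted reactive tasks."""
    broker = Broker()
    broker.queue.extend(("cron", name) for name in cron_tasks)
    broker.queue.extend(("reactive", task_id) for task_id in persisted_pending)
    return broker
```

Because the broker is rebuilt from scratch each time, an update never has to migrate the previous broker's internal state.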


### Resource utilisation

To ensure the stability of the system, all workers should be configured to enforce memory and CPU limits. If these limits are exceeded, the DiracX task system must be able to detect and report the issue; however, in the case of small workers, such detection may be unreliable due to technical constraints in async environments.
Contributor

Are workers organized into separate pools based on resource capacity?
Or do they have identical resource allocation sized for large tasks?

How do we plan to connect task queues to workers?

3 participants