Workflow Basics

Learn about linking data engineering and data analytics tasks into scalable and reliable pipelines.

What's a Workflow?

A workflow is a set of tasks executed in dependency order to accomplish a larger goal: each task runs only after the tasks it depends on have finished. These tutorials focus on workflow orchestration tools for data engineering and analytics, which automate, schedule, and manage data processing pipelines, from simple ETL jobs to complex machine learning tasks.
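
To make the idea of "tasks with dependencies" concrete, here is a minimal sketch in plain Python, using only the standard library (`graphlib`, Python 3.9+). The task names (`extract`, `transform`, `load`, `report`) and the `run_workflow` helper are hypothetical and chosen for illustration; they are not taken from the tutorials or from any of the tools listed below.

```python
from graphlib import TopologicalSorter


def make_task(name):
    """Build a placeholder task that only records that it ran (illustrative)."""
    def task(log):
        log.append(name)
    return task


# Hypothetical tasks of a small ETL-style workflow.
TASKS = {name: make_task(name) for name in ("extract", "transform", "load", "report")}

# Each task maps to the set of tasks that must finish before it can start.
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}


def run_workflow(tasks, dependencies):
    """Run every task once, after all of its upstream tasks have run."""
    log = []
    # static_order() yields task names so that dependencies always come first.
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name](log)
    return log


if __name__ == "__main__":
    print(run_workflow(TASKS, DEPENDENCIES))
```

Here `load` and `report` both depend only on `transform`, so either may run first. Orchestration tools build on this same dependency-graph idea and add the scheduling, retries, and monitoring that this sketch leaves out.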

We distinguish:

  • Workflows in office productivity tools (not covered here)
  • Workflows for DevOps (CI/CD)
  • Workflows for Data Engineering and Data Analytics

Getting Started

Clone this repository and explore the docs:

git clone https://github.com/UVADS/workflow-basics.git
cd workflow-basics

Contents

  • Introduction
    • Orchestration
    • Choosing a tool
    • Monitoring
    • Deployment
    • Persistence
    • Resilience
    • Reproducibility
    • Portability & Sharing
  • Quickstart
    • Airflow
    • Nextflow
    • Snakemake
    • Prefect
    • Dagster
    • Targets
  • Examples
    • Airflow Examples
    • Nextflow Examples
    • Prefect Examples
    • Targets Examples

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Distributed under the MIT License.
