Managing dependencies between data pipelines in Apache Airflow & Prefect

A simple approach to managing dependencies between your workflows

Anna Geller
Towards Data Science
7 min read · Sep 4, 2020

Photo by Kelly Sikkema on Unsplash

If you have ever built data pipelines for co-dependent business processes, you may have noticed that packing all of your company’s business logic into a single workflow does not work well and quickly turns into a maintenance nightmare. Many workflow scheduling systems let us manage dependencies within a single data pipeline, but they do not help us manage dependencies between workflows.

A natural way to resolve this problem is to split a large pipeline into many smaller ones and coordinate the dependencies between them in a parent-child relationship. There are many possible ways to do this, and I want to share one simple approach that has worked well for me. I hope it helps you manage dependencies between your data pipelines.
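To make the parent-child idea concrete, here is a minimal sketch in plain Python. It is not tied to Airflow or Prefect, and it is not the approach described later in this article. The workflow names (`staging`, `sales_mart`, `reporting`) and the `run_parent` helper are hypothetical, chosen only for illustration. It assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical child pipelines; each function stands in for a separate workflow.
def staging():
    return "staging done"


def sales_mart():
    return "sales_mart done"


def reporting():
    return "reporting done"


CHILDREN = {"staging": staging, "sales_mart": sales_mart, "reporting": reporting}

# Each key maps to the set of workflows that must finish before it may start.
DEPENDENCIES = {
    "staging": set(),
    "sales_mart": {"staging"},
    "reporting": {"sales_mart"},
}


def run_parent(children, dependencies):
    """Run each child workflow once, only after its upstream workflows have run."""
    order = list(TopologicalSorter(dependencies).static_order())
    results = [children[name]() for name in order]
    return order, results


order, results = run_parent(CHILDREN, DEPENDENCIES)
print(order)
```

The parent holds only the dependency map, so each child pipeline can be developed and maintained on its own. A real scheduler would also need retries, failure handling, and scheduling, which this sketch leaves out.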

How the Airflow community tried to tackle this problem

The book about Apache Airflow [1], written by two data engineers from GoDataDriven, includes a chapter on managing dependencies. This is how they summarized the issue:

“Airflow manages dependencies between tasks within one single DAG, however it does not

Responses (1)

Hey Anna Anisienia, good article. But just to pique your interest and that of your readers: we just voted on quite a different concept that is coming in Airflow 2.0, which might be the best of both worlds. Since DAGs are Python code and can be…