Airflow Run Only One Instance Of a DAG At a Time
Imagine you have a DAG scheduled to run every 5 minutes:
with DAG(dag_id='sample_dag', schedule='*/5 * * * *', ...) as dag:
...
Once in a while, a DAG Run may take more than 5 minutes causing the next DAG Run of the same DAG to trigger in parallel, i.e., you’ll end up having two DAG runs of the same DAG running parallelly/concurrently at the same time.
But you want to ensure that only one instance of the DAG can run at any given point in time. Not more.
Use max_active_runs
DAG parameter to achieve that:
with DAG(dag_id='sample_dag', schedule='*/5 * * * *', max_active_runs=1, ...) as dag:
...
# or when using the @dag decorator
@dag(
schedule='*/5 * * * *',
max_active_runs=1,
)
def sample_dag():
...
max_active_runs=1
will ensure that Airflow schedules any subsequent DAG Run only after the end of the previous running instance in cases of overlaps.