
Airflow 2.7 Is Now Out

Here are the most important feature updates that will make your life easier and save you time

Image generated via DALL-E-2 using prompt "data flowing from external system into destination system as a system design graph, digital art"

Apache Airflow 2.7.0 is finally out, and it ships a number of notable features. The new version consists of 40 new features, 53 bug fixes, 49 improvements and 15 documentation updates.

The main focus of this release has been security but at the same time, many exciting non-security related features have also been made available.


The new Cluster Activity UI

As of Airflow 2.7.0, the top-level menu of the Airflow UI includes a new tab called Cluster Activity.

New Cluster Activity tab in main Airflow menu - Source: Author

The new Cluster Activity UI gives an overview of the overall cluster state, including component health (for MetaDatabase, Scheduler, Triggerer and DAG processor) as well as details about DAG/Task run states and DAG run types.

The new Cluster Activity UI is shipped as part of the new Airflow 2.7.0 release - Source: Author

See when the source code was last parsed

In the past, it really annoyed me that I couldn’t easily tell whether the changes I had made to the source code of a particular DAG had actually been parsed. I would usually have to refresh the page (multiple times) and hunt for the specific lines I had changed to make sure they had been picked up before re-triggering the DAG.

The new Airflow version introduces a Parsed at field within the Code tab of a DAG, showing the timestamp when the DAG’s source code was last parsed.

The new code tab includes a timestamp indicating when the DAG source code was last parsed - Source: Author

A simple yet useful addition!


Keyboard shortcut support

The Airflow Grid view now also supports keyboard shortcuts. Once you open the Grid view of a DAG, you will notice a note just underneath the filtering section, indicating that you can bring up the list of supported shortcuts by pressing shift + /.

The shift + / shortcut lists all available shortcuts you can use to interact with Airflow DAGs and Tasks - Source: Author

The actions you can take via keyboard shortcuts include clearing a DAG run and marking it as success or failed, among others.

A full list of shortcuts for interacting with Airflow DAGs and Tasks - Source: Author

Graph and Gantt views all in one place

The Gantt and Graph views have been moved into the Grid view of DAGs, so it’s easier to navigate between task details, graphs, logs and Gantt views, especially when working with more complex DAGs.

Gantt and Graph views can now be found under Grid view - Source: Author

Note that the old Graph view has been removed and the new Graph view is now the default.


Setup and Teardown tasks

When designing data pipelines, it’s common to create a resource, use it to perform some work and then tear it down. The new Airflow release introduces setup and teardown tasks that support this pattern directly.

Let’s suppose we have a DAG that creates a compute resource, runs a query and finally tears down the previously created resource. Normally, we would have to create three tasks and specify the dependencies as follows:

create_resource >> run_query >> delete_resource

With the new features, we can now easily mark the first and last tasks as setup and teardown respectively:

create_resource.as_setup() >> run_query >> delete_resource.as_teardown()
create_resource >> delete_resource

# equivalent notation
create_resource >> run_query >> delete_resource.as_teardown(setups=create_resource)

Notes:

  • When the run_query task is cleared, both the create_resource (setup) and delete_resource (teardown) tasks will also be cleared
  • When run_query fails, the teardown task delete_resource will still run
  • The states of create_resource and delete_resource won’t be used to determine the success of a DAG run. This means that the DAG run will be marked as success as long as the run_query task is successful
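
To build some intuition for the second note, here is a plain-Python analogy rather than Airflow code: a context manager that always releases its resource, even when the work in the middle fails. The names compute_resource and run_query below are hypothetical stand-ins for the three tasks, and the sketch only mirrors the "teardown still runs" behaviour, not how Airflow schedules or clears tasks.

```python
from contextlib import contextmanager

events = []  # records the order in which the steps run


@contextmanager
def compute_resource():
    """Hypothetical stand-in for the create_resource / delete_resource pair."""
    events.append("create_resource")  # plays the role of the setup task
    try:
        yield "resource-handle"
    finally:
        # Plays the role of the teardown task: runs even if the work fails.
        events.append("delete_resource")


def run_query(resource, fail=False):
    """Hypothetical stand-in for the run_query work task."""
    events.append("run_query")
    if fail:
        raise RuntimeError("query failed")


try:
    with compute_resource() as resource:
        run_query(resource, fail=True)
except RuntimeError:
    pass  # the work failed, but the resource was still cleaned up

print(events)
```

Setup and teardown tasks give a DAG the same guarantee at the task level, so the clean-up no longer has to live inside the task that does the work.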

Drops support for end-of-life Python 3.7

Additionally, Airflow 2.7.0 has dropped its support for the end-of-life Python 3.7. In order to make use of Airflow 2.7.0, you need to have one of the following Python versions:

  • 3.8
  • 3.9
  • 3.10
  • 3.11

Note that Python 3.7 is no longer supported by the Python community. If you are still using it (even outside the context of Airflow), make sure to upgrade to a more recent version.
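
If you want to check an environment before upgrading, a small guard like the one below may help. It is only a sketch: the set of versions is copied from the list above, and the helper name is_supported is made up for illustration.

```python
import sys

# Python versions supported by Airflow 2.7.0, as listed above.
SUPPORTED_MINORS = {(3, 8), (3, 9), (3, 10), (3, 11)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor, ...) tuple is on the supported list."""
    return tuple(version_info[:2]) in SUPPORTED_MINORS


if not is_supported():
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor} is not on the Airflow 2.7.0 list")
```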


The airflow db migrate command

The db init and db upgrade commands are now deprecated. Instead, you should use the airflow db migrate command to create or upgrade the Airflow database.

Likewise, the load_default_connections configuration option is also deprecated. In order to create default connections, you need to run the airflow connections create-default-connections command after running airflow db migrate.
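
As a rough sketch of how these two steps could be scripted, the snippet below builds the commands in the right order and runs them with subprocess. The helper names migration_commands and run_all are hypothetical, and it assumes the airflow CLI is installed and on your PATH.

```python
import subprocess


def migration_commands(create_default_connections=False):
    """Return the CLI calls, in order, as argument lists."""
    commands = [["airflow", "db", "migrate"]]
    if create_default_connections:
        # Default connections are created only after the database is migrated.
        commands.append(["airflow", "connections", "create-default-connections"])
    return commands


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(migration_commands(create_default_connections=True)) would then migrate the database first and create the default connections afterwards.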


Handful of security updates

As mentioned already, the main focus of the latest Airflow release was to make it a bit more secure. Here are some security-related changes:

  • The test connection functionality is now disabled by default. If you still need it, you can enable it either through the test_connection flag in the core section of airflow.cfg or by setting the environment variable AIRFLOW__CORE__TEST_CONNECTION
  • The /dags/*/dagRuns/*/taskInstances/*/xcomEntries/* API endpoint now disables the deserialize option by default
  • SMTP SSL connections now use Python’s default SSL context
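
The environment variable in the first bullet follows Airflow's AIRFLOW__{SECTION}__{KEY} naming convention for configuration overrides. The sketch below shows how that name is derived. The helper airflow_env_var is hypothetical, and the value "Enabled" is an assumption based on my reading of the release notes, which list Disabled, Enabled and Hidden as accepted values.

```python
import os


def airflow_env_var(section, key):
    """Build the AIRFLOW__{SECTION}__{KEY} name for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("core", "test_connection")

# Assumed value; this only affects the current process and its children.
os.environ[name] = "Enabled"
print(name)
```

In practice you would set the variable wherever the Airflow webserver is started (for example in your Docker environment), not from inside a Python script.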

Giving Airflow 2.7 a go

If you would like to test the new features out, I’d recommend running Airflow via Docker on your local machine. You can find a step-by-step guide that can help you get it up and running in less than a minute in the link below.

How to Run Airflow Locally With Docker


Final Thoughts

Keeping up with the latest Airflow versions ensures that you get access to the latest features as well as to any new security patches, so that you don’t lose sleep at night.

The very latest Airflow release comes with tons of new features, improvements, bug fixes, documentation updates and security patches. It is important to test it out and if possible, upgrade your production instances in order to make the most out of it while also enhancing security.

In this article, we covered only a small subset of the changes introduced in Airflow 2.7.0. You can see the full list of changes shipped as part of the latest release in the release notes.

