Airflow


Overview

Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows.

When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.

Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
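To make the idea of "following the specified dependencies" concrete, here is a minimal sketch in plain Python. It does not use Airflow's API: it relies only on the standard library's graphlib module, and the task names (extract, clean, validate, load) are hypothetical. It shows how a DAG yields a valid execution order, and why a cycle makes scheduling impossible.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "validate": {"extract"},
    "load": {"clean", "validate"},
}


def execution_order(deps):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph has no cycle."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

Here "clean" and "validate" do not depend on each other, so either may come first in the printed order. A scheduler such as Airflow's can run independent tasks like these in parallel on separate workers.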

Highlights

  • Programmatic Workflow Definition - DAGs are defined using Python code, providing flexibility and enabling complex logic and dynamic workflows.

  • Scheduling - Airflow allows you to schedule workflows based on various criteria, such as time intervals (e.g., hourly, daily), specific dates, or external triggers.

  • Console - Airflow provides a web UI to visualize DAGs, monitor task execution, track progress, and view logs. This facilitates debugging and troubleshooting.

  • Extensibility - Airflow’s modular architecture allows you to extend its functionality by creating custom operators, sensors, and plugins.

  • Scalability - Airflow can be scaled horizontally to handle large and complex workflows by distributing tasks across multiple worker nodes.

  • Retries - Airflow can automatically retry failed tasks, with the number of retries and the delay between them configurable per task, which improves workflow resilience and reduces manual intervention.

  • Alerts - Airflow can send alerts based on predefined conditions, such as task failures or SLA breaches.

Quickstart

  1. Start an instance with 1-Click, or launch it from your cloud provider’s web console.

  2. Allow a couple of minutes for the background services in your instance to start. If you see "connection refused" or site error messages, wait a moment and try again.

  3. Access the product via web browser at https://<your IP/public DNS>

  4. Log in with the username admin and the instance ID as the password.

  5. We recommend you change the admin password for the software via https://<your IP/public DNS>/users/userinfo/#
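Instead of refreshing the browser by hand during step 2, the wait can be scripted. The following is a rough sketch using only the Python standard library. The function names, the 5-minute limit, the 10-second interval, and the example address are all assumptions for illustration and are not part of the product. It treats any HTTP response below status 500 as "ready", and treats refused connections, timeouts, and TLS errors as "not ready yet". If your instance presents a certificate that your machine does not trust, the probe will keep reporting "not ready" and would need adjusting.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the server at `url` answers with a non-5xx HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # The server answered; 5xx usually means services are still starting.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, TLS error, timeout, and similar.
        return False


def wait_until_ready(probe, max_wait=300.0, interval=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `probe` repeatedly until it returns True or `max_wait` seconds pass."""
    deadline = clock() + max_wait
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Example with a hypothetical address; replace it with your instance's
# IP or public DNS name:
# ready = wait_until_ready(lambda: http_probe("https://203.0.113.10"))
```

Passing the probe in as a function keeps the waiting logic separate from the network call, so the same loop can be reused with a different readiness check.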

See also

Our Airflow Software

RPM Packages