08. Airflow for AI
Chapter 8 of 24 · 15 min
Apache Airflow is the dominant workflow orchestrator, with strong ML adoption. It defines workflows as Python code, executes tasks on defined schedules or triggers, and provides a UI for monitoring.
Installation for local development:
pip install apache-airflow
# Initialize database
airflow db init
# Create admin user
airflow users create \
--username admin \
--firstname Admin \
--lastname User \
--role Admin \
--email [email protected]
# Start webserver (background)
airflow webserver --port 8080 &
# Start scheduler (background)
airflow scheduler &
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
EXERCISE
Install Airflow locally. Create the DAG above. Run it in local executor mode (airflow dags test daily-model-training 2024-01-01). Verify tasks execute and check the UI for status.