Description:
The Apache Airflow Next Steps course is intended for two audiences: 1) data engineers who already work with Apache Airflow daily and want to understand the changes that Airflow 2 brings; and 2) data engineers who have taken the Apache Airflow Fundamentals course and want to expand their knowledge of advanced Airflow topics.
The course begins by revisiting topics that are often not taught in an Airflow fundamentals class, such as connections, variables, templating, and pools. Then, we rework how we create DAGs under the new paradigms of Apache Airflow 2.x.
Next, we expand on a topic fundamental to any production Airflow deployment: security. Finally, we teach how to run Airflow on Kubernetes and how to scale it there.
This course focuses on the practical aspects of how to use Airflow and is 70% hands-on, 30% lecture, demo, and discussion.
This Apache Airflow Next Steps course is taught using Python > 3.5 and Airflow > 2.1.
Duration: 3 Days
Course Code: BDT292
Learning Objectives:
After this course, you will be able to:
- Secure your Apache Airflow installation
- Create highly concurrent DAGs in Kubernetes
- Leverage most of the new functionality Airflow 2 brings
Prerequisites:
- All attendees must have prior Apache Airflow experience, either from their own work or from Accelebrate’s Apache Airflow Fundamentals training.
Audience:
- Data Engineers
- Data Scientists
- Python Developers Interested in Data Engineering
- Data Analysts with Python Programming Knowledge
Course Outline:
Creating DAGs the right way
- Secrets, connections and variables
- Demo: Creating connections on startup
- Using Pools for long running and demanding tasks
- Demo: Simulating long running tasks
- DAG serialization
- DAG versioning
- Testing DAGs
- Demo: CI/CD in Airflow
- Capstone Lab
Modularize your DAGs
- TaskGroups vs. SubDAGs
- TaskFlow API and XComs
- Demo: Modularizing
- Dynamic and Functional DAGs
- SmartSensors and timeouts
- Capstone Lab
Airflow Security
- RBAC in Airflow
- Setting up OAuth authentication
- Demo: Adding Google OAuth
- Adding SSL certs
- Default roles and custom roles
- Demo: Creating a custom role
Airflow in Kubernetes
- The Helm chart
- Demo: Deploying Airflow with Helm
- Deploying single tasks to Kubernetes: KubernetesPodOperator
- Demo: Adding a task in Kubernetes
- Scaling Airflow with Kubernetes executor
- Demo: Changing the Helm chart's values
- KEDA autoscaler
- Preparing DAGs for Kubernetes
- Demo: Creating a DAG fully in Kubernetes
- The CeleryKubernetesExecutor for extreme scalability
A note on upgrading from Airflow 1.10
Training material provided: Yes (Digital format)