Description:
Enhance your Apache Airflow proficiency with our 'Advanced Apache Airflow' course. Designed for experienced data engineers, the program covers the Airflow 2 updates and advanced topics including connections, DAG creation, security, Kubernetes, and scaling. The course is 70% hands-on practical exercises and 30% lectures, demos, and discussions. It is conducted using Python versions later than 3.5 and Airflow versions later than 2.1. Join us to take your Airflow skills to the next level and excel in data engineering and workflow orchestration.
Duration: 3 Days
Course Code: BDT292
Learning Objectives:
After this course, you will be able to:
- Create production-ready data pipelines in Airflow that scale to hundreds of tasks
- Enforce modularization and reusability of Airflow tasks across projects
- Scale Airflow in Kubernetes
Prerequisites:
- Python programming
- Basic understanding of workflow management
Audience:
- People who are curious about data engineering.
- People who want to learn basic and advanced concepts of Apache Airflow.
- People who like a hands-on approach.
Course Outline:
Introducing Apache Airflow
- What is Airflow and what does it solve?
- Airflow architecture
- How do we represent a Pipeline?
- Demo: Our first DAG
- Tasks, TaskFlow and Operators
- Demo: First Pipeline
- Capstone Lab
Mastering scheduling
- execution_date, start_date and schedule_interval
- Handling non-default schedule_intervals
- Demo: Playing with time
- Capstone Lab
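
As an illustration of how the scheduling terms in this module relate (a sketch, not course material): in Airflow, a scheduled run stamped with a given execution_date is triggered only once the interval starting at that date has ended, so the first run fires at start_date plus one schedule_interval. The plain-Python sketch below does not use Airflow. It assumes a fixed `timedelta` interval, and `first_runs` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def first_runs(start_date, schedule_interval, count):
    """Return (execution_date, earliest_trigger_time) pairs for the first runs.

    Assumption: a fixed timedelta interval, with every interval since
    start_date being scheduled (no skipped runs).
    """
    runs = []
    for i in range(count):
        # execution_date marks the START of the interval a run covers.
        execution_date = start_date + i * schedule_interval
        # The run is triggered only after that interval has ended.
        trigger_time = execution_date + schedule_interval
        runs.append((execution_date, trigger_time))
    return runs


runs = first_runs(datetime(2021, 1, 1), timedelta(days=1), 3)
for execution_date, trigger_time in runs:
    print(execution_date.date(), "triggered no earlier than", trigger_time.date())
```

With a daily interval and a start_date of 2021-01-01, the run stamped 2021-01-01 is triggered no earlier than 2021-01-02.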
Abstracting functionality
- Using custom operators
- Creating TaskGroups vs SubDAGs
- Sharing data with XComs
- Branching and Triggers
- Sensors and SmartSensors
- Capstone Lab
Executors and Scaling Airflow
- Moving from SQLite to PostgreSQL
- Executors: Debug, Local, Celery
- Concurrency and parallelism
- Demo: Concurrency with Celery
- Airflow in Kubernetes, the old and new ways
- KEDA and HA scheduler
- Demo: Deploying a highly available, fault-tolerant Airflow
Hackathon
Training material provided: Yes (Digital format)