Description:
This 3-day instructor-led course introduces participants to the core concepts of Artificial Intelligence and Deep Learning using the Apache Spark ecosystem. The training begins with foundational AI concepts, progresses to practical deep learning techniques, and culminates in distributed model training using Spark MLlib and third-party integrations such as TensorFlowOnSpark and Elephas. Through hands-on labs and real-world scenarios, participants will learn how to process large datasets, train and evaluate deep learning models in a distributed setting, and deploy them for scalable inference. By the end of the course, learners will be equipped to use Spark for end-to-end deep learning workflows across industries.
Duration: 3 Days
Course Code: BDT45
Learning Objectives:
After this training, participants will be able to:
- Explain the fundamentals of AI and Deep Learning within the context of big data.
- Compare various deep learning frameworks and their integration with Apache Spark.
- Construct and train distributed deep learning models using Spark MLlib and external libraries.
- Evaluate the performance and scalability of deep learning models on Spark.
- Deploy trained models for inference in a distributed production environment.
Prerequisites:
- Basic understanding of Python and machine learning concepts
- Familiarity with Apache Spark fundamentals
- Prior experience with data pipelines is helpful but not required
Audience:
- Data scientists and engineers working with big data frameworks
- AI/ML professionals interested in scalable deep learning
- Developers seeking to integrate AI models in distributed environments
- Analysts and architects exploring deep learning pipelines on Spark
Course Outline:
Module 1: Foundations of AI and Deep Learning
- Introduction to Artificial Intelligence and Deep Learning
- Key components of neural networks
- Overview of big data and Apache Spark architecture
- Spark MLlib for machine learning and deep learning
- Comparing TensorFlow, PyTorch, and Keras with Spark integrations
Module 2: Distributed Deep Learning with Apache Spark
- Deep learning workflows with Spark
- Using MLlib for classification, regression, and clustering
- Introduction to TensorFlowOnSpark
- Hands-on: Building a neural network using Spark and Keras
- Hands-on: Distributed training and tuning hyperparameters
Module 3: Scalable Deployment and Use Cases
- Model evaluation and performance metrics
- Exporting and saving models in Spark pipelines
- Hands-on: Inference at scale with distributed models
- Use case walkthroughs: Fraud detection, sentiment analysis, and image classification
- Best practices for deploying AI models in production Spark clusters