Instructor

raju2006

Introduction to Big Data Spark

10 weeks

All levels

Created By raju2006
Last Updated February 16th, 2025

Description:

The 'Introduction to Big Data' course is your gateway to Big Data and Spark. It covers the history and fundamentals of Big Data and surveys the technologies of the Big Data ecosystem, including HDFS, MapReduce, Sqoop, Flume, Hive, Pig, Mahout for machine learning, R Connector, Ambari, ZooKeeper, Oozie, and NoSQL tools like HBase. The course examines the ecosystem both before and after the arrival of Apache Spark. You will learn the core fundamentals and architecture of Spark and put that knowledge into practice on the Apache Spark Databricks Cloud. Get started on your Big Data journey.

Course Code/Duration:

BDT132 / 1 Day

Learning Objectives:

In this course, participants will:

Understand the history and background of Big Data and Hadoop
Describe the Big Data landscape, including examples of real-world Big Data problems
Explain the 5 V’s of Big Data (volume, velocity, variety, veracity, and value)
Understand the foundational principles that have made Big Data so successful
Explain ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Hive, Pig, Mahout (machine learning), R Connector, Ambari, ZooKeeper, Oozie, and NoSQL stores like HBase
Understand industry offerings for Big Data in the cloud and on premises, such as Cloudera, Hortonworks, MapR, Amazon EMR, and Microsoft Azure HDInsight
Understand the impact and value of Apache Spark in the Big Data ecosystem
Understand the Apache Spark architecture and its libraries for use cases such as streaming, machine and deep learning, and graph processing (GraphX)
Set up an account on Apache Spark Databricks Cloud
Perform hands-on activities on the Big Data ecosystem

Prerequisite:

Basic programming knowledge; SQL and data knowledge preferred

Audience:

This course is designed for anyone who wants to build a foundation in Big Data.

Course Outline:

The course includes presentations, demonstrations, and hands-on labs.

Course Introduction
History and background of Big Data and Hadoop
5 V’s of Big Data
Secret Sauce of Big Data Hadoop
Big Data Distributions in Industry
Big Data Ecosystem before Apache Spark
Big Data Ecosystem after Apache Spark
Comparison of MapReduce vs. Apache Spark
Apache Spark architecture and libraries for streaming, machine and deep learning, and graph processing (GraphX)
- Hands-on exercise 1 – Set up an account on Apache Spark Databricks Cloud
- Hands-on exercise 2 – First Spark Program
- Hands-on exercise 3 – Spark RDD Transformation & Actions
- Hands-on exercise 4 – Spark RDD Advanced Transformation & Actions

References and Next steps
Structured Activity/Exercises/Case Studies:
- Exercise 1 – Set up an account on Apache Spark Databricks Cloud
- Exercise 2 – First Spark Program
- Exercise 3 – Spark RDD Transformation & Actions
- Exercise 4 – Spark RDD Advanced Transformation & Actions
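
To give a feel for what the RDD exercises cover, here is a minimal sketch in plain Python of the difference between transformations (which only record work) and actions (which run it). It is not the Spark API and not the course's lab material: `MiniRDD`, `square`, and the sample numbers are hypothetical names chosen for illustration, and everything runs on one machine. In PySpark the corresponding calls would be `map` and `filter` (transformations) and `collect` and `reduce` (actions) on an RDD.

```python
import functools


class MiniRDD:
    """Toy, single-machine stand-in for an RDD (hypothetical, not Spark)."""

    def __init__(self, source, steps=()):
        self._source = list(source)
        self._steps = tuple(steps)  # recorded transformations, not yet run

    # Transformations: return a new MiniRDD and do no work yet.
    def map(self, fn):
        return MiniRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return MiniRDD(self._source, self._steps + (("filter", pred),))

    # Helper: replay the recorded steps over the source data.
    def _run(self):
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: execute the pipeline and return a plain Python value.
    def collect(self):
        return list(self._run())

    def reduce(self, fn):
        return functools.reduce(fn, self._run())


calls = []  # records every element that square() actually processes


def square(x):
    calls.append(x)
    return x * x


numbers = MiniRDD(range(1, 6))
pipeline = numbers.map(square).filter(lambda x: x % 2 == 1)

calls_before_action = list(calls)            # still empty: nothing has run
odd_squares = pipeline.collect()             # action: runs map, then filter
total = pipeline.reduce(lambda a, b: a + b)  # action: runs the pipeline again

print(calls_before_action, odd_squares, total)
```

Because each action replays the recorded steps, `square` runs once per element for `collect` and again for `reduce`. Spark behaves similarly for an RDD that has not been cached, which is one reason the distinction between transformations and actions matters in practice.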

Training material provided:

Yes (Digital format)
