Description:
This lecture provides a non-intimidating introduction to Big Data, Hadoop, and Spark. We will go behind the scenes to understand the secret sauce behind the success of Hadoop and other Big Data technologies.
Long Description:
Embark on a journey to explore the world of Big Data with our comprehensive training course! Delve into the history and background of Big Data while gaining a solid introduction to essential Big Data ecosystem technologies, including HDFS, MapReduce, Sqoop, Flume, Hive, Pig, Mahout (machine learning), R Connector, Ambari, ZooKeeper, Oozie, and NoSQL solutions like HBase. This course gives you a thorough understanding of the Big Data ecosystem both before and after the advent of Apache Spark. Join us for a live demonstration of Big Data analysis using Apache Spark, and gain the knowledge and skills to navigate the Big Data landscape with confidence. Enroll today and start your journey of discovery into the world of Big Data!
Course Code/Duration:
BDT39 / 1 Day
Learning Objectives:
After this course, you will be able to:
- Understand the history and background of Big Data and Hadoop
- Describe the Big Data landscape, including examples of real-world Big Data problems
- Explain the 5 V’s of Big Data (volume, velocity, variety, veracity, and value)
- Understand the foundational principles that have made Big Data so successful
- Explain ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Hive, Pig, Mahout (machine learning), R Connector, Ambari, ZooKeeper, and Oozie, as well as NoSQL stores such as HBase
- Understand the industry's Big Data offerings, both in the cloud and on premises, such as Cloudera, Hortonworks, MapR, Amazon EMR, and Microsoft Azure HDInsight
- Understand the impact and value of Apache Spark in the Big Data ecosystem
Prerequisites:
- Basic programming knowledge
Audience:
- Data Analysts, SQL Developers, Database Administrators, Database Developers, Aspiring Data Warehouse Professionals
Course Outline:
- Course Introduction
- History and background of Big Data and Hadoop
- 5 V’s of Big Data
- Secret Sauce of Big Data Hadoop
- Big Data Distributions in Industry
- Big Data Ecosystem before Apache Spark
- Big Data Ecosystem after Apache Spark
- Comparison of MapReduce vs. Apache Spark
- Understand Apache Spark architecture and libraries such as Streaming, machine and deep learning, GraphX, etc.
- Demo 1 – Data analysis using Apache Spark on Databricks Cloud
- References and Next steps
Structured Activity/Exercises/Case Studies:
- Demo 1 – Data analysis using Apache Spark on Databricks Cloud
Training material provided:
Yes (Digital format)