Description:
Join our 10-week Data Science Bootcamp and embark on a journey to master data science, machine learning, AI, and cloud computing with Google Cloud Platform (GCP). Starting with Python programming fundamentals, you'll delve into key Python libraries such as NumPy, Pandas, Matplotlib, and Scikit-learn, all essential for data science. You will explore, analyze, model, and visualize data as part of the data science process.
Venture into the world of AI, gaining a solid understanding of neural networks, deep learning, and practical use cases like image recognition and language understanding using TensorFlow.
Harness the power of Google Cloud Platform to apply AI and machine learning to real-world datasets. Experience hands-on learning and design robust end-to-end projects. This immersive bootcamp equips you with the skills to excel in data science, AI, and machine learning, making you industry-ready.
Duration: 10 Weeks
Course Code: BDT202
Learning Objectives:
- Master Python programming fundamentals, including data structures and loops.
- Comprehend the end-to-end data science process, from problem definition to results communication.
- Explore and apply descriptive and inferential statistics, including hypothesis testing.
- Develop crucial data preprocessing skills like cleaning and feature engineering.
- Learn effective data visualization techniques using libraries like Matplotlib.
- Attain proficiency in machine learning algorithms, particularly linear and logistic regression.
- Gain expertise in model selection, tuning, and evaluation to enhance predictive performance.
- Understand big data concepts, Hadoop, and the impact of Apache Spark on data processing.
- Apply machine learning in Apache Spark, including linear regression and K-means clustering.
- Acquire practical knowledge in deep learning, natural language processing, and computer vision using TensorFlow.
Prerequisites:
Participants should have one or more years of business and/or technical experience and familiarity with the following:
- Understanding of how computers work
- Basic programming experience with Python.
Audience:
- Ideal candidates have a computer science degree or equivalent experience and are entering their first IT role with a focus on Data Science and AI.
Course Outline:
Python Programming – Fundamentals
- Set Up
- Set up development environment – Jupyter notebooks
- Using the Python shell
- Running a Python script
- Understanding Python strings
- Print statements in Python
- Data Structures in Python
- Integers
- Lists
- Dictionaries
- Tuple
- File
- Mutable and Immutable structures
- Selection and Looping Constructs
- If/else/elif statements
- Boolean type
- “in” membership
- For loop
- While Loop
- List and Dictionary Comprehension
- Functions
- Defining functions
- Variable scope – Local and Global
- Arguments
- Polymorphism
- Modules
- Creating modules
- Importing Modules
- Different types of imports
- Dir and help
- Examining some built-in modules
- Classes and Exceptions
- Object Oriented Programming Introduction
- Classes and Objects
- Polymorphism – Function and Operator Overloading
- Inheritance
Data Science
- Overview of Data Science
- The Difference Between Business Analytics (BI), Data Analytics and Data Science
- The Field of Data Science
- The Data Science Process
- Define the Problem
- Get the Data
- Explore the Data
- Clean the Data
- Model the Data
- Communicate the Findings
- Identifying a problem and asking good questions
Descriptive Statistics Fundamentals
- Central Tendency
- Mean
- Median
- Mode
- Spread of the Data
- Variance
- Standard Deviation
- Range
- Relative Standing
- Percentile
- Quartile
- Inter-quartile Range
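As a rough illustration of the measures listed above, the following sketch computes each of them with Python's standard `statistics` module. The sample values are hypothetical, and the choice of population formulas and the `inclusive` quartile method is an assumption made for the example, not something the course prescribes.

```python
import statistics

# Hypothetical sample values chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Spread of the data (population formulas, dividing by n)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)
value_range = max(data) - min(data)

# Relative standing: quartiles and the inter-quartile range.
# method="inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("mean, median, mode:", mean, median, mode)
print("variance, std dev, range:", variance, std_dev, value_range)
print("quartiles and IQR:", q1, q2, q3, iqr)
```

In practice the same quantities are usually obtained from a Pandas `DataFrame` with `describe()`, which the Data Exploration module covers.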
Inferential Statistics Fundamentals
- Inferential Statistics
- Normal Distribution
- Central Limit Theorem
- Standard Error
- Confidence Intervals
- Other Distributions
- Samples
- Hypothesis Testing
- Perform statistical analysis on a given data set.
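To make standard error and confidence intervals concrete, here is a minimal sketch using only the standard library. The sample is hypothetical, and the normal approximation (a z-value of about 1.96 for 95%) is an assumption; for a sample this small, a t-distribution would normally be the more appropriate choice.

```python
import math
import statistics

# Hypothetical sample drawn from some larger population.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample formula, divides by n - 1

# Standard error of the mean: how much the sample mean is expected to vary.
standard_error = sample_sd / math.sqrt(n)

# 95% confidence interval under the normal approximation (assumption).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"mean = {sample_mean}, SE = {standard_error:.3f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```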
Data Exploration and Preparation
- Data Exploration
- Describe
- Merging
- Grouping
- Evaluating Features
- Data Visualization
- Line Chart
- Scatterplot
- Pairplot
- Histogram
- Density Plot
- Bar Chart
- Boxplot
- Customizing Charts
- Perform Exploratory Data Analysis
- Data Cleaning
- Dropping Rows
- Imputing Missing Values
- Feature Evaluation
- Feature Engineering
- Data Transformation
- One-Hot Encoding
- Standardization
- Normalization
- Test/Train Split
- Model Training
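The data transformation steps named above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names (`one_hot`, `standardize`, `normalize`, `split_train_test`) are hypothetical, and in the course these steps would more likely be done with Pandas and Scikit-learn.

```python
import random
import statistics

def one_hot(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumes the values are not all equal
    return [(v - mu) / sigma for v in values]

def normalize(values):
    """Min-max scale into the range [0, 1]; assumes max > min."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

categories, encoded = one_hot(["red", "green", "red"])
print(categories, encoded)
print(standardize([10, 20, 30]))
print(normalize([10, 20, 30]))
train_rows, test_rows = split_train_test(range(8))
print(len(train_rows), len(test_rows))
```

Standardization and normalization parameters should be computed on the training rows only and then applied to the test rows, so that no information from the held-out data leaks into training.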
Machine Learning Overview
- History and Background of AI and ML
- Compare AI vs ML vs DL
- Describe Supervised and Unsupervised learning techniques and usages
- Machine Learning patterns
- Classification
- Clustering
- Regression
- Gartner Hype Cycle for Emerging Technologies
- Machine Learning offerings in Industry
- Discuss Machine Learning use cases in different domains
- Understand the Data Science process to apply to ML use cases
- Understand the relation between Data Engineering and Data Science
- Identify the different roles needed for a successful ML project
Essential Python Data Science Libraries
- Numpy
- Pandas
- Matplotlib
- Scikit-learn
- Statsmodels
Machine Learning Algorithms
- Linear Regression
- Logistic Regression
- Support Vector Machine
- Decision Tree
- K-Means
- Clustering
Model selection and tuning
- Machine learning
- K-fold cross validation
- Box plot
- Measuring performance
- Model selection
- Refining the model
- Hyperparameter tuning
- Grid search
- Apply machine learning algorithms, select and refine the best model.
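K-fold cross validation, listed above, splits the data into k folds and uses each fold once as the test set while training on the rest. The sketch below shows only the index bookkeeping; the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled for readability, and a library implementation such as Scikit-learn's would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

# Each sample lands in exactly one test fold across the k rounds.
for train, test in k_fold_indices(10, 3):
    print("train:", train, "test:", test)
```

A model's score is then averaged over the k rounds, and hyperparameter tuning with grid search repeats this procedure once per candidate parameter combination.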
Big Data Overview
- History and background of Big Data and Hadoop
- 5 V’s of Big Data
- Secret Sauce of Big Data Hadoop
- Big Data Distributions in Industry
- Big Data Ecosystem before Apache Spark
- Big Data Ecosystem after Apache Spark
- Comparison of MapReduce Vs Apache Spark
- Understand Apache Spark architecture and libraries such as Streaming, Machine & Deep Learning, GraphX, etc.
Machine learning using Apache Spark
- Spark ML Overview
- Introduction to Jupyter notebooks
- Lab: Working with Jupyter + Python + Spark
- Lab: Spark ML utilities
- Simple Linear Regression
- Multiple Linear Regression
- Running LR
- Evaluating LR model performance
- Use case: House price estimates
- Theory behind K-Means
- Running K-Means algorithm
- Estimating the performance
- Use case: grouping
Neural Networks and TensorFlow
- Introduction to neural networks
- The math behind neural networks
- Activation functions
- Vanishing gradient problem and ReLU
- Loss functions
- Gradient descent
- Back propagation
- Understanding the intuition behind neural networks
- Introducing Perceptrons
- Single Layer linear classifier
- Step Function
- Updating the weights
- Linear separability and XOR problem
- Hidden Layers: Intro to Deep Neural Networks and Deep Learning
- Hidden Layers as a solution to XOR problem
- The architecture of deep learning
- Introducing Keras/TensorFlow
- What is Keras?
- Using Keras with a TensorFlow Backend
- Lab: Using Keras to implement a neural network
- Introducing TensorFlow
- TensorFlow intro
- TensorFlow Features
- TensorFlow Versions
- GPU and TPU scalability
- Lab: Setting up and Running TensorFlow
- The Tensor: The Basic Unit of TensorFlow
- Introducing Tensors
- TensorFlow Execution Model
- Lab: Learning about Tensors
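The perceptron topics in this module (step function, weight updates, linear separability and the XOR problem) can be illustrated without any framework. The sketch below is a minimal single-layer perceptron in plain Python; the function names, the learning rate of 1, and the epoch count are assumptions made for the example, and the course labs themselves use Keras/TensorFlow.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_perceptron(samples, epochs=20, learning_rate=1):
    """Classic perceptron rule: nudge weights by the prediction error."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

def accuracy(weights, bias, samples):
    hits = sum(predict(weights, bias, x) == y for x, y in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_w, and_b = train_perceptron(AND)
xor_w, xor_b = train_perceptron(XOR)

print("AND accuracy:", accuracy(and_w, and_b, AND))
print("XOR accuracy:", accuracy(xor_w, xor_b, XOR))
```

AND is linearly separable, so the single-layer perceptron learns it. XOR is not, so no choice of weights classifies all four points correctly, which is the motivation for adding hidden layers.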
Image Recognition
Convolutional Neural Networks in Keras/TensorFlow
- Introducing CNNs
- Convolution layer
- Pooling layer
- Fully connected layer
- CNNs in TensorFlow
- Lab: Image recognition
Image and Video Processing
- Image processing elements
- Convolutions
- Pooling
- Edge Detection
- De-noising
- Video Analysis
- Understanding Video
- OpenCV and Video
- Capturing Video from a Camera
- Using Video Files
- Optical Flow and Motion Estimation
- Deep Learning in Optical Flow Estimation
- Visual Object Tracking
Natural Language Processing
- What is NLP?
- Sensory Acuity
- Behavioral Flexibility
- NLP Techniques
- NLP and Deep Learning
- Word2vec
- Learning word embedding
- The Skip-gram Model
- Building the graph
- Training the model
- Visualizing the embeddings
- Optimizing the implementation
- Text classification with TensorFlow
- Automatic translation (seq2seq)
- Text generation with RNN
- Named entity extraction with RNNs (sequence modeling)
- Bidirectional LSTM with attention
- Natural Language Processing pipelines
- Conversational AI
- Introduction to the Rasa framework
- Generating natural language
- Understanding natural language
- Chatbots
Machine Learning and AI on the Cloud
Getting Started with GCP
- Security with Google’s Cloud Infrastructure
- Understanding resource hierarchy
- IAM – Identity and Access Management
- Different IAM Roles
- Connecting to Google Cloud Platform
Machine Learning on GCP
- Understanding AI and Machine Learning
- Doing machine learning on GCP
- Understanding TensorFlow
Machine Learning APIs on GCP
- Machine Learning APIs
- Natural Language Processing
- Translation API
- Speech – Text to Speech, Speech to Text API
- Vision API
- Cloud AutoML
- BigQuery ML
Certification
- Certification Overview
- Identify the right certification for you
- Tips to prepare for certification
Project Case
- Project Overview
- Performing Projects to get experience and practice
- ML and AI Case Studies
Training material provided:
Yes (Digital format)
Hands-on Lab: Instructions will be provided to create a free tier account on Azure.