Description:
This course will provide an introduction to statistical analysis and data modeling. We will learn the fundamentals of descriptive and inferential statistics and their application. We will also explore the analysis of time series data and modeling business problems. Further, the application of statistical learning models, such as linear and logistic regression, will be examined. Learners will be introduced to the R programming language and will perform data exploration, analysis, modeling and visualization using RStudio.
Course Code/Duration:
BDT154 / 2 full days or 4 half days
Learning Objectives:
After this course, you will be able to:
- Install R and RStudio on a personal computer.
- Understand the basics of descriptive and inferential statistics.
- Perform exploratory data analysis.
- Use R for data visualization.
- Effectively clean and prepare data for analysis.
- Perform hypothesis testing.
- Use linear regression for prediction.
- Use logistic regression for classification.
- Evaluate models and choose the most effective one.
- Understand how to interpret a confusion matrix.
- Use R for statistical analysis.
- Perform trend analysis.
- Use time series data for forecasting.
- Understand project workflow for model building.
- Solidify your understanding of statistical analysis and model building by completing hands-on exercises and milestones.
- Understand the importance of statistical analysis and forecasting in business.
Prerequisites:
- Basic programming
Audience:
- This course is for learners who would like to become familiar with Jupyter Notebook and to maximize its use for data analysis and for project organization and collaboration.
Course Outline
Day 1:
- Course Introduction
- Overview of Statistical Analysis
- Descriptive Statistics Fundamentals
- Central Tendency
- Mean
- Median
- Mode
- Spread of the Data
- Variance
- Standard Deviation
- Range
- Relative Standing
- Percentile
- Quartile
- Interquartile Range
- Installing R and RStudio
- Introduction to R Programming
- Numbers and Arithmetic Operators
- Variables
- Data Types
- Lists
- Vectors
- Matrices and Arrays
- Data Frames
- If Statements
- Loops
- Functions
- Installing Packages in R
- Accessing Data
- Reading Data From a File
- Writing Data to a File
- Working with Relational Databases
- Data Exploration
- Statistical Summaries of Data
- Grouping Data
- Feature Selection
- Milestone 1: Perform exploratory data analysis
- Data Collection and Management
- Manipulating Data
- Data Cleaning and Preparation
- Dropping Rows
- Imputing Missing Values
- Feature Evaluation
- Data Transformation
- One-Hot Encoding
- Standardization
- Normalization
- Milestone 2: Perform data preparation and transformation
Day 2:
- Statistical Analysis Process
- Data Visualization
- ggplot2
- Line Chart
- Bar Chart
- Histogram
- Scatterplot
- Box Plot
- Milestone 3: Data visualization in R
- Inferential Statistics Fundamentals
- Normal Distribution
- Central Limit Theorem
- Standard Error
- Confidence Intervals
- Samples
- Hypothesis Testing
- Type I and Type II Errors
- Significance Testing
- P-value
- Z-score
- T-test
- Statistical Tests on Data
- Milestone 4: Inference and hypothesis testing
Day 3:
- Statistical Learning
- Linear Models
- Linear Regression Analysis
- Correlation vs. Regression
- Making Predictions
- Interpreting Regression
- R-squared
- Milestone 5: Build and interpret a linear regression model
- Classification Models
- Logistic Regression Analysis
- Understanding Logistic Regression
- Making Predictions
- Model Evaluation
- Accuracy
- Confusion Matrix
- Precision
- Recall
- Milestone 6: Build a logistic regression model and make classifications
Day 4:
- Time Series Analysis
- Handling Dates and Time
- Wrangling Time Series Data
- Selecting Features for a Time Series
- Forecasting
- Trend Analysis in R
- Statistical Models for Time Series
- Autoregressive Models
- Moving Average Models
- Milestone 7: Build a time series model
- Practical Approaches to Model Building
- Preparation and Organization
- Project Workflow End to End
- Model Maintenance
- Milestone 8: Build, evaluate, and export a statistical model
- Conclusion: Statistical analysis in business, next steps
- Note: The day-wise agenda is provided for guidance; the actual pace of the class will depend on the natural pace of the group.
Structured Activity/Exercises/Case Studies:
- Milestone 1: Perform exploratory data analysis
- Milestone 2: Perform data preparation and transformation
- Milestone 3: Data visualization in R
- Milestone 4: Inference and hypothesis testing
- Milestone 5: Build a linear regression model
- Milestone 6: Build a logistic regression model
- Milestone 7: Build a time series model
- Milestone 8: Build, evaluate, and export a statistical model
Training material provided: Yes (Digital format)