- Overview
- Prerequisites
- Audience
- Curriculum
Description:
Welcome to the Site Reliability Engineering Boot Camp, a comprehensive 10-week program designed to equip participants with the skills and knowledge necessary to excel in the field of Site Reliability Engineering (SRE).
This boot camp covers a wide range of topics, including Agile Scrum, Linux and Bash shell scripting, Python programming, SRE principles and practical examples, DevOps with Docker and Kubernetes, SQL programming with MySQL, MongoDB, and the Apache Kafka Streams API. Each week, you will delve into a different aspect of SRE, gradually building a strong foundation and the practical experience needed to succeed in this critical role.
The workshop starts with Agile Scrum methodology, because the whole workshop is run as an Agile project. Students are then introduced to SQL fundamentals.
Students will then learn the basics of Linux and Bash shell scripting, Python programming fundamentals, and best practices. For example, students will get hands-on experience using Python libraries to manipulate data.
The course then moves on to site reliability engineering (SRE) itself: the principles involved, what reliability, maintainability, and availability mean, and which metrics can be used in SRE.
Students will gain hands-on experience with DevOps, learning about container technology such as Docker and the orchestration of containers using Kubernetes. Students will also learn about Continuous Integration and Continuous Deployment (CI/CD).
Students will then learn about the differences between relational and non-relational databases and get a basic understanding of the different NoSQL database types. This is followed by an in-depth look at a document database, MongoDB, where they will learn about CRUD operations on data.
Finally, students will learn about real-time data streaming with the Apache Kafka Streams API. Students will explore the architecture of Kafka and key streaming concepts. We will build data stream pipelines and learn how to debug them.
Duration: 10.5 weeks
Course Code: BDT320
Prerequisites:
- Understanding of how computers work
- One or more years of technical experience
- General programming experience with Python
- Rudimentary knowledge of networking concepts
Audience:
- Candidates must have a basic understanding of how computer systems work and general knowledge of networking concepts and computer programming.
Course Outline:
Professional Business Skills (3 days)
Personal Development
- Personality Assessment (1 hour)
- Psychological Safety (1 hour)
- Growth Mindset (1 hour)
- Emotional Intelligence (1 hour)
- Crash Course: Productivity and Time Management (2 hours)
Culture and the Team
- Team vision, mission and values (1.5 hours)
- Managing team conflict (30 min)
- Celebrating failures (30 min)
- Meeting facilitation (1 hour)
Communication
- Written and Verbal Communication (1.5 hours)
- Asking Better Questions (2 hours)
- Managing difficult conversations (1.5 hours)
- Giving and receiving feedback (1 hour)
- Introduction to Design Thinking (2 hours)
Agile Scrum Methodology
- Scrum Introduction
- Scrum Team
- Scrum Artifacts
- Sprint Increment
- Sprint planning
- Backlog
- Retrospective
- Project description and Case Study
- Practice exam and Knowledge check
- Certification (optional)
Structured Query Language (SQL)
- Working with SQL
- SQL Fundamentals
- Writing SQL Queries
- Working with Tables and Indexes
- Predefined SQL functions
- Connecting Python to SQL
- Certification (optional)
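As an illustration of the "Connecting Python to SQL" topic, the sketch below shows the general connect, execute, and fetch pattern. It is not course material: it assumes Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a database server, and the `incidents` table and the `load_incidents` function are hypothetical names chosen for the example.

```python
import sqlite3


def load_incidents(rows):
    """Insert (service, minutes_down) rows and return total downtime per service.

    Assumption: sqlite3 stands in for a MySQL connection here; the table and
    column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE incidents ("
            "service TEXT NOT NULL, minutes_down INTEGER NOT NULL)"
        )
        # Parameterized insert: values are passed separately from the SQL text.
        conn.executemany(
            "INSERT INTO incidents (service, minutes_down) VALUES (?, ?)", rows
        )
        conn.commit()
        cursor = conn.execute(
            "SELECT service, SUM(minutes_down) FROM incidents "
            "GROUP BY service ORDER BY service"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("api", 12), ("db", 30), ("api", 8)]
    print(load_incidents(sample))
```

With a MySQL driver the overall shape is similar, although the connection arguments and the placeholder style usually differ from the `?` used by `sqlite3`.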
Linux & Bash Scripting
- Working with Linux
- File System and Access
- Linux Fundamentals
- System Administration Basics
- Hands-on with Bash Shell Scripting
- Networking Services
- Internet protocols such as HTTP, TCP/UDP
Python Programming – Fundamentals
Set up
- Set up development environment – Visual Studio Code
- Using python shell
- Executing python script
- Understanding python strings
- Print statements in python
Data Structures in Python
- Integers
- Lists
- Dictionaries
- Tuples
- Sets
- Files
- Mutable and Immutable structures
Selection and Looping Constructs
- If/else/elif statements
- Boolean type
- “in” membership
- For loop
- While Loop
- List and Dictionary Comprehension
Functions
- Defining functions
- Variable scope – Local and Global
- Arguments
- Polymorphism
Modules
- Creating modules
- Importing Modules
- Different types of imports
- Dir and help
- Examining some built-in modules
Site Reliability Engineering (SRE) Fundamentals
Introduction to SRE
- What is SRE?
- Reliability, Maintainability & Availability
- SRE principles
Understanding SRE Principles
- Seven principles of SRE
- Service level objectives
- Monitoring
- Automation
- Release Engineering
- Root cause analysis
- Testing and releasing
- Capacity planning
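To make the service level objective topic concrete, here is a minimal sketch of an error budget calculation. It assumes a simple time-based availability SLO over a fixed window; the function names and the 30-day default are illustrative assumptions, not part of the course material.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Return the allowed downtime in minutes for an availability SLO.

    slo_target is a fraction such as 0.999 (99.9%). Assumption: the budget
    is simply the non-SLO share of the whole window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target, downtime_minutes, window_days=30):
    """Return the unspent error budget; a negative value means the SLO was missed."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    print(round(budget_remaining(0.999, downtime_minutes=20), 1))
```

A request-based SLO (good requests divided by total requests) follows the same idea with counts in place of minutes.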
Understanding Reliability
- What is reliability?
- Life cycle of reliability
- Understanding failure and failure rates
- Understanding MTTR, MTTD, MTTF and MTBF
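The metrics in the last item can be sketched from a list of incident records. The example below is an illustration under stated assumptions: the `Incident` class and `reliability_metrics` function are hypothetical, times are hours since the start of the observation window, and MTTR is measured from the start of the failure to its resolution (some teams measure it from detection instead). MTTF, which applies to non-repairable components, is left out.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    started_at: float   # hours since the start of the observation window
    detected_at: float  # when monitoring or a person noticed the failure
    resolved_at: float  # when service was restored


def reliability_metrics(incidents, window_hours):
    """Compute MTTD, MTTR, and MTBF (all in hours) for a repairable system."""
    count = len(incidents)
    if count == 0:
        raise ValueError("at least one incident is needed to compute the means")
    downtime = sum(i.resolved_at - i.started_at for i in incidents)
    detection = sum(i.detected_at - i.started_at for i in incidents)
    uptime = window_hours - downtime
    return {
        "mttd": detection / count,  # mean time to detect
        "mttr": downtime / count,   # mean time to repair (restore)
        "mtbf": uptime / count,     # mean operating time between failures
    }


if __name__ == "__main__":
    history = [Incident(10, 10.5, 12), Incident(50, 51, 54)]
    print(reliability_metrics(history, window_hours=100))
    # {'mttd': 0.75, 'mttr': 3.0, 'mtbf': 47.0}
```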
Maintainability
- What is maintainability?
- Life cycle of maintainability
- Uptime and downtime
- Preventive maintenance
- Maintainability costs, predictions, and requirements
Availability
- Introduction to Availability
- Inherent Availability
- Operational Availability
- Achieved Availability
- Monitoring tools such as Dynatrace
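As a rough sketch of how two of the availability measures above are commonly calculated: inherent availability is usually taken as MTBF / (MTBF + MTTR), counting only corrective repair time, while operational availability is the observed uptime divided by total time, which also absorbs delays such as waiting for parts or people. The function names are hypothetical, and achieved availability is not shown.

```python
def inherent_availability(mtbf_hours, mttr_hours):
    """Availability from design-level figures: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def operational_availability(uptime_hours, downtime_hours):
    """Availability as observed in operation: uptime / (uptime + downtime)."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


if __name__ == "__main__":
    # Illustrative numbers only: 47 h between failures, 3 h to repair.
    print(f"{inherent_availability(47, 3):.2%}")
    # Illustrative numbers only: 94 h up and 6 h down in a 100 h window.
    print(f"{operational_availability(94, 6):.2%}")
```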
SRE Role
- What is a site reliability engineer?
- What does SRE do?
- Adopting SRE and SRE Team Formats
DevOps
Introduction to DevOps
- What is DevOps?
- Goals of DevOps
- DevOps benefits
- Collaboration and Culture in DevOps
Version Control with Git
- Importance and need of version control
- Version Control Options
- Git Overview
- Setting up Git and repositories
- Using Git Commands
- Git workflows
Continuous Integration & Deployment (CI/CD)
Introduction to CI/CD
- Continuous Integration Pipelines
- Setting up CI/CD pipeline
- Continuous Integration with tools like Jenkins & GitHub
Best practices in DevOps
Containerization & Orchestration
Containers with Docker
- Introduction to Containers
- Docker overview
- Docker commands
- Understanding Dockerfile
- Building Docker Containers
- Using Docker compose for building & testing software.
Container Orchestration with Kubernetes
- Introduction to Kubernetes
- Kubernetes Architecture & Clusters
- Deploying Applications with Kubernetes
- Scaling & Load balancing with Kubernetes
- Kubernetes networking
- Service discovery in Kubernetes
- Rolling updates & rollbacks
- Service meshes
Continuous Integration & Continuous Deployment (CI/CD)
- Introduction to CI/CD
- Continuous Integration, Continuous Delivery, Continuous Deployment
- Continuous Integration pipelines
- Creating pipelines
- Automating Deployments
Document Datastore: MongoDB
Working with MongoDB
- MongoDB Introduction
- Understanding Basics and CRUD operations
- Structuring Documents
- Create Operations
- Read Operations on Collections
- Updating Documents
- Deleting Documents
- Working with Indexes
- Working with different data types
- Using MongoDB Compass to explore data visually.
Apache Kafka Event Streaming
Understanding Apache Kafka Streams
- Understanding different ways of using Apache Kafka
- Working with Kafka Streams
- Operators in Kafka streams using KStream API
- Serialization & Deserialization in KStreams
KTable & Global KTable
- What is KTable?
- What is a Global KTable?
- Building a topology for KTable
Stateful operations in KStreams
- Aggregation and how it works
- Using count, reduce and aggregate.
- Performing KStream Joins
Project & Use Case
- Project Overview
- Complete projects to get experience and practice.
- Industry Use Case Studies
Certification
- Certification Overview
- Identify the right certification for you.
- Tips to prepare for certification.