Instructor

raju2006

Scraping and Sourcing Data with Python

10 weeks

All levels

Created By raju2006
Last Updated April 10th, 2025

Description:

Locating and acquiring the right data is a core skill in data analysis and data science. We’ll survey sources and repositories of useful data, such as open government and university datasets. We’ll also explore popular social APIs (e.g., Facebook, Spotify, Twitter) and domain-specific APIs (e.g., healthcare, news, science, and math) that hold a wealth of data. Further, we’ll discuss how to query web servers and how to request and parse data to extract the information you need. Finally, we’ll cover scraping various types of data from websites, reading and extracting text from documents (e.g., PDF, Word), and cleaning and storing sourced and scraped data.

Course Code/Duration:

BDT131 / 1 Day

Learning Objectives:

After this course, you will be able to:

Explore a Variety of Public Data Repositories
Understand Effective Means to Search for Valuable Data
Use the Python Programming Language to Source and Scrape Data
Use Popular Social and Domain-specific APIs to Access Data (e.g., Slack)
Extract Text from Documents (e.g., PDFs, Word Files)
Access PDF Tables
Scrape Data from Web Pages
Clean Scraped Data
Store Sourced and Scraped Data

Prerequisites:

Basic Python Programming

Audience:

Anyone interested in working with data

Course Outline:

Overview of Data Sourcing

Public Open Datasets
Government Data
University Data

Milestone 1: Explore public data repositories

Introduction to the Python Programming Language

Installing Anaconda

Milestone 2: Learn how to use Jupyter Notebooks

Using Public APIs (Application Programming Interfaces)

Explore Popular and Domain-specific APIs
Common Conventions
Parsing JSON

Milestone 3: Access a public API (e.g., Facebook, Twitter, Google)
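
As a rough illustration of the "Parsing JSON" topic above, the sketch below parses a response body of the kind many public APIs return. The payload, its field names (results, title, tags), and the helper parse_datasets are all hypothetical and chosen for illustration; a real API documents its own structure, and the body would normally come from an HTTP request rather than a string constant.

```python
import json

# Hypothetical response body; a real API defines its own field names.
SAMPLE_RESPONSE = """
{
  "results": [
    {"id": 1, "title": "Air quality readings", "tags": ["environment", "city"]},
    {"id": 2, "title": "Hospital wait times", "tags": ["healthcare"]},
    {"id": 3, "title": "Untitled"}
  ],
  "next_page": null
}
"""


def parse_datasets(body):
    """Return a list of (id, title, tags) tuples from a JSON response body."""
    payload = json.loads(body)  # JSON text -> Python dicts and lists
    rows = []
    for item in payload.get("results", []):
        # .get() with a default tolerates records that omit optional fields.
        rows.append((item["id"], item["title"], item.get("tags", [])))
    return rows


datasets = parse_datasets(SAMPLE_RESPONSE)
for dataset_id, title, tags in datasets:
    print(dataset_id, title, ", ".join(tags) or "(no tags)")
```

Reading optional fields with a default keeps the loop from failing on incomplete records, which are common in real API responses.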

Extracting Text from Documents

Milestone 4: Extract data from PDFs

Overview of Data Scraping

Introduction to BeautifulSoup
Parsing HTML and JavaScript

Milestone 5: Scrape data from a website
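
To give a feel for what scraping involves, here is a minimal sketch that pulls links out of an HTML snippet. The course itself introduces BeautifulSoup; this example deliberately swaps in the standard library's html.parser so that it runs without installing anything. The sample HTML and the LinkCollector class are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this from a site.
SAMPLE_HTML = """
<html><body>
  <h1>Open Datasets</h1>
  <ul>
    <li><a href="/data/census.csv">Census</a></li>
    <li><a href="/data/weather.csv">Weather</a></li>
    <li><a>Broken link</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


collector = LinkCollector()
collector.feed(SAMPLE_HTML)
links = collector.links
for href, text in links:
    print(text, "->", href)
```

The anchor without an href is skipped, which shows the kind of defensive handling scraped pages usually need.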

Cleaning Scraped Data
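
Scraped text usually arrives with HTML entities, stray whitespace, and numbers embedded in strings. The sketch below shows two small cleaning helpers built on the standard library; the function names clean_text and parse_price and the sample inputs are hypothetical, and real data will likely need rules specific to its source.

```python
import html
import re


def clean_text(raw):
    """Decode HTML entities and collapse runs of whitespace to single spaces."""
    text = html.unescape(raw)         # e.g. "&amp;" becomes "&"
    text = text.replace("\xa0", " ")  # non-breaking spaces left by "&nbsp;"
    text = re.sub(r"\s+", " ", text)  # newlines, tabs, repeated spaces
    return text.strip()


def parse_price(raw):
    """Return the first number in the string as a float, or None if absent."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    return float(match.group().replace(",", ""))  # drop thousands separators


print(clean_text("  Tom &amp; Jerry&nbsp;\n  Ltd. "))
print(parse_price("$1,234.50 USD"))
print(parse_price("n/a"))
```

Returning None for unparseable values, rather than raising, makes it easy to count and inspect bad records after a scraping run.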

Storing Sourced and Scraped Data
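
One common way to store scraped records is a small SQLite database, which ships with Python. The sketch below is one option among several (CSV or JSON files are others); the table name datasets, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical records, e.g. the output of an earlier scraping step.
rows = [
    ("Census", "/data/census.csv"),
    ("Weather", "/data/weather.csv"),
]

# ":memory:" keeps the database in RAM; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS datasets (name TEXT PRIMARY KEY, url TEXT NOT NULL)"
)

# Parameterized queries avoid quoting problems with scraped strings.
# INSERT OR REPLACE lets the scraper be re-run without duplicating rows.
conn.executemany("INSERT OR REPLACE INTO datasets (name, url) VALUES (?, ?)", rows)
conn.commit()

stored = conn.execute("SELECT name, url FROM datasets ORDER BY name").fetchall()
print(stored)
conn.close()
```

Using the record name as a primary key means repeated runs update existing rows instead of piling up copies.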

Conclusion: Next steps

Structured Activity/Exercises/Case Studies:

Milestone 1: Explore public data repositories
Milestone 2: Learn how to use Jupyter Notebooks
Milestone 3: Access a public API (e.g., Facebook, Twitter, Google)
Milestone 4: Extract data from PDFs
Milestone 5: Scrape data from a website

Training material provided:

Yes (Digital format)

