What you will learn
Validate and clean a dataset
Assess and create datasets to answer your questions
Solve problems using SQL
Build a simple testing framework as an introduction to AB testing
Course overview
This course allows you to apply the SQL skills taught in “SQL for Data Science” to four increasingly complex and authentic data science inquiry case studies. We’ll learn how to convert timestamps of all types to common formats and perform date/time calculations. We’ll select and perform the optimal JOIN for a data science inquiry and clean data within an analysis dataset by deduping, running quality checks, backfilling, and handling nulls. We’ll learn how to segment and analyze data per segment using windowing functions and use case statements to execute conditional logic to address a data science inquiry. We’ll also describe how to convert a query into a scheduled job and how to insert data into a date partition. Finally, given a predictive analysis need, we’ll engineer a feature from raw data using the tools and skills we’ve built over the course. The real-world application of these skills will give you the framework for performing the analysis of an AB test.
Syllabus
Data of Unknown Quality
In this module, you will be able to create trustworthy analysis from a new set of data. You will be able to coalesce nulls, identify unreliable data, and discover reasons why data might be missing. You will also be able to answer ambiguous questions by defining new metrics.
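As a rough illustration of coalescing nulls and checking how much data is missing, here is a minimal sketch that runs SQL through Python's built-in sqlite3 module. The users table, its country column, and the sample rows are hypothetical and are not taken from the course materials.

```python
import sqlite3

# Hypothetical table, columns, and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "US"), (2, None), (3, "DE"), (4, None)],
)

# COALESCE returns its first non-NULL argument, so rows with a missing
# country are counted under an explicit 'unknown' label.
rows = conn.execute(
    """
    SELECT COALESCE(country, 'unknown') AS country_label,
           COUNT(*) AS n_users
    FROM users
    GROUP BY COALESCE(country, 'unknown')
    ORDER BY country_label
    """
).fetchall()

# Simple quality check: the share of rows where country is NULL.
# In SQLite, "country IS NULL" evaluates to 0 or 1, so AVG gives a fraction.
missing_share = conn.execute(
    "SELECT AVG(country IS NULL) FROM users"
).fetchone()[0]

print(rows)
print(missing_share)
```

A high missing share is a prompt to ask why the data is missing (for example, a field that was only logged after a certain date) before trusting any per-country metric.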
Creating Clean Datasets
In this module, you will be able to name the main categories of data types. You will be able to explain how unfiltered data can be manipulated into a table on which you can conduct data analysis. You will be able to discuss why a data warehouse is separate from a production database, and you will be able to use the tools you learned to create your own trustworthy tables.
SQL Problem Solving
In this module, you will be able to map out your joins and identify the level of detail needed for different kinds of questions. You will practice answering data questions, which should prepare you for a whole slew of questions, whether vague, ambiguous, or even poorly worded. Finally, you will develop a strategy for answering all those questions using data.
Case Study: AB Testing
In this module, you will be able to use your SQL skills to set up a basic AB testing system. You will be able to apply hypothesis testing to evaluate a hypothesis about how user behavior changed. You will be able to test and interpret the results using one or more metrics that are tied directly to business metrics. This module will test your SQL skills and give you the base experience you need to learn more advanced AB testing techniques in the future.
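To make the hypothesis-testing step concrete, here is a minimal sketch of one common approach, a two-sided two-proportion z-test on conversion counts. The course description does not say which test it uses, so the choice of test, the function name, and the sample counts are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and users in the control group (hypothetical).
    conv_b / n_b: conversions and users in the treatment group (hypothetical).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100 of 1000 control users converted, 150 of 1000 treated.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p_value)
```

In practice the per-group counts would come from a SQL query that joins test assignments to the chosen metric, and a small p-value is evidence against the null hypothesis rather than proof that the change caused the difference.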