Advanced Data Science Capstone

Course Overview

A learner who completes this project has demonstrated a deep understanding of massively parallel data processing, data exploration and visualization, and advanced machine learning and deep learning. They have also shown that they can apply this knowledge to a real-world practical use case, justifying architectural decisions and demonstrating an understanding of the characteristics of different algorithms, frameworks, and technologies and of how these affect model performance and scalability.

Please note: You are required to create a short video presentation at the end of the course. This is mandatory to pass. You don’t need to share the video publicly.

Syllabus

Week 1 - Identify DataSet and UseCase

This module introduces the basic process model used for this capstone project. The learner is also required to identify a practical use case and a data set.

Week 2 - ETL and Feature Creation

This module emphasizes the importance of ETL, data cleansing, and feature creation as preliminary steps in every data science project.

Week 3 - Model Definition and Training

This module emphasizes model selection based on the use case and the data set. It is important to understand how those two factors affect the choice of a useful model algorithm.

Week 4 - Model Evaluation, Tuning, Deployment and Documentation

Once a model is trained, it is important to assess its performance using an appropriate metric. In addition, once the model is finished, it has to be made consumable by business stakeholders in an appropriate way.