Building Batch Data Pipelines on Google Cloud Platform

Course Overview

Data pipelines typically fall under one of three paradigms: Extract-Load (EL), Extract-Load-Transform (ELT), or Extract-Transform-Load (ETL). This course describes which paradigm to use, and when, for batch data. It also covers several Google Cloud Platform technologies for data transformation, including BigQuery, Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud Platform using QwikLabs.

Syllabus

Introduction

This module introduces the course and its agenda.

Introduction to Batch Data Pipelines

This module reviews the different methods of data loading (EL, ELT, and ETL) and when to use each.
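The difference between the three loading methods comes down to where the transform step runs. The following is a minimal sketch, not course material: it uses Python's built-in `sqlite3` as a stand-in for a warehouse such as BigQuery, and the sample rows, table names, and the `clean` helper are all hypothetical. A common rule of thumb, assumed here, is that EL fits data that is already clean, ELT fits transformations that can be written as SQL inside the warehouse, and ETL fits transformations that must happen before the data is loaded.

```python
import sqlite3

# Hypothetical raw rows standing in for an extracted source file.
RAW_ROWS = [("alice", " 10 "), ("bob", "n/a"), ("carol", "7")]


def clean(rows):
    """Transform step: trim whitespace, keep only rows with an integer amount."""
    cleaned = []
    for name, amount in rows:
        amount = amount.strip()
        if amount.isdigit():
            cleaned.append((name, int(amount)))
    return cleaned


def el(conn, rows):
    """Extract-Load: load the rows unchanged, with no transform step."""
    conn.execute("CREATE TABLE el_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO el_sales VALUES (?, ?)", rows)


def etl(conn, rows):
    """Extract-Transform-Load: transform in the pipeline, then load."""
    conn.execute("CREATE TABLE etl_sales (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO etl_sales VALUES (?, ?)", clean(rows))


def elt(conn, rows):
    """Extract-Load-Transform: load raw rows, then transform with SQL."""
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    # The transform runs inside the database: keep rows whose trimmed
    # amount is non-empty and contains only digits, then cast to INTEGER.
    conn.execute(
        "CREATE TABLE elt_sales AS "
        "SELECT name, CAST(TRIM(amount) AS INTEGER) AS amount "
        "FROM raw_sales "
        "WHERE TRIM(amount) <> '' AND TRIM(amount) NOT GLOB '*[^0-9]*'"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    el(conn, RAW_ROWS)
    etl(conn, RAW_ROWS)
    elt(conn, RAW_ROWS)
    for table in ("el_sales", "etl_sales", "elt_sales"):
        print(table, conn.execute(f"SELECT * FROM {table}").fetchall())
```

In this sketch the ETL and ELT paths end with the same cleaned table; the only difference is whether the cleaning runs in pipeline code or as SQL after loading.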

Executing Spark on Cloud Dataproc

This module shows how to run Hadoop on Cloud Dataproc, how to use Google Cloud Storage (GCS), and how to optimize your Dataproc jobs.

Manage Data Pipelines with Cloud Data Fusion and Cloud Composer

This module shows how to manage data pipelines with Cloud Data Fusion and Cloud Composer.

Serverless Data Processing with Cloud Dataflow

This module covers using Cloud Dataflow to build your data processing pipelines.

Summary

This module reviews the topics covered in this course.

© 2008-2022 CMOOC.COM. MOOCs change you; you change the world.