What you will learn
Learn the skills needed to be successful in a data engineer role
Prepare for the Professional Data Engineer certification
Learn about the infrastructure and platform services provided by Google Cloud Platform
Process big data at scale for analytics and machine learning
Course overview
This program provides the skills you need to advance your career in data engineering and recommends training to support your preparation for the industry-recognized Google Cloud Professional Data Engineer certification. Through a combination of presentations, demos, and labs, you will enable data-driven decision making by collecting, transforming, and publishing data, and you’ll gain real-world experience through a number of hands-on Qwiklabs projects.
You’ll also have the opportunity to practice key job skills, including designing, building, and running data processing systems, and operationalizing machine learning models.
Upon successful completion of this program, you will earn a certificate of completion to share with your professional network and potential employers.
If you would like to become Google Cloud certified and demonstrate your proficiency in designing and building data processing systems and operationalizing machine learning models on Google Cloud Platform, you will need to register for and pass the official Google Cloud certification exam. You can find more details on how to register, along with additional resources to support your preparation, at cloud.google.com/certifications.
Courses included
Course 1
Google Cloud Platform Big Data and Machine Learning Fundamentals
This 2-week accelerated on-demand course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of the Google Cloud Platform and a deeper dive into its data processing capabilities.
At the end of this course, participants will be able to:
• Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform
• Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform
• Employ BigQuery and Cloud Datalab to carry out interactive data analysis
• Choose between Cloud SQL, Bigtable, and Datastore
• Train and use a neural network using TensorFlow
• Choose between different data processing products on the Google Cloud Platform
Before enrolling in this course, participants should have roughly one (1) year of experience with one or more of the following:
• A common query language such as SQL
• Extract, transform, load activities
• Data modeling
• Machine learning and/or statistics
• Programming in Python
Google Account Notes:
• Google services are currently unavailable in China.
Course 2
Modernizing Data Lakes and Data Warehouses with GCP
The two key components of any data pipeline are data lakes and data warehouses. This course highlights use cases for each type of storage and dives into the available data lake and warehouse solutions on Google Cloud Platform in technical detail. It also describes the role of a data engineer and the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. Learners will get hands-on experience with data lakes and warehouses on Google Cloud Platform using Qwiklabs.
Course 3
Building Batch Data Pipelines on GCP
Data pipelines typically fall under one of the Extract-Load, Extract-Load-Transform, or Extract-Transform-Load paradigms. This course describes which paradigm should be used, and when, for batch data. It also covers several technologies on Google Cloud Platform for data transformation, including BigQuery, executing Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow. Learners will get hands-on experience building data pipeline components on Google Cloud Platform using Qwiklabs.
Course 4
Building Resilient Streaming Analytics Systems on GCP
*Note: this is a new course with updated content from what you may have seen in the previous version of this Specialization.
Processing streaming data is becoming increasingly popular, as streaming enables businesses to get real-time metrics on business operations. This course covers how to build streaming data pipelines on Google Cloud Platform. Cloud Pub/Sub is described for handling incoming streaming data. The course also covers how to apply aggregations and transformations to streaming data using Cloud Dataflow, and how to store processed records in BigQuery or Cloud Bigtable for analysis. Learners will get hands-on experience building streaming data pipeline components on Google Cloud Platform using Qwiklabs.
Course 5
Smart Analytics, Machine Learning, and AI on GCP
Incorporating machine learning into data pipelines increases the ability of businesses to extract insights from their data. This course covers several ways machine learning can be included in data pipelines on Google Cloud Platform, depending on the level of customization required. For little to no customization, this course covers AutoML. For more tailored machine learning capabilities, this course introduces AI Platform Notebooks and BigQuery Machine Learning. It also covers how to productionize machine learning solutions using Kubeflow. Learners will get hands-on experience building machine learning models on Google Cloud Platform using Qwiklabs.
Course 6
Preparing for the Google Cloud Professional Data Engineer Exam
From the course: "The best way to prepare for the exam is to be competent in the skills required of the job."
This course uses a top-down approach to recognize knowledge and skills you already have, and to surface information and skill areas that need additional preparation. You can use this course to help create your own custom preparation plan. It helps you distinguish what you know from what you don't know, and it helps you develop and practice the skills required of practitioners who perform this job.
The course follows the organization of the Exam Guide outline, presenting highest-level concepts, "touchstones", so that you can determine whether you feel confident about your knowledge of that area and its dependent concepts, or whether you want more study. You will also learn about and have the opportunity to practice key job skills, including cognitive skills such as case analysis, identifying technical watchpoints, and developing proposed solutions. These are job skills that are also exam skills. You will also test your basic abilities with Activity Tracking Challenge Labs, and you will have many sample questions similar to those on the exam, including solutions. The end of the course contains an ungraded practice exam quiz, followed by a graded practice exam quiz that simulates the exam-taking experience.
Course projects
This Professional Certificate incorporates hands-on labs using our Qwiklabs platform.
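These hands-on components will let you apply the skills you learn in the video lectures. Projects will incorporate topics such as Google BigQuery, which is used and configured within Qwiklabs. You can expect to gain practical hands-on experience with the concepts explained throughout the modules.
Prerequisites
To get the most out of this program, learners should have basic proficiency in a common query language such as SQL;
experience developing applications in a common programming language such as Python; experience with data modeling and with extract, transform, load activities;
and familiarity with machine learning and/or statistics.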