Developing AI Speech Applications with Microsoft Cognitive Services

What you'll learn

Translate spoken content into other languages

Perform speech synthesis and recognition

Replace standard authentication with speaker verification

Integrate speech commanding into app experiences

Identify speakers via voice identification

Build a speech and speaker recognition app

Course overview

Microsoft Cognitive Services is a set of cloud-based intelligence services and APIs for building richer, smarter, and more sophisticated applications. The Speech APIs available in Microsoft Cognitive Services offer many ready-to-use and easy-to-consume features that help you use Artificial Intelligence (AI) to solve your business problems. In this practical course, take an in-depth look at Speech APIs, work through hands-on exercises to learn how to piece them together, and find out how to put them to work in your organization.

Start with an overview of Microsoft Cognitive Services, and then take a look at the Bing Speech API, which provides algorithms, exposed as simple REST-based service calls, to convert audio to text, understand speech intent, and convert text back to speech for natural responsiveness. Explore the Translator Speech API to add end-to-end, real-time, speech translation to applications and services. Get the details on the Speaker Recognition API, designed to perform speaker verification and identification. And dig into the Custom Speech API, which enables you to customize speech language models to perform domain-specific and use case-specific speech recognition.

Apply current best practices and Fluent Design principles as you learn how to create Windows 10 Universal Windows Platform (UWP) applications that can run on multiple devices, including desktops, tablets, phones, HoloLens, and Xbox consoles. With a prerequisite of proficiency in a C-based programming language like C, C#, C++, or Java, follow along with the instructor as you work through the labs to replicate and modify the code in the examples.

Wrap up the course by creating an application that authenticates users via speaker verification and searches for relevant and popular news articles based on information returned from the Bing News Search API. The app can optionally translate news headlines into your language of choice, using the Translator Speech API. From a general overview to specific use cases and hands-on practice, this course gives you what you need to create AI apps with off-the-shelf features in Cognitive Services Speech APIs.

Course outline

Module 1: Bing Speech: Introduction to Microsoft Cognitive Services Bing Speech concepts and best practices, as well as integrating speech recognition and synthesis into applications.
Module 2: Translator Speech: Introduction to Microsoft Cognitive Services Translator Speech concepts and best practices, as well as integrating real-time speech translation into applications.
Module 3: Speaker Recognition: Introduction to Microsoft Cognitive Services Speaker Recognition concepts and best practices, as well as integrating speaker identification and verification into applications.
Module 4: Custom Speech: Introduction to Microsoft Cognitive Services Custom Speech concepts, as well as integrating custom language models and speech recognition into applications.
Module 5: Final Project: Developing a Universal Windows Platform (UWP) application using various aspects of Microsoft Cognitive Speech Services.

Prerequisites

Intermediate coding skills in a C-based language such as C, C#, C++, or Java. The course primarily uses C#; knowledge of C# is recommended but not required.