Course Overview
Welcome to the course “Augmented Reality & Video Service Emerging Technologies.” The level of AR (Augmented Reality) and advanced video and multimedia technology built into a product largely determines its value and level of luxury. The objective of this course is to teach the important technologies used in state-of-the-art AR, Skype, and YouTube video and multimedia products and services. This includes the advanced video and real-time multimedia delivery mechanisms based on H.264/MPEG-4 AVC, MPEG-DASH, CDN, and mobile CDN. With knowledge of these core technologies, you can understand the operations used in advanced video and multimedia systems around the world. As business and products become more and more video and multimedia oriented, knowledge of these core technologies will help you lead your company toward becoming a world leader in AR and video multimedia products, services, and business. Thus, I cordially welcome you into the beautiful and powerful world of advanced AR and video multimedia!
Course Syllabus
AR Applications, Products & Business
The first module “AR Applications, Products & Business” focuses on the variety of AR (Augmented Reality) applications, technologies, products, and businesses. The lecture starts with the definitions and characteristics of AR and explains the differences between AR and VR (Virtual Reality). Then the advantages of various AR UI (User Interface) types are covered, including handheld AR displays (e.g., smartphones), AR eyeglasses, and HMDs (Head-Mounted Displays). In addition, AR business models and an analysis of the AR market are described, including the AR/VR headset market share, the worldwide AR/VR headset forecast, the AR/VR market size by segment, and AR/VR market size forecasts.
AR Technology
The second module “AR Technology” focuses on AR (Augmented Reality) technologies, the operation workflow, and cloud support technologies. First, the features of AR technological components and the role of AR feature detection/description technology and the IPD (Interest Point Detection) process are introduced. Second, the advantages of AR cloud cooperative computation and AR cloud offloading are covered. In addition, the types of AR feature extraction descriptors, feature detector requirements, and influencing factors are covered.
SIFT SURF FAST BRIEF ORB BRISK
The third module “SIFT SURF FAST BRIEF ORB BRISK” focuses on the core feature extraction technologies used in AR (Augmented Reality), which include SIFT, SURF, FAST, BRIEF, ORB, and BRISK. As feature extraction is the most important (and the most computationally demanding and time-consuming) procedure of the AR process, the variety of technologies applied in state-of-the-art AR devices is studied in detail in this module. The lectures cover the characteristics of the AR IPD (Interest Point Detection), feature detection, and description schemes, which include SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), FAST (Features from Accelerated Segment Test), BRIEF (Binary Robust Independent Elementary Features), ORB (Oriented FAST and Rotated BRIEF), and BRISK (Binary Robust Invariant Scalable Keypoints).
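To give a feel for why the binary schemes in this module (BRIEF, ORB, BRISK) are attractive on mobile AR devices, the following minimal sketch shows how two binary descriptors can be compared with the Hamming distance, which is simply a count of differing bits. The function names and the toy two-byte descriptors are assumptions made for illustration; real descriptors are much longer and would normally come from a computer vision library rather than being typed by hand.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 wherever the two descriptors disagree.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptor(query: bytes, candidates: list) -> tuple:
    """Return (index, distance) of the candidate closest to the query.

    Brute-force nearest-neighbor search; hypothetical helper for illustration.
    """
    distances = [hamming_distance(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# Toy 16-bit descriptors (illustrative values only).
query = b"\xff\x00"
database = [b"\x00\x00", b"\xff\x01", b"\xf0\x00"]
best_index, best_distance = match_descriptor(query, database)
```

Because the comparison reduces to XOR and bit counting, matching binary descriptors is generally much cheaper than computing Euclidean distances between floating-point descriptors such as those produced by SIFT or SURF.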
Skype, YouTube & H.264/MPEG-4 AVC
The fourth module “Skype, YouTube & H.264/MPEG-4 AVC” focuses on the two most famous video services in the world. Skype is the most widely used video conferencing and VoIP (Voice over IP) application service in the world. It is now included in various Microsoft products, making video and voice communications possible from practically anywhere an Internet connection is available. YouTube is the world’s most widely used video service. The lectures cover the history of Skype and YouTube and the evolution of their video and audio codec technologies. In addition, the lectures cover the details of the state-of-the-art H.264/MPEG-4 AVC video media technology that is currently used by Skype and YouTube.
Video Streaming & MPEG-DASH
The fifth module “Video Streaming & MPEG-DASH” focuses on advanced video streaming techniques and the details of MPEG-DASH (Moving Picture Experts Group - Dynamic Adaptive Streaming over HTTP) technology. First, the differences between Push and Pull based media streaming are covered, along with the operation process of Pull based adaptive media streaming. Second, the types of video frames are studied, along with the structure of the fragmented MP4 file and the GOP (Group of Pictures). Third, HTTP (Hypertext Transfer Protocol) versions 1.0 through 2 and the DASH scheme are explained, followed by examples of the YouTube MPEG-DASH progressive downloading process. Fourth, the standardization of the ISO/IEC 23009-1 based MPEG-DASH specifications, the operation process of the MPEG-DASH MPD (Media Presentation Description) hierarchical data, and MPD decoding and playing methods are covered.
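As a rough illustration of Pull based adaptive streaming, the sketch below shows one simple way a client could choose which bitrate to request for the next segment, based on its measured download throughput. The function name, the 0.8 safety factor, and the bitrate ladder are all assumptions for this example; the rate adaptation logic is left to the client implementation, so this is only one possible rule and not the method of any particular player.

```python
def select_representation(bitrates_bps, throughput_bps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of the throughput.

    bitrates_bps   -- available encodings of the same content (hypothetical)
    throughput_bps -- throughput measured while downloading recent segments
    safety         -- assumed headroom factor to absorb throughput variation
    """
    budget = throughput_bps * safety
    affordable = [b for b in sorted(bitrates_bps) if b <= budget]
    # Fall back to the lowest bitrate when nothing fits the budget.
    return affordable[-1] if affordable else min(bitrates_bps)


# Hypothetical bitrate ladder: 0.5, 1, and 3 Mbit/s.
ladder = [500_000, 1_000_000, 3_000_000]
choice = select_representation(ladder, throughput_bps=2_000_000)
```

With 2 Mbit/s of measured throughput and the assumed 0.8 safety factor, the budget is 1.6 Mbit/s, so the 1 Mbit/s encoding is selected. The client repeats this decision for each segment, which is what makes the stream adapt to changing network conditions.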
CDN Video Streaming Technology
The sixth module “CDN Video Streaming Technology” focuses on the necessity and operations of advanced video service CDN (Content Delivery Network) technologies. First, the CDN structure and the operation process of CDN hierarchical content delivery are covered. Second, the CDN market value, market size, and service providers are studied, along with the roles of the Telcos, CDN providers, and operators, and the market regions. Third, details on CDN cooperative caching and content routing are covered, including the Query based, Digest based, Directory based, Hashing based, and Semi-hashing based schemes. Fourth, content aging and updating operations are covered, along with CDN popularity prediction and content update techniques (with operational examples of the LRU (Least Recently Used) and LFU (Least Frequently Used) strategies). In addition, the differences between CDN and Mobile CDN technology are discussed.
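The difference between the LRU and LFU strategies mentioned above can be seen in a small simulation. The following is a minimal sketch, assuming a cache that holds a fixed number of content items (at least one) and evicts one item on every miss once it is full. The class names and the request sequence are invented for illustration and do not describe how any specific CDN is implemented.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the item that was requested least recently."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be >= 1
        self.items = OrderedDict()

    def request(self, key):
        """Return True on a cache hit, False on a miss (the item is then cached)."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return True
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # drop the least recently used item
        self.items[key] = True
        return False


class LFUCache:
    """Evicts the item with the fewest requests (ties: oldest inserted)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be >= 1
        self.counts = {}

    def request(self, key):
        """Return True on a cache hit, False on a miss (the item is then cached)."""
        if key in self.counts:
            self.counts[key] += 1
            return True
        if len(self.counts) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)
            del self.counts[victim]
        self.counts[key] = 1
        return False


# Hypothetical request sequence for content items "a", "b", and "c".
requests = ["a", "a", "b", "c", "a"]
lru, lfu = LRUCache(2), LFUCache(2)
lru_hits = [lru.request(r) for r in requests]
lfu_hits = [lfu.request(r) for r in requests]
```

In this sequence, LRU evicts "a" when "c" arrives because "a" is the least recently requested item, so the final request for "a" is a miss. LFU keeps "a" because it has been requested twice, so the final request is a hit. Which strategy performs better in practice depends on how content popularity changes over time.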
AR Smartphone Project
The seventh module “AR Smartphone Project” focuses on two AR smartphone projects using the IKEA Catalog and Google Translate applications. First, the operation features of the IKEA Catalog and Google Translate AR applications are studied in the project. Second, the limitations of these AR applications are tested to see how brightness levels, lighting angles, the shape and size of the area and object, distance, font and texture types, and language translation types influence the accuracy of the AR operations.