How Does Youku Build a System for the Smooth Playback of HD Videos?
At the Smart Entertainment Industry Practices sub-forum of the 2020 Apsara Conference, Lei Man, Technical Director of the Youku Playback Center, described how Youku built its playback system from five angles: adaptive bitrate playback, data-driven optimization, transmission protocol optimization, device capability management, and video playback enhancement. The talk was aimed at developers in the audio and video industry.
The following content is a summary of his presentation.
When users watch videos on the Internet, they often encounter the following problems:
- Videos fail to play or stall during playback. Almost everyone has experienced stalling, which depends on the time of day, location, device, and network.
- The screen flickers or goes black, or audio and video drift out of sync during playback. Sometimes playback returns to normal only after the current clip is skipped.
These problems stem mainly from three technical challenges: constantly changing user networks, scarce service resources, and wide variation among the devices on the market. At their root is the mismatch between the demand for high-definition, high-bitrate video and the limited bandwidth of the network.
To overcome these challenges, the Youku Playback Center has designed systematic technical solutions and products. Through data-driven optimization, we schedule and optimize business traffic in real time using data from a massive user base. Adaptive bitrate playback resolves the conflict between high bitrates and bandwidth jitter. At the same time, optimizing the network transmission protocol improves playback on some weak networks.
We designed a device capability management system that applies different playback policies based on the capabilities of each device. For example, it determines whether the video codec is H.265 or H.264 and selects software or hardware decoding accordingly. In the post-processing phase, after video decoding, we designed a video playback enhancement system to improve audio and video quality and to eliminate, as far as possible, “flickering screens and audio and video signals that are out of sync.”
What is adaptive bitrate playback? In Youku, users can select “smart playback mode,” a feature built on adaptive bitrate playback. The bitrate of Blu-ray and 1080p videos is above 2 Mbps, while the bitrate of SD videos is about 300 Kbps. If household Wi-Fi delivers only 500 Kbps, only SD videos play smoothly and Blu-ray videos stall. Normally, 4G and ordinary household Wi-Fi provide enough bandwidth for smooth Blu-ray playback, but once the signal is disturbed or multiple users compete for bandwidth, playback smoothness suffers.
Therefore, the network status inspection module and the downloader that handles download requests report real-time download speed, timeout information, bitrate information, and playback status to the smart playback controller. Based on these inputs, the decision engine determines whether to lower or raise the definition, achieving smooth playback in the short term and an HD experience in the long term.
After real-time processing, the client player reports the results of each definition upgrade or downgrade, across various scenarios, to the server. The server then aggregates and analyzes these reports, updates the optimization policies, and delivers them to the client to support smart decision-making.
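The upgrade and downgrade decision described above can be sketched roughly as follows. This is a minimal illustration, not Youku's decision engine: the `PlaybackReport` fields, the `HD` rung of the ladder, the 0.8 safety margin, and the buffer thresholds are all assumptions made for the example. Only the SD and 1080p bitrates follow the approximate figures quoted earlier.

```python
from dataclasses import dataclass

# Hypothetical bitrate ladder in kbps. SD and 1080p follow the rough numbers
# quoted in the article; the HD rung is invented for illustration.
LADDER_KBPS = {"SD": 300, "HD": 1000, "1080p": 2000}
ORDER = ["SD", "HD", "1080p"]


@dataclass
class PlaybackReport:
    download_kbps: float   # recently measured download speed
    timed_out: bool        # whether the last download request timed out
    buffer_seconds: float  # media currently buffered in the player
    current: str           # definition being played now


def decide_definition(r: PlaybackReport, safety: float = 0.8,
                      low_buffer: float = 5.0, high_buffer: float = 20.0) -> str:
    """Return the definition to use for the next segment."""
    idx = ORDER.index(r.current)
    usable = r.download_kbps * safety  # keep headroom for bandwidth jitter

    # Short-term smoothness: step down one rung on a timeout, a nearly empty
    # buffer, or throughput below the current bitrate.
    if r.timed_out or r.buffer_seconds < low_buffer or usable < LADDER_KBPS[r.current]:
        return ORDER[max(idx - 1, 0)]

    # Long-term HD experience: step up one rung only when the buffer is
    # healthy and the next rung fits within the usable throughput.
    if idx + 1 < len(ORDER):
        nxt = ORDER[idx + 1]
        if r.buffer_seconds >= high_buffer and usable >= LADDER_KBPS[nxt]:
            return nxt
    return r.current
```

Moving one rung at a time is a design choice in this sketch that avoids abrupt quality swings. In the system described here, the thresholds would instead come from the policies that the server updates and delivers to the client.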
There are special cases, such as the live streaming of the Double 11 Global Shopping Festival, when network-wide resources are heavily occupied. In these cases, the smart playback mode can also control traffic in real time and downgrade definition as a disaster recovery measure. After more than a year of R&D and rollout, the smart playback mode became the most-used of all playback modes, and its usage rate in large-scale live streaming exceeded 70%. In the smart playback mode, video played at its highest definition for over 90% of the duration, and the stalling rate was the lowest among all playback modes. These results meet the needs of users who want to watch “HD videos smoothly.”
Video playback over the Internet depends on storing video resources and distributing them close to users. Youku uses Alibaba Cloud CDN, PCDN edge nodes, and other methods to access and download resources. After being encoded in the production center, video files are uploaded to OSS and CDN through push and pull technologies. Nearby users can then download and watch the videos.
There are two main categories of video resources on Youku. Popular videos are small in number but receive a large number of visits. Unpopular videos are all the others, each with a small number of visits, and they typically far outnumber popular videos. All CDN service providers, including Alibaba Cloud, have two types of resources: premium resources and common resources. Premium resources offer high performance, high bandwidth, and high end-to-end stability, at a high price. They are scarce across the entire network and have limited reserves. Therefore, the best scheduling practice is to store popular videos on premium resources and low-traffic unpopular videos on common resources.
For a shorter download response time and a faster download speed, we have designed different scheduling methods for popular and unpopular videos. Popular videos are scheduled locally on premium CDN resources wherever possible, so that bandwidth is stable and the distance to users is short. Unpopular videos are scheduled nationwide, which improves the hit ratio of video files and reduces back-to-origin time and bandwidth consumption. To distinguish unpopular videos from popular ones, we developed a real-time calculation module that determines the popularity of existing Youku videos based on playback requests. By connecting with the operations system, the module can also predict the popularity of videos that are not yet online and schedule them in advance.
In addition to ensuring download stability, allocating popular and unpopular videos appropriately makes use of lower-priced, lower-performance CDN resources, which effectively controls costs.
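The split between popular and unpopular videos can be illustrated with a small sketch. It assumes popularity is a simple request count over a recent window and that predictions from the operations side arrive as a set of video IDs. The function names, the threshold, and the returned fields are hypothetical and do not reflect Youku's actual scheduler.

```python
from collections import Counter


def classify_popular(play_requests, predicted_popular=(), threshold=3):
    """Return the set of video IDs treated as popular.

    play_requests: iterable of video IDs, one per recent playback request.
    predicted_popular: IDs flagged in advance (e.g. titles not yet online).
    threshold: hypothetical minimum request count to count as popular.
    """
    counts = Counter(play_requests)
    popular = {vid for vid, n in counts.items() if n >= threshold}
    popular.update(predicted_popular)
    return popular


def schedule(video_id, user_region, popular_videos):
    """Pick a resource tier and scheduling scope for one request."""
    if video_id in popular_videos:
        # Popular: premium resources, scheduled locally near the user.
        return {"tier": "premium", "scope": user_region}
    # Unpopular: common resources, scheduled nationwide to raise hit ratio.
    return {"tier": "common", "scope": "nationwide"}
```

For example, with requests `["a", "a", "a", "b"]` and a predicted title `"c"`, videos `a` and `c` would be served from premium resources in the user's region, while `b` would be scheduled nationwide on common resources.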
While developing the smart playback mode and the real-time scheduling system for popular and unpopular videos, we found that the original service quality measurement system could not respond in real time and was not backed by enough data. Implementing playback policies requires accurate, real-time measurement of service quality.
Therefore, we collected three types of playback data. The first type is “basic indicators,” including the time to first byte and the slow-speed ratio. The second type is “factor indicators,” including node quality. The third type is “dimensional indicators,” including membership identity and playback scenarios. Through real-time data processing, back-end server faults can be detected and recovered from within seconds, local network faults can be accurately identified, and end-to-end manual response is available 24/7.
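As a rough illustration of how basic indicators might be aggregated along factor and dimensional indicators, the sketch below groups playback samples by CDN node and playback scenario. The sample fields (`node`, `scenario`, `ttfb_ms`, `speed_kbps`) and the 500 kbps slow-speed threshold are assumptions for the example, not Youku's schema or definitions.

```python
from collections import defaultdict
from statistics import mean


def summarize(samples, slow_kbps=500.0):
    """Aggregate playback samples per (node, scenario) group.

    Each sample is a dict with hypothetical keys:
      node, scenario, ttfb_ms (time to first byte), speed_kbps.
    """
    groups = defaultdict(list)
    for s in samples:
        groups[(s["node"], s["scenario"])].append(s)

    report = {}
    for key, rows in groups.items():
        report[key] = {
            # Basic indicator 1: average time to first byte.
            "avg_ttfb_ms": mean(r["ttfb_ms"] for r in rows),
            # Basic indicator 2: share of samples below the slow threshold.
            "slow_ratio": sum(r["speed_kbps"] < slow_kbps for r in rows) / len(rows),
        }
    return report
```

A node whose `slow_ratio` or `avg_ttfb_ms` suddenly rises relative to its peers would be a candidate for the kind of fault detection described above. A production system would compute this over streaming windows rather than an in-memory list.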
In addition to accurate and timely scheduling, we improved the connection success rate and transmission speed between the client player and the CDN. This is especially important on weak networks, where the packet loss rate and latency are high. Generally, video data is downloaded using HTTP over TCP. We introduced the UDP-based QUIC protocol specifically to optimize for weak networks.
First, a weak network prediction module at the business layer finds the best moment to switch to QUIC. Then, latency detection and packet loss estimation are added at the protocol layer. More packets are sent at a higher frequency to compensate for packet loss, so extremely weak networks are improved, although at a high cost.
According to online traffic and the latest playback test results, QUIC-BBR significantly improves transmission speed on weak networks. Strong networks also see a gain of more than 10%, which prepares the ground for higher-bitrate video playback in the future.
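The two ideas above, predicting a weak network at the business layer and sending extra packets to offset loss, can be sketched as follows. The thresholds, function names, and the `1 / (1 - p)` redundancy estimate are illustrative assumptions and are not taken from Youku's implementation or from the QUIC specification.

```python
def is_weak_network(rtt_ms, loss_rate, rtt_limit_ms=300.0, loss_limit=0.05):
    """Hypothetical rule: weak if latency or packet loss crosses a limit."""
    return rtt_ms >= rtt_limit_ms or loss_rate >= loss_limit


def choose_transport(rtt_ms, loss_rate):
    """Switch to QUIC on a predicted weak network, else stay on TCP."""
    return "quic" if is_weak_network(rtt_ms, loss_rate) else "tcp"


def redundancy_factor(loss_rate, max_factor=2.0):
    """Estimate how many packets to send per packet that must arrive.

    With independent loss probability p, about 1 / (1 - p) packets are
    needed on average for one to get through. The cap bounds the extra
    bandwidth spent on extremely weak networks.
    """
    loss_rate = min(max(loss_rate, 0.0), 0.9)  # clamp to a sane range
    return min(1.0 / (1.0 - loss_rate), max_factor)
```

At 20% estimated loss, this estimate sends about 1.25 packets for each one that must arrive, which makes the cost of compensating for loss on very weak networks concrete.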
iOS devices, such as the iPhone and iPad, are relatively consistent. However, the more widely used playback devices, such as Android phones, PCs, OTT boxes, and televisions, vary widely in model, brand, chip, and processing capability. Youku’s video content is also diverse, with resolutions up to 4K and frame rates up to 120 fps. Because of the number of sources, tens of thousands of distinct resolutions can occur. For example, a 4K H.265 video with a non-standard resolution of 3840×1888 plays normally on a mobile phone. However, on an OTT box released a few years ago, the screen may flicker and the video may stall, because the box cannot hardware-decode it and its software decoding performance is low.
We have built a device capability management system. The system records the capabilities and limits of each device, associates content characteristics with devices, and derives the combinations of content that each device can play. With this system, special videos can be played on long-tail devices. For example, if the system detects that a box cannot play videos with a special resolution, the player is instructed to “divide the video height by 64 and round the number” for normal playback.
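A capability lookup combined with the “divide by 64 and round” rule might look like the sketch below. The device records, field names, and the choice to round up are assumptions. The talk does not say which rounding direction is used or how the player applies the result, so the helper exposes the direction as a parameter.

```python
import math

# Hypothetical capability records; real entries would come from the
# device capability management system.
DEVICE_CAPS = {
    "phone-2020": {"hw_codecs": {"h264", "h265"}, "needs_aligned_height": False},
    "ott-box-2015": {"hw_codecs": {"h264"}, "needs_aligned_height": True},
}


def align_height(height, multiple=64, mode="ceil"):
    """Divide the height by `multiple`, round, and scale back up."""
    ratio = height / multiple
    steps = math.ceil(ratio) if mode == "ceil" else math.floor(ratio)
    return max(steps, 1) * multiple


def playback_policy(device, codec, height):
    """Choose a decoder and, if the device requires it, an aligned height."""
    caps = DEVICE_CAPS[device]
    # Use hardware decoding only when the device supports the codec.
    decoder = "hardware" if codec in caps["hw_codecs"] else "software"
    if caps["needs_aligned_height"]:
        height = align_height(height)
    return {"decoder": decoder, "height": height}
```

For the 3840×1888 example, 1888 / 64 = 29.5, so rounding up gives 30 × 64 = 1920 and rounding down gives 29 × 64 = 1856.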
The following section introduces playback enhancement in the post-processing phase after video decoding.
We have developed the OpenRender system to meet the performance and compatibility requirements of developing audio/video enhancements and special effects. This system implements special effects, such as bullet comments, subtitles, and watermarks, and meets the need for enhanced post-processing of audio and video content. It also supports post-processing and rendering of hardware-decoded videos on different platforms. Based on OpenRender, features such as HDR, surround sound, color-weakness mode, eye protection, and speed control can be provided on devices.
Because the display effect depends heavily on the display hardware, we collected and tested screens from all mainstream manufacturers and apply unified correction using the corresponding post-processing algorithms. As a result, films now show the same brightness, contrast, and color across different mobile devices, and the picture is restored as faithfully as possible to convey the director’s intention. Due to device capability limits, TVs have not yet been improved sufficiently. Our next plan is to combine the post-processing capabilities of chip manufacturers and device manufacturers with Youku’s content to achieve the ideal result.