By Alibaba Cloud ApsaraVideo Team
Conventional livestreaming technology can no longer meet the increasing requirements for video interaction. To address this challenge, in 2019, Alibaba Cloud and Taobao Live jointly unveiled the ultra-low-latency livestreaming solution Real-Time Streaming (RTS). This solution is built on WebRTC technology and the UDP transmission protocol and supports large-scale concurrent livestreaming with an end-to-end latency of less than 1 second. Deployed on Alibaba Cloud CDN nodes, the RTS service reuses CDN nodes and network resources to strike a balance among access costs, node coverage, and carrying capacity.
According to QuestMobile’s “2020 Special Report on the Fight Against COVID-19 in China’s Mobile Livestreaming Industry”, during the pandemic, livestreaming has become the primary way people enjoy leisure and recreational activities, stay informed, and attend classes. Some industries that used to be highly dependent on offline interaction are now fighting to survive by turning to livestreaming. Some retailers have switched to livestreaming online to promote sales. Government departments have launched livestreaming programs to attract investment and promote agricultural products. E-commerce businesses have been increasing their investment in livestreaming capabilities during the pandemic. Now, they can offer platforms and services for livestreaming sales of slow-moving agricultural products and cloud-based sales of vehicles and real estate. They are also providing brick-and-mortar stores with livestreaming services to promote retail sales.
In addition to the online celebrities who used to dominate the various livestreaming platforms, a wide range of organizations and traditionally offline businesses have begun to offer a variety of livestream offerings, including virtual museum services, cloud dancing sessions, and cloud tours. This has attracted a large number of young Internet users. Livestreaming is no longer only for creating online, recreational content. It has gradually evolved into a basic business tool that is closely integrated with business scenarios.
The real-time and interactive characteristics of livestreaming make it a new medium for information transmission and interactive communication, but is it real-time and interactive enough? Conventional livestreaming technology has a very long latency. When a viewer posts a comment, it takes anywhere from 5 to 10 seconds before the broadcaster can see the comment on the screen and respond. Here are a few examples of the embarrassing situations this causes:
- In a livestreamed class, a student may ask a question, but the teacher has moved on to the next topic before seeing the question.
- In a livestreamed session hosted by an e-commerce merchant, the broadcaster may seem to be ignoring viewers' questions about the products when they simply have not seen the questions yet.
- A fan who gives a reward may not receive thanks from the broadcaster.
- Is there any point to livestreaming a sports game if you know when a team scores from the cheers before you see it happen on-screen?
High latency in livestreaming scenarios adversely impacts the interactive experience and hinders the commercial application of this technology. This is particularly important in the e-commerce scenario. Posting and replying to the comments are essential to the interaction between the audience and the broadcaster in a livestreaming session. The broadcaster’s real-time interaction or feedback can determine the level of activity and transaction rate during the session.
Conventional livestreaming solutions, such as RTMP push streams and FLV, RTMP, or HLS playback, usually have a latency of about 5 to 10 seconds. The latency is mainly caused by the following factors.
- The stream push buffer includes the buffer for analog-to-digital conversion from the sensors to the screen, the audio/video encoding buffer, and the buffer from the output screen to the network. Of these, encoding is the main cause of push-side latency. It is determined by the encoder parameter settings of the push software, such as the use of B-frames, frame reference relationship settings, and compression performance. Take OBS as an example. When the output settings are as follows, the latency is less than 1 second.
The performance with the same settings varies across operating systems. The latency is several hundred milliseconds on macOS, but as low as 50 ms on Windows (according to real user cases). Broadcasters usually have a stable push network connection, and some use a leased line for a highly reliable connection.
- CDN link latency is partly caused by network transmission latency. Network transmission in a CDN goes through four segments. Assume that each segment adds about 20 ms of latency, so the four segments together add roughly 80 to 100 ms. In addition, using RTMP frames as the transmission unit means that each node must receive at least one complete frame before forwarding the frame to downstream nodes. Finally, the CDN implements a packet delivery policy to improve concurrency, which increases latency to a certain extent. Network jitter makes the latency worse: because a reliable transmission protocol delivers data in order, a delayed or lost packet blocks all subsequent data until it has been retransmitted.
- The playback buffer is the main source of end-to-end latency. Public network environments vary a lot in quality. If network jitter occurs in any stage, such as stream push, CDN transmission, or playback and reception of streaming media, the performance of the streaming client is adversely affected. To cope with jitter in the upstream stages, client media players usually implement a media buffer of about 6 seconds.
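The three contributions above can be added up as a rough latency budget. The following is a minimal sketch, not a measurement: the 500 ms push-side figure is an assumed value (the text only says OBS can stay below 1 second), while the 20 ms per CDN segment, the four segments, and the 6-second player buffer come from the discussion above. The `LatencyBudget` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Rough end-to-end latency model for a conventional livestream."""
    push_ms: int           # encoding and push-side buffering (assumed value)
    cdn_segment_ms: int    # transmission latency per CDN segment
    cdn_segments: int      # number of CDN segments on the path
    player_buffer_ms: int  # media buffer held by the player

    def cdn_ms(self) -> int:
        return self.cdn_segment_ms * self.cdn_segments

    def total_ms(self) -> int:
        return self.push_ms + self.cdn_ms() + self.player_buffer_ms

    def player_share(self) -> float:
        """Fraction of the total latency spent in the player buffer."""
        return self.player_buffer_ms / self.total_ms()


conventional = LatencyBudget(
    push_ms=500,            # assumption: somewhere below the 1 s OBS figure
    cdn_segment_ms=20,
    cdn_segments=4,
    player_buffer_ms=6000,
)

print(f"total: {conventional.total_ms()} ms")
print(f"player buffer share: {conventional.player_share():.0%}")
```

Under these assumptions the player buffer accounts for roughly nine tenths of the total, which is why reducing the buffer is the most effective lever.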
Alibaba Cloud RTS
Conventional livestreaming technology can no longer meet the increasing requirements for video interaction. To address this challenge, in 2019, Alibaba Cloud and Taobao Live jointly unveiled the ultra-low-latency livestreaming solution Real-Time Streaming (RTS). This solution is built on WebRTC technology and the UDP transmission protocol and supports large-scale concurrent livestreaming with an end-to-end latency of less than 1 second. Deployed on Alibaba Cloud CDN nodes, the RTS service reuses CDN nodes and network resources to strike a balance among access costs, node coverage, and carrying capacity. After more than a year of use in production, the overall experience and service quality of RTS have steadily improved.
Technical Architecture of Alibaba Cloud RTS
The following is the transmission diagram of the architecture:
The architecture and the transmission process shown in the preceding diagrams are almost identical to those of today's livestreaming system. The only difference is on the playback link between the client and the CDN node, where we replaced the RTMP protocol with the RTP protocol and the TCP protocol with the UDP protocol. The RTS service has received both service and node upgrades. We monitor and optimize the end-to-end livestreaming metrics and have implemented a series of underlying optimizations in the intelligent scheduling system and in the policies for network congestion control, weak network resistance, and buffering. These measures make RTP over UDP more resistant to packet loss and make streaming in the media player more stable than with RTMP over TCP. This allows the player to reduce the buffer from 6 seconds to 1 second while keeping the overall latency at about 1 to 1.5 seconds.
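To see why an unordered transport can get by with a smaller buffer, consider a toy model of a single lost packet. This is a simplified sketch and not the RTS algorithm: the packet interval, the one-way delay, and the retransmission delay are assumed values, and all function names are hypothetical. With in-order delivery (as in TCP), every packet behind the lost one waits for the retransmission; with per-packet delivery (as in RTP over UDP), only the lost packet is late.

```python
def arrival_times(count, interval_ms, delay_ms, lost, retransmit_ms):
    """Arrival time of each packet; lost packets arrive after a retransmission."""
    times = []
    for seq in range(count):
        arrival = seq * interval_ms + delay_ms
        if seq in lost:
            arrival += retransmit_ms  # extra wait for the retransmitted copy
        times.append(arrival)
    return times


def in_order_delivery(arrivals):
    """Time each packet reaches the application when order must be preserved."""
    delivered = []
    latest = 0
    for arrival in arrivals:
        latest = max(latest, arrival)  # cannot overtake an earlier packet
        delivered.append(latest)
    return delivered


def stalled_count(arrivals, delivered):
    """Packets that had arrived but were held back behind an earlier one."""
    return sum(1 for a, d in zip(arrivals, delivered) if d > a)


# Assumed values: 10 packets, 20 ms apart, 40 ms one-way delay,
# packet 2 lost once and retransmitted 200 ms later.
arrivals = arrival_times(10, 20, 40, lost={2}, retransmit_ms=200)
ordered = in_order_delivery(arrivals)

print("held back with in-order delivery:", stalled_count(arrivals, ordered))
print("held back with per-packet delivery:", stalled_count(arrivals, arrivals))
```

In this model, one loss delays every later packet that arrives before the retransmission under in-order delivery, so the player needs a deeper buffer to hide the stall.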
How can I access RTS?
The RTS service currently provides two access services:
1. Upgraded network module with the WebRTC open protocol.
For users who develop their own media players or use open-source media players, Alibaba Cloud provides a solution based on the standard WebRTC protocol. You add an RTS streaming domain name to the existing livestreaming service, which gives you one stream push method and two stream pull methods. No modification is needed on the stream push side; you only need to upgrade the player's network module to pull ultra-low-latency streams for playback. This approach makes the underlying network connection more transparent and gives the client more independence and control.
The preceding diagram shows the architecture of a common player. The player uses FFmpeg to open the network connection, reads audio and video frames, and puts them in the player buffer. Then, it decodes, synchronizes, and renders the audio and video data in sequence.
The lower part of the above diagram shows the overall architecture when access to RTS has been configured. If the RTS plug-in is added to FFmpeg to support private protocols and the buffer of the player is set to 1 second, the audio and video frames output by FFmpeg are directly decoded in the decoder for synchronization and rendering.
The RTS network SDK also provides APIs that allow the player to access Alibaba Cloud's low-cost, multi-protocol, and low-latency network transmission infrastructure. The SDK features user-friendly APIs and a stable design, and it significantly optimizes audio and video synchronization, quick loading, fluency, and other metrics. The APIs are exposed as an FFmpeg demux plug-in, so you can integrate the service into applications in the same way as you call other FFmpeg demux plug-ins. Non-FFmpeg APIs are also provided.
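The effect of the player buffer size on startup can be pictured with a small playout buffer. This is an illustrative sketch only: it does not use the RTS SDK or FFmpeg, the `PlayoutBuffer` class is hypothetical, and the 40 ms frame interval (25 frames per second) is an assumed value. Playback starts once the buffered span of presentation timestamps reaches the target.

```python
from collections import deque


class PlayoutBuffer:
    """Holds frame timestamps and starts playback once enough is buffered."""

    def __init__(self, target_ms):
        self.target_ms = target_ms
        self.frames = deque()
        self.started = False

    def buffered_ms(self):
        """Span between the oldest and newest buffered timestamps."""
        if not self.frames:
            return 0
        return self.frames[-1] - self.frames[0]

    def push(self, pts_ms):
        self.frames.append(pts_ms)
        if not self.started and self.buffered_ms() >= self.target_ms:
            self.started = True

    def pop(self):
        """Next frame to decode, or None while still filling the buffer."""
        if self.started and self.frames:
            return self.frames.popleft()
        return None


def frames_until_start(target_ms, frame_interval_ms=40):
    """Number of frames that must arrive before playback can begin."""
    buffer = PlayoutBuffer(target_ms)
    count = 0
    while not buffer.started:
        buffer.push(count * frame_interval_ms)
        count += 1
    return count


print("frames needed with a 6 s buffer:", frames_until_start(6000))
print("frames needed with a 1 s buffer:", frames_until_start(1000))
```

A smaller target means fewer frames are held back before decoding, which lowers latency but leaves less room to absorb jitter; that trade-off is why the transport has to be more resilient before the buffer can shrink.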
2. Integration with Alibaba Cloud RTS player
This method enables you to implement the RTS service more quickly. You simply add an RTS streaming domain name to the existing livestreaming service and integrate the ApsaraVideo Player SDK. The client can then automatically recognize which playback method to use based on the URL parameters and implement the RTS service. ApsaraVideo Player is a universal player SDK that is fully integrated with the ApsaraVideo service. In addition to on-demand video and livestreaming playback functions, it supports various business scenarios, such as encrypted video playback, secure download, video quality switching, and short video streaming, and provides users with a simple, fast, secure, and stable video playback service.
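How a client might pick a playback path from the URL can be sketched as follows. This is an assumption-laden illustration and not the ApsaraVideo Player SDK's logic: the text above says only that the URL determines the method, so dispatching on the scheme and file extension is a guess, the `artc` scheme name for RTS streams is an assumption, and the domain names and the `choose_pipeline` function are hypothetical.

```python
from urllib.parse import urlparse


def choose_pipeline(url):
    """Map a playback URL to a pipeline name (hypothetical dispatch rules)."""
    parsed = urlparse(url)
    scheme = parsed.scheme.lower()
    path = parsed.path.lower()

    if scheme == "artc":  # assumed scheme for RTS streams
        return "rts"
    if scheme == "rtmp":
        return "rtmp"
    if scheme in ("http", "https"):
        if path.endswith(".flv"):
            return "flv"
        if path.endswith(".m3u8"):
            return "hls"
    raise ValueError(f"unsupported playback URL: {url}")


# Hypothetical URLs for the same stream on two streaming domains.
for url in (
    "artc://rts.example.com/live/room1",
    "https://pull.example.com/live/room1.flv",
    "https://pull.example.com/live/room1.m3u8",
):
    print(url, "->", choose_pipeline(url))
```

Keeping the choice in one function means the rest of the player (decode, synchronize, render) stays the same regardless of which transport delivered the frames.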
Based on core metrics, Alibaba Cloud’s RTS service delivers outstanding livestreaming performance. Given the same frame lag rate, the RTS service can reduce the livestreaming latency by 75%. Without requiring any improvement in the network latency or packet loss rate, RTS significantly improves the livestreaming experience by improving the livestream success rate, the frame lag rate, the quick launch rate, and other indicators. The RTS service is extensively used in Taobao Live to reduce latency and improve user interaction. Real-world online applications show that RTS significantly promotes transactions during e-commerce livestreaming sessions, with the UV conversion rate increasing by 4% and the GMV increasing by 5%. Currently, many well-known clients in education, e-commerce, and game casting have used RTS to launch livestreaming businesses.