In March 2017, Alibaba Cloud’s Video Cloud team announced its big video strategy, marking its entry into the video field. In the year and a half between that announcement and The Computing Conference 2018 in Hangzhou this September, the team released a number of products and solutions and advanced both its technology and its services. Today’s Video Cloud is clearer, smarter, more stable, smoother, and more accessible in real time.
In the Big Video Session at the Computing Conference 2018 in Hangzhou, Zhu Zhaoyuan, General Manager of Alibaba Cloud’s Video Cloud team, shared the team’s exploration of the video cloud field over the past year. He defined Video Cloud 2.0 in terms of four aspects: high definition, intelligence, stability and smoothness, and access.
Technology and Open Collaboration Bring a Clearer and Ultimate Video Cloud
No discussion of video can avoid the topic of definition. Video frames on display devices are composed of pixels, and an 8K frame contains 16 times as many pixels as a 1080p full-HD frame. At a normal viewing distance, that pixel density exceeds what the human retina can resolve, so the frames look just like the real world. Ever-higher definition does more than improve entertainment and audiovisual content; it also makes industrial-grade video applications possible in fields such as manufacturing, medical imaging, and urban management.
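The 16x figure follows from the standard frame sizes. The check below is a minimal sketch that assumes "8K" means 8K UHD (7680x4320), "4K" means 4K UHD (3840x2160), and "1080p" means 1920x1080; the function names are illustrative only.

```python
# Pixel counts for common frame sizes (width x height, in pixels).
# Assumption: "8K" is 8K UHD (7680x4320) and "4K" is 4K UHD (3840x2160).
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def pixel_count(name: str) -> int:
    """Total number of pixels in one frame of the named format."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(a: str, b: str) -> float:
    """How many times more pixels format `a` has than format `b`."""
    return pixel_count(a) / pixel_count(b)


if __name__ == "__main__":
    for name in RESOLUTIONS:
        ratio = pixel_ratio(name, "1080p")
        print(f"{name}: {pixel_count(name):,} pixels ({ratio:.0f}x 1080p)")
```

Doubling both width and height quadruples the pixel count, so two doublings from 1080p to 8K give the factor of 16.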
“The constant pursuit of video definition is the driving force of our Video Cloud. We have walked through the analog video era, witnessed the comprehensive digitalization of video, and evolved from 480p to 1080p. Currently, Alibaba Cloud’s Video Cloud fully supports 4K live video broadcast and provides a full set of 4K live broadcast solutions, including real-time transcoding, recording, and distribution, to help with the network-wide upgrade to 4K. I am also very happy to share with you that we will work with our partners to push the ultimate HD experience forward,” said Zhu.
In the afternoon’s forum, Alibaba Cloud’s Video Cloud team released and demonstrated its 8K end-to-end Internet live broadcast solution. The team, which has joined with several enterprises to establish an 8K industry alliance, also demonstrated China’s first 5G+8K remote medical consultation, with the aim of advancing the commercial use of 8K. In addition, to let more users experience 8K technology firsthand, the team set up a demonstration of the 8K end-to-end live broadcast solution in the exhibition area. This was the first time in China that the entire 8K live broadcast process was shown to the public.
AI Empowers a Smarter Video Cloud
With substantial breakthroughs in AI technology, ideas once confined to science fiction movies are becoming real applications in business systems, helping customers improve efficiency, reduce costs, and even replace manual labor. In the video field, Alibaba Cloud has likewise committed to applying AI across the whole pipeline of video production, review, management, distribution, and marketing. During the session, Zhu announced three video AI products intended to help customers take another big step toward intelligent video.
Intelligent Video Production Solution
The intelligent production solution links AI with video technologies such as the cloud caster, cloud-based editing, and audio and video codecs. AI is applied in every phase of video production, replacing manual labor as well as large, complex production equipment, to achieve fast, high-quality, and secure production.
We improve the quality of produced videos in three steps. First, during on-device production, the short video SDK and the interactive live streaming SDK capture video with a combination of AI special effects. Then, cloud-based broadcasting and editing reproduce and process the videos efficiently and intelligently. Finally, we apply our audio and video media processing capabilities, including image quality restoration and frame rate up-conversion (FRUC). In addition, video AI performs analysis, identification, and review in place of manual labor, which enables fast production of high-quality videos.
At this year’s Computing Conference, the data operations center used Alibaba Cloud’s intelligent video production solution for intelligent media production and intelligent highlight editing of the surfing and 3x3 basketball games. Within a very short time, the solution met demanding requirements for production efficiency and quality, integrated cloud-based broadcasting, cloud-based editing, and video AI, captured the most exciting moments for speakers and participants, and supported real-time download and sharing.
Video DNA: Bringing Order to the Online Distribution of Audiovisual Programs
Alibaba Cloud has a range of technologies for improving the quality and production efficiency of audio and video content. As these technologies advance, videos will become more polished, and copyright awareness among video creators and across society will gradually grow.
“We have learned from copyright agencies that the number of video copyright registrations has increased year over year for the past two years. In the past, copyright registrations were mainly for films and TV series, but now more short videos are being registered. Moreover, we can see that the government and related institutions are gradually stepping up copyright supervision. In this context, we have developed and released Video DNA, which combines the video processing capabilities of Video Cloud with AI technology to quickly and accurately identify video duplication or piracy in massive video libraries. This work cannot be done by manual review. As a technology developer and a member of society, we see this as a way to help original creators defend their copyright, help platforms maintain a healthy ecosystem, and help society improve copyright awareness,” said Zhu.
Video DNA serves as a unique identifier for a video. It is unique: the probability that two different videos share the same DNA is less than one in ten million, which is close to zero. It is also stable: it does not change when an audio or video file is converted to another format, spliced, cut, compressed, rotated, or overlaid with a logo. Various technical means guarantee the precision and recall of detection for infringing or duplicated videos and video clips.
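The source does not describe how Video DNA is computed. To illustrate the general idea of a content-based fingerprint that survives re-encoding, here is a minimal sketch of a classic average hash with Hamming-distance matching. It is a stand-in technique, not Alibaba's algorithm, and names such as `average_hash` and `is_probable_copy` are hypothetical. Frames are modeled as small grayscale grids so the example needs no third-party libraries.

```python
from typing import List

# A grayscale frame: rows of luma values in the range 0-255.
Frame = List[List[int]]


def average_hash(frame: Frame) -> int:
    """Return one bit per pixel: 1 where the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_probable_copy(a: Frame, b: Frame, max_distance: int = 2) -> bool:
    """Treat two frames as the same content if their hashes are nearly identical.

    The threshold of 2 bits is an arbitrary value chosen for this toy example.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


if __name__ == "__main__":
    original = [
        [10, 20, 200, 210],
        [15, 25, 205, 215],
        [220, 230, 30, 40],
        [225, 235, 35, 45],
    ]
    # Simulate a re-encoded copy with a uniform brightness shift.
    brightened = [[min(255, p + 12) for p in row] for row in original]
    print(is_probable_copy(original, brightened))
```

Because the hash compares each pixel with the frame's own mean, a uniform brightness shift leaves it unchanged. A simple hash like this does not survive rotation, cropping, or splicing, so a production system that claims those properties would need considerably more robust features.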
“Of course, Video DNA applies to more than copyright protection and originality recognition. We once helped a customer whose library of 300,000 videos had a repetition rate as high as 29.6%. Video DNA can help a platform remove redundant data and improve users’ viewing experience through personalized recommendations. Before using Video DNA, customers often don’t know the true repetition rate of the videos on their platforms,” said Zhu.
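The article does not define "repetition rate". One plausible reading, assumed here, is the share of videos that are redundant copies of some other video in the library. Under that assumption, a rate of 29.6% across 300,000 videos corresponds to about 88,800 redundant files. The sketch below computes the rate from a list of fingerprints; `repetition_rate` is a hypothetical helper, and any hashable value can stand in for a fingerprint.

```python
from collections import Counter
from typing import Hashable, Iterable


def repetition_rate(fingerprints: Iterable[Hashable]) -> float:
    """Fraction of items that duplicate another item with the same fingerprint.

    Assumed definition: (total videos - distinct videos) / total videos.
    """
    counts = Counter(fingerprints)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    redundant = total - len(counts)
    return redundant / total


if __name__ == "__main__":
    # Five uploads, three distinct pieces of content: two uploads are redundant.
    library = ["a", "b", "a", "c", "a"]
    print(f"repetition rate: {repetition_rate(library):.1%}")

    # The customer case from the text, under the assumed definition.
    print(f"redundant videos: {round(300_000 * 0.296):,}")
```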
What’s more, Video DNA can also be applied in live broadcast scenarios in conjunction with the cloud caster to monitor and replace advertisements in real time, helping build a dynamic advertisement revenue sharing ecosystem.
Release of Intelligence Vision
As AI has proliferated, various industries are now using it to solve image and video problems. However, customization requirements remain, and training custom models that meet business requirements takes a great deal of manpower and material cost. To solve these problems, Alibaba Cloud’s Video Cloud team officially released “Intelligence Vision”, which helps enterprises without algorithm expertise train business-specific models from a small number of training samples in a very short time.
Intelligence Vision can be applied in the following fields:
- In the video field, it is common to generate tags through video content recognition and then use those tags for search, recommendation, or ad placement. Different users have different video analysis requirements, so custom models must be trained for their specific content.
- In the new retail field, operations depend entirely on the ability to identify products. New products constantly appear in stores, so a wide range of products must be identified quickly.
- In the field of security monitoring, the industry already has many models for recognizing people and vehicles, but a large number of other objects still need to be identified. For example, urban management projects may need to detect whether street vendors are obstructing the road. Such requirements are common in the surveillance industry, but no single off-the-shelf model can solve them, so they must be addressed through customization.
To solve these customization problems, Intelligence Vision provides a GUI through which users without algorithm expertise can complete the whole model training process, from data upload and labeling to training and prediction, with a single click, training business-specific models efficiently from a small amount of data. The product is currently in beta. If you are interested, you can fill out an application on Retina to apply for free use.
Strong technology underpins the product. First, a distributed engine provides one-stop services from data modeling to deployment. Second, to improve training efficiency, we use transfer learning to make full use of Alibaba’s data, allowing users to train business-specific models with less business data. Third, when samples are few, we use the industry-leading Auto Model Search technology, which applies AI to model parameter tuning, to give users higher model precision and recall. Finally, to let users quickly verify how well models perform, we use data augmentation when training models, helping users determine whether the models meet their business requirements.
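Precision and recall, the two quality metrics named above, can be checked on a held-out labeled set when deciding whether a trained model meets a business requirement. The sketch below uses the standard definitions for a binary classifier; the function name and the sample labels are made up for illustration and are not part of Intelligence Vision.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Tuple[float, float]:
    """Compute (precision, recall) for binary labels.

    precision = TP / (TP + FP): of everything flagged, how much was correct.
    recall    = TP / (TP + FN): of everything that should be flagged, how much was found.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical ground truth and predictions for eight test images.
    truth = [True, True, True, False, False, False, True, False]
    preds = [True, True, False, True, False, False, True, False]
    p, r = precision_recall(truth, preds)
    print(f"precision={p:.2f} recall={r:.2f}")
```

The two metrics trade off against each other, which is why both are usually reported together.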
“While constantly improving our video AI products and capabilities, we keep asking ourselves what AI can bring to the industry. I believe that real-time efficiency, stability and security, added value, and intelligence are the four key benefits that AI technology brings to the entire industry,” said Zhu.
More Stable and Smoother Video Cloud
Tempered by the live broadcast boom of 2016, the short video rush of 2017, and successive 11.11 Global Shopping Festivals and New Year’s Eve galas of major TV stations, Alibaba Cloud’s Video Cloud can now provide stable and smooth service for Internet live broadcasts of large-scale sports events. In the past few months, it has provided video services for the World Cup in Russia and the Asian Games in Jakarta, delivering a very-large-scale, ultra-high-definition solution that draws on many of our technologies and capabilities.
The primary element is our CDN distribution capability, which can handle tens of millions of concurrent requests. After years of development, Alibaba Cloud’s Video Cloud has built a very large-scale media processing and distribution infrastructure. Video Cloud’s products and technologies connect 1 billion devices, and 100 EB of data is distributed annually through its infrastructure.
During the World Cup, Alibaba Cloud supported as many as 24 million concurrent viewers of a single match, drawing on its 1,500+ CDN nodes worldwide and a reserved bandwidth capacity of 120 Tbit/s. The intelligent CDN scheduling system and full-link disaster recovery measures also helped keep the links stable and smooth and the live broadcast quality high.
In addition to letting so many people watch the World Cup stably and smoothly, we also needed to ensure ultra-high-definition video quality. Alibaba Cloud used image quality restoration and 50-frame ultra-clear technology to produce finer and clearer 30-frame videos during the World Cup. Combined with the media processing and AI capabilities described earlier, this helps more customers process video data easily and at low cost, so that video capabilities integrate seamlessly into and support their business.
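A back-of-the-envelope calculation puts these capacity figures in perspective. It is only a sketch under stated assumptions: the 120 Tbit/s reserve is treated as if it were all available to one match (an upper bound, since the reserve serves all traffic), exabytes are decimal (10^18 bytes), and the 100 EB of annual distribution is spread evenly over a 365-day year. The helper names are illustrative.

```python
TBIT = 10**12  # bits in a terabit
MBIT = 10**6   # bits in a megabit
EB = 10**18    # bytes in a decimal exabyte (assumption: decimal, not binary, units)


def per_viewer_mbps(capacity_tbps: float, viewers: int) -> float:
    """Bandwidth available per concurrent viewer if capacity is shared evenly."""
    return capacity_tbps * TBIT / viewers / MBIT


def average_tbps(bytes_per_year: float) -> float:
    """Average throughput implied by an annual data volume, assuming uniform load."""
    seconds_per_year = 365 * 24 * 3600
    return bytes_per_year * 8 / seconds_per_year / TBIT


if __name__ == "__main__":
    print(f"per viewer: {per_viewer_mbps(120, 24_000_000):.1f} Mbit/s")
    print(f"annual average: {average_tbps(100 * EB):.1f} Tbit/s")
```

Under these assumptions, the reserve works out to 5 Mbit/s per viewer at 24 million concurrent viewers, and 100 EB per year corresponds to an average of roughly 25 Tbit/s, well below the stated reserve. Real traffic is bursty, which is the usual reason for keeping headroom above the average.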
More Real-Time Video Cloud that Links the World Together
In the future, enterprises and users will be ever more connected through video. Enterprises that build audio and video communication services themselves face huge investments in resources and technology, plus ongoing maintenance costs, for network node construction, stability assurance, interconnection across countries and carriers, algorithms that counter poor network conditions, noise reduction and echo cancellation, and real-time audio and video codecs.
Zhu officially released the audio and video communication product RTC (short for Real-Time Communication). He said: “Our goal is to build a global real-time audio and video communication infrastructure. After two years of development, application, and verification within Alibaba Group, RTC is now an enterprise-class service that connects more than 100 million users worldwide every day.”
RTC uses self-developed intelligent scheduling, network adaptation and weak-network countermeasure algorithms, the audio 3A algorithms (echo cancellation, noise suppression, and gain control), the self-developed ARWNT algorithm, and other technologies to improve network resource utilization by 30% and reduce bandwidth consumption by 10%. Even at a packet loss rate of 30%, it still provides smooth video calls, allowing partners, enterprises, and entrepreneurs to focus on their business.
According to Zhu, to let more customers benefit from video technology, Alibaba Cloud will keep offering promotions such as 20% off all prepaid Video Cloud products, 40% off CDN resource packages, and a 400-minute free trial of the cloud caster, in the hope that more users can easily access video capabilities.
In closing, Zhu said: “The medium of human information transmission has evolved from text to pictures and then to video. Alibaba Cloud’s Video Cloud has always adhered to the mission of making information sharing simple.”
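The article does not explain how RTC's weak-network countermeasures or the ARWNT algorithm work. As a generic illustration of one widely used building block for loss resilience, forward error correction, the sketch below adds a single XOR parity packet to a group of equal-length packets so that the receiver can rebuild any one lost packet without a retransmission. This is a textbook technique chosen for illustration, not a description of RTC, and all names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: the XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[Optional[bytes]]:
    """Rebuild the group when at most one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair at most one lost packet per group")
    if not missing:
        return list(received)
    present = [p for p in received if p is not None]
    # XOR-ing the parity with every surviving packet leaves the missing one.
    repaired = reduce(xor_bytes, present, parity)
    result = list(received)
    result[missing[0]] = repaired
    return result


if __name__ == "__main__":
    group = [b"AAAA", b"BBBB", b"CCCC"]
    parity = make_parity(group)
    # The second packet is dropped in transit.
    print(recover([b"AAAA", None, b"CCCC"], parity))
```

A single parity packet tolerates only one loss per group, so coping with loss rates as high as 30% would require stronger codes, retransmission, or bitrate adaptation on top of a scheme like this.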