See How Alibaba Cloud Helped Baijiayun Scale Out to Dozens of Times Its Original Capacity in Only Three Days
By Anlin from Xiaozhangbang.
Due to the outbreak of the coronavirus disease (COVID-19), students and teachers who should have returned to their campuses had to turn to online education. The sudden spike in traffic created by this transition posed a great challenge for the online education industry in Mainland China.
Baijiayun, which has been serving educational enterprises for years in China, was no exception to this challenge. Baijiayun is an enterprise dedicated to providing one-stop cloud classroom solutions to educational institutions.
During the epidemic, many educational institutions asked Baijiayun to build online cloud classrooms, adding up to far more demand than the company typically handles. At the same time, more and more offline educational institutions launched online education services, and web traffic related to online education and e-learning surged.
To respond to the call of China’s Ministry of Education to “keep students learning while classes are suspended,” employees at Baijiayun shortened their Lunar New Year holiday to start working from home on January 26 and then at the office from January 31 onwards.
No educational enterprise could have expected such an explosive increase in demand in such a short period of time. Li Gangjiang, the CEO of Baijiayun, said that Baijiayun’s business volume grew to dozens of times its previous level. He also noted that it was extremely difficult to scale out quickly and to do so in a way that was imperceptible to the company's customers.
Fortunately, Baijiayun’s exploration of an agile cloud architecture had prepared them well for such high-concurrency scenarios. Before the epidemic broke out, Baijiayun had optimized its container cluster architecture and planning with the help of Alibaba Cloud. Backed by the core solutions of Alibaba Cloud Container Service for Kubernetes and ECS Bare Metal Instances, Baijiayun had an easier time than most at implementing dynamic scaling and effective management of their online resources.
Using Containerization Technology to Cope with Large Traffic Peaks
Baijiayun was fortunate to have completed a container-based reconstruction of their services before the epidemic. Doing so helped them take on the situation much better than their competitors. Other online educational enterprises that do not use containers have to add multiple stacks of resources when there is a surge of users, which can result in a long deployment time and a sharp increase in business costs.
Since its founding in 2017, Baijiayun has been developing large-scale live courses. It is one of the cloud-based video companies most tightly focused on education. In 2018, Baijiayun had revenues of over 100 million Chinese yuan (over 14 million US dollars) and served more than 1,000 educational companies in China.
Rapid business growth drove the Baijiayun technical team to explore the optimization of its previous technical architecture. In 2019, Baijiayun gradually introduced several smaller scale courses. Different from its large-scale courses, these courses record lessons by using audio and video capturing methods and then play back the lesson.
In this process, audio and video separation is required. It is costly to use virtual machines to separate audio from video. Moreover, processing audio and video in the same virtual machines can cause mutual interference. So, Baijiayun began exploring containerization, which is a much more lightweight virtualization technology than virtual machines.
In the first half of 2019, Baijiayun began containerizing its services on a small scale and successfully set up and verified the basic process.
As Baijiayun’s container scale expanded, however, scheduling and management became new challenges for the company. By leveraging the computing capabilities of Alibaba Cloud Container Service for Kubernetes, Baijiayun was able to greatly reduce this workload. As the Baijiayun technical team noted, containers reduced the O&M and test workloads, facilitated version control for the application runtime environment, and consumed less computing overhead than virtual machines, reducing overall IT costs.
At that time, enterprises were beginning to adopt container-based cloud native architectures. This architecture helped Baijiayun make agile and flexible technical preparations for traffic peaks.
However, this was only the first step of the long journey. Baijiayun still faced challenges in coping with sudden traffic peaks.
Scaling out Dozens of Times in Capacity in Only Three Days
When the traffic surged, Baijiayun needed to scale out, and do so in a dramatic way.
The epidemic is everyone’s enemy. Baijiayun had no idea that it would have to face a battle against an epidemic at the beginning of this year. Most of its original container cluster configurations were not designed for clusters of this scale. A single cluster could only host a limited number of nodes, and the original low instance specification restricted the capacity of each node.
To scale out, Baijiayun went to Alibaba Cloud, whose team of experts recommended that Baijiayun purchase high-specification ECS Bare Metal Instances. Based on the load characteristics of its applications, Baijiayun used Alibaba Cloud Container Service for Kubernetes (ACK) to manage ECS Bare Metal Instances of an appropriate specification to reduce unnecessary costs, prevent waste, and improve elastic supply.
The reasoning behind this recommendation is that Alibaba Cloud ECS Bare Metal Instance servers have superior specifications and can help Baijiayun significantly increase the capacity of a single node.
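The trade-off between node count and per-node capacity can be illustrated with simple arithmetic. The following is a minimal sketch, not Baijiayun's actual sizing: the function names and every number in it (containers per node, target container count) are hypothetical and chosen only to show why a higher-specification node reduces the number of nodes a cluster must host.

```python
import math


def cluster_capacity(node_count: int, containers_per_node: int) -> int:
    """Total containers a cluster can run, ignoring system overhead."""
    return node_count * containers_per_node


def nodes_needed(target_containers: int, containers_per_node: int) -> int:
    """Smallest number of nodes that can host the target container count."""
    if containers_per_node <= 0:
        raise ValueError("containers_per_node must be positive")
    return math.ceil(target_containers / containers_per_node)


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    target = 10_000            # containers needed at peak
    low_spec_density = 20      # containers per low-specification node
    high_spec_density = 200    # containers per high-specification node

    print("low-spec nodes needed: ", nodes_needed(target, low_spec_density))
    print("high-spec nodes needed:", nodes_needed(target, high_spec_density))
```

When a cluster can only host a limited number of nodes, raising per-node density is the remaining lever for growing total capacity, which is the effect the paragraph above describes.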
Baijiayun Kubernetes clusters have demanding performance requirements. But ECS Bare Metal Instance servers can deliver outstanding performance, and the combination of ACK and ECS Bare Metal Instances can easily meet Baijiayun’s requirements for coping with heavy traffic and high concurrency.
First, container-based construction meets the requirements for fast and flexible service provisioning. ECS Bare Metal Instance servers eliminate virtualization loss and improve computing performance by 8%. Their quasi-physical-machine feature supports secondary virtualization.
Second, high-performance ECS Bare Metal Instance servers and elastic containers work seamlessly together. Data shows that containers that run on ECS Bare Metal Instances provide 10% to 15% higher performance than those that run on physical machines. This is because the virtualization overhead is offloaded to the MOC card, so the CPU and memory of ECS Bare Metal Instance servers incur zero virtualization overhead. Each container that runs on cloud-based ECS Bare Metal Instance servers has an exclusive Elastic Network Interface (ENI), which improves network throughput by 13%.
Third, ECS Bare Metal Instance servers separate the storage bandwidth from the computing bandwidth, meeting the requirements for massive reads and writes in Baijiayun business scenarios. With the deployment of ECS Bare Metal Instance servers, computing power was significantly improved, but a storage I/O performance bottleneck also occurred. By using high-performance Alibaba Cloud NAS and scaling out the storage to four clusters, Baijiayun solved the I/O performance bottleneck.
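The article does not say how load was divided among the four storage clusters. One common way to spread reads and writes across several storage backends is to hash a stable key and map it to a backend. The sketch below illustrates that general technique only; the cluster names, the key format, and the `pick_cluster` function are hypothetical and are not taken from Baijiayun's or Alibaba Cloud's implementation.

```python
import hashlib

# Hypothetical names standing in for four storage clusters.
STORAGE_CLUSTERS = ["nas-0", "nas-1", "nas-2", "nas-3"]


def pick_cluster(key: str, clusters=STORAGE_CLUSTERS) -> str:
    """Map a key to one cluster deterministically using a SHA-256 hash.

    The same key always maps to the same cluster, and distinct keys
    spread roughly evenly, so no single cluster absorbs all the I/O.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(clusters)
    return clusters[index]


if __name__ == "__main__":
    # Hypothetical recording identifiers, for illustration only.
    for recording_id in ("recording-1", "recording-2", "recording-3"):
        print(recording_id, "->", pick_cluster(recording_id))
```

A plain modulo mapping like this reshuffles most keys when the number of clusters changes; a system that expects to add clusters often would more likely use consistent hashing or an explicit placement table.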
With the preceding solution and its own large-scale cluster management capabilities, Alibaba Cloud helped Baijiayun effectively upgrade the original architecture solution and scale their capacity out by dozens of times in just several days. As a result, Baijiayun greatly improved their cluster performance and stability and was able to cope with burst traffic.
Optimizing Its Architecture and Cluster Planning
With sudden traffic spikes, Baijiayun urgently required quick, dynamic, and elastic scaling and efficient control and O&M.
To meet these needs, Baijiayun modified the original virtual nesting structure and used ECS Bare Metal Instances for high-density container deployment. Combined with agile container management capabilities, this change reduced Baijiayun's costs by at least 25% and its O&M workload by at least 80%. Baijiayun also properly planned its Kubernetes clusters and optimized the overall architecture, including the network, storage solution, and scaling rules, to ensure O&M stability and reduce application costs.
In addition, Baijiayun implemented Alibaba Cloud’s effective O&M and management tools, dramatically reducing their overall O&M workload.
As Baijiayun had limited time to deploy businesses on containers, they could not afford to spend much time on the O&M and monitoring of the containers. By using ARMS Prometheus, Baijiayun managed to set up monitoring for the container node environment in just 30 minutes. Compared with open-source Prometheus monitoring, ARMS Prometheus supports an unlimited data volume and can seamlessly work with Alibaba Cloud Container Service for Kubernetes. This enables Baijiayun to efficiently locate problems in containers and improve their products when needed.
The event center for small- and medium-sized applications in Log Service of Alibaba Cloud ACK displays cluster status changes and component exception events. This helps Baijiayun aggregate exception information from node logs on the control panel and trigger alerts in a timely manner.
As summarized by Li Gangjiang, Alibaba Cloud brings the following benefits to Baijiayun:
1. Elastic computing space and agile and secure scaling
Alibaba Cloud supports the preloading of application images. By using this feature, Baijiayun can quickly load and run containers during scale-out. Alibaba Cloud Container Registry (ACR) securely manages a large number of Baijiayun’s container image assets. Through fine-grained image authorization and control, Baijiayun can securely and quickly manage application images throughout the image lifecycle.
2. Stable services and outstanding performance
Baijiayun uses Alibaba Cloud’s proprietary ECS Bare Metal Instance architecture with integrated software and hardware. ECS Bare Metal Instances combine performance similar to that of physical machines with the user experience of VMs. By using ECS Bare Metal Instances, Baijiayun can better schedule Kubernetes clusters. This, along with high-performance NAS, helped Baijiayun solve the I/O performance bottleneck.
3. Timely response from the technical support team for optimizing the architecture
Baijiayun needed to scale out because its original business architecture was not designed to manage large-scale clusters. Alibaba Cloud helped Baijiayun optimize their business architecture and cluster management capabilities in a short period of time.
As the top cloud service provider in China and a leading cloud service provider globally, Alibaba Cloud delivers strong capabilities at the Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) layers. With extensive experience at the education Software-as-a-Service (SaaS) layer, Baijiayun can launch online education solutions jointly with Alibaba Cloud.
Baijiayun and Alibaba Cloud are deepening their cooperation. Baijiayun will soon launch services on Alibaba Cloud Marketplace, a business platform of Alibaba Cloud SaaS Accelerator that is known as the “Tmall of software.” In the future, users will be able to purchase Baijiayun services on Alibaba Cloud Marketplace in Mainland China.
While continuing to wage war against the worldwide outbreak, Alibaba Cloud will play its part and will do all it can to help others in their battles with the coronavirus. Learn how we can support your business continuity at https://www.alibabacloud.com/campaign/fight-coronavirus-covid-19