Aixuexi Education Group: Racing Against Time through Cloud-Native Practices

By Shanlie

Background

Data from iiMedia Research shows that the online education market has grown every year. In 2019, it was valued at more than 400 billion yuan. Driven by the COVID-19 pandemic, online education is developing even faster, and its market is expected to expand further. The market size of China’s online education is estimated to reach 453.8 billion yuan in 2020.

Image for post

Aixuexi Education Group’s Rapid Growth

The Aixuexi Education Group (formerly known as Gaosi Education) was founded in 2009 and started out by providing extracurricular training for primary and secondary school students. It was originally a kindergarten through twelfth grade (K-12) education institution. In 2014, the Aixuexi Education Group entered the business-to-business (B2B) market, where it is now widely recognized by schools and institutions in China. It has thus evolved from a K-12 institution into a K-12 education supply platform.

In April 2019, the Aixuexi Education Group received an investment of 140 million US dollars, led by Warburg Pincus, in its Series D financing round. Before 2014, as an enterprise focusing on “researching and developing education products,” Aixuexi Education Group provided personalized education products and education-related services for children aged 3–18. It has developed various online education products, such as “Siquan Chinese,” “Gaosi Mathematics,” “one-to-one Gaosi VIP,” and “Middle School Science.” It has also created many well-known sub-brands, such as “Leleketang,” “Aixuexi,” “Aishanggushi,” “Aijianzi,” and “Aitifen.” Extracurricular training providers and public schools use these products and sub-brands to improve children’s learning experience.

By 2029, the Aixuexi Education Group plans to serve 100 million students and five million teachers around the world and assist 500,000 schools. As a leading company with excellent content and technology, the Aixuexi Education Group will bring a better learning experience to students.

In the highly competitive online education market, how does the Aixuexi Education Group stand out?

Various Interaction Scenarios Can Improve User Experience, but Is It Stable?

In recent years, the rapid development of the online education industry has made it far more convenient to spread knowledge. Through various online education platforms, teachers can teach students remotely without the limitations of time and space. Based on streaming media transmission technology, online classes give teachers and students an experience similar to a face-to-face class. Online classes are welcomed by a vast number of users.

Streaming media transmission technology is the core technology behind online class applications. However, streaming alone is not enough to create an excellent online class application. Various class interaction scenarios therefore need to be integrated with the application to improve user experience and learning efficiency. These interaction scenarios include asking questions, class presentation, repeating, liking, textbook navigation, and real-time display of whiteboard content. Making these interactions run as smoothly as possible is a common goal of technical teams at many online education platforms, and it reflects the competitiveness of an online class application. The R&D Team of Aixuexi Education Group has steadily improved the user experience in these interaction scenarios through continuous iteration of technologies.

The rapid growth of online users is a huge challenge for the R&D Team of Aixuexi Education Group. With the elastic scaling capability of cloud computing, livestreams can use cloud services directly to handle high concurrency. The challenge is how to keep the interaction scenarios running stably. By scaling nodes horizontally, a mature distributed microservices architecture can easily cope with a sudden surge of HTTP requests. Unfortunately, standard HTTP communication only handles one-way requests from the client to the server, as shown in the following diagram:

Image for post

Online class interaction scenarios involve communication between different users as well as notifications that servers actively send to clients. Therefore, the following communication model needs to be established:

Image for post

Take the scenario of sending text messages in class as an example. When a student submits a text message, all other users in the same class can see this message. To ensure that the content of the message does not violate any rules, the server has to review the content first. The server filters out sensitive information before sending it to other users in the online class. It is similar to the following process:

Image for post
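
The review-then-distribute flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blocklist, the masking rule, and the function names `review_message` and `broadcast` are hypothetical and are not taken from the Aixuexi system.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical blocklist for illustration; a real system would use a
# dedicated content moderation service.
SENSITIVE_WORDS: Tuple[str, ...] = ("badword", "spam")


def review_message(text: str, blocklist: Tuple[str, ...] = SENSITIVE_WORDS) -> str:
    """Mask every blocklisted word (case-insensitive) with asterisks."""
    for word in blocklist:
        pattern = re.compile(re.escape(word), re.IGNORECASE)
        text = pattern.sub("*" * len(word), text)
    return text


def broadcast(sender: str, text: str, class_members: List[str]) -> Dict[str, str]:
    """Review the message on the server, then address it to every other member."""
    reviewed = review_message(text)
    return {member: reviewed for member in class_members if member != sender}


if __name__ == "__main__":
    # alice's message is reviewed once, then copied to bob and carol.
    print(broadcast("alice", "This is SPAM, really", ["alice", "bob", "carol"]))
```

The point of the sketch is the ordering: the client never talks to other clients directly, so every message passes through the server before anyone else sees it.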

Therefore, the essence of online class interaction is two-way communication between the server and the client. To support these interaction scenarios well, a robust, scalable, high-performance, and cost-effective two-way communication mechanism needs to be established. However, the standard HTTP protocol does not support a scenario where the server actively sends notifications to the client, so a traditional web architecture cannot meet the interaction requirements of online classes. There are many solutions to this problem in the industry. The simplest is HTTP-based polling. Developers only have to make a few modifications to the application so that the client periodically sends requests to the server to fetch any pending notifications, as shown in the following diagram:

Image for post

To some degree, this solution lets the server deliver notifications to the client. However, it cannot meet real-time requirements, because a notification has to wait for the next poll, and it generates a large number of empty polls when there is nothing to deliver. HTTP-based long polling improves on general polling. It avoids most empty polls, but the optimization is limited, and it still cannot meet the real-time requirements of interaction scenarios.
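
The trade-off in fixed-interval polling can be made concrete with a small simulation. The sketch below is an assumption-laden illustration: the function name `simulate_polling`, the integer time units, and the sample notification times are all made up for the example.

```python
def simulate_polling(notification_times, interval, duration):
    """Simulate fixed-interval polling over integer time units.

    Returns (empty_polls, worst_delay): the number of polls that returned
    nothing, and the longest time any notification waited for a poll.
    """
    pending = sorted(notification_times)
    empty_polls = 0
    worst_delay = 0
    t = interval
    while t <= duration:
        # Everything that became available since the last poll is picked up now.
        delivered = [n for n in pending if n <= t]
        if delivered:
            worst_delay = max(worst_delay, t - delivered[0])
            pending = pending[len(delivered):]
        else:
            empty_polls += 1
        t += interval
    return empty_polls, worst_delay


if __name__ == "__main__":
    # Two notifications (at t=3 and t=14) within a 30-unit window.
    print(simulate_polling([3, 14], interval=5, duration=30))  # (4, 2)
    print(simulate_polling([3, 14], interval=1, duration=30))  # (28, 0)
```

Shortening the interval reduces the delay but multiplies the empty polls, which is why neither setting works well for interaction scenarios with many clients.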

Among web-based solutions, WebSocket, whose connection starts with an HTTP upgrade handshake, provides native two-way communication between the server and the client. Once a WebSocket connection is established, the server can send notifications to the client in real time. This is also a commonly adopted solution for lightweight, web-based instant messaging (IM). The R&D Team of Aixuexi Education Group also considered using the WebSocket-based solution to implement interaction scenarios in online classes. However, after some research, they found limitations with the solution and decided against it.

In online classes, interaction scenarios have two features:

1. A single message is often sent to all clients at the same time.

The sending message scenario mentioned above is a very typical example. Messages sent by one user will be visible to all other users in the same class.

2. The number of online users is large when tens of thousands of classes start simultaneously during peak periods.

The WebSocket-based solution requires the server to maintain a connection with each client. If a notification needs to be sent to multiple clients, the server itself has to copy the message and send it over every connection.

Image for post

When the number of users rises sharply, this architecture will put great pressure on the server, making it difficult to support a large number of online users at the same time. Therefore, the R&D Team of Aixuexi Education Group decided to abandon the WebSocket-based solution and try other ways to support interaction scenarios in online classes.
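
A rough back-of-the-envelope calculation shows why server-side fan-out becomes a burden. The figures used here (10,000 classes, 30 students per class, 2 messages per class per second) are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def direct_fanout_sends(classes, students_per_class, messages_per_class_per_sec):
    """Socket writes per second the application servers perform themselves
    when each message is copied to every other student in the class."""
    recipients = students_per_class - 1
    return classes * messages_per_class_per_sec * recipients


def delegated_publishes(classes, messages_per_class_per_sec):
    """Sends per second if a separate component did the copying instead."""
    return classes * messages_per_class_per_sec


if __name__ == "__main__":
    print(direct_fanout_sends(10_000, 30, 2))  # 580000
    print(delegated_publishes(10_000, 2))      # 20000
```

With these assumed numbers, the application servers would perform 29 times as many sends when they do the fan-out themselves, and the gap grows with class size.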

Implementing an application-layer communication protocol on top of TCP or UDP is another solution that the R&D Team of Aixuexi Education Group studied in depth. This approach is flexible, because message transmission can be extended through a customized protocol. To achieve this, however, the R&D Team would have to design an application-layer communication protocol, which is an extremely complex task with various challenges, such as:

  • How do we handle a suspicious connection?
  • How do we reconnect a client when they get disconnected?
  • How do we resend a message that gets lost during transmission?
  • How do we achieve authentication and permission management?
  • How do we ensure the horizontal scaling of server nodes?

These are all factors the R&D Team would have to consider. The R&D Team of Aixuexi Education Group is capable of handling these details and designing a communication protocol suitable for its business scenarios. However, the time cost and the risk are very high. With the business developing rapidly, the R&D Team needs to race against time to find a new solution for interaction scenarios in online classes that can support massive numbers of online users.
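
To give a sense of the detail work involved, here is one common way to answer just the reconnection question from the list above: retry with exponentially growing delays up to a cap. This is a generic sketch, not something the article says the team built; the function name `backoff_delays` and the sample values are assumptions.

```python
def backoff_delays(base_seconds, cap_seconds, attempts):
    """Delay before each reconnection attempt: doubles every time, up to a cap."""
    return [min(cap_seconds, base_seconds * 2 ** attempt) for attempt in range(attempts)]


if __name__ == "__main__":
    # Start at 1 second, never wait longer than 30 seconds.
    print(backoff_delays(1, 30, 7))  # [1, 2, 4, 8, 16, 30, 30]
```

A production client would usually also add random jitter so that thousands of clients do not reconnect at the same instant. Each of the other questions in the list needs similar care, which is where the time cost comes from.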

Through the experience accumulated by continuous technology research, the R&D Team of Aixuexi Education Group has reached a conclusion. To meet the needs of interaction in online classes, technical architecture must be able to meet the following requirements:

  1. The ability to support massive numbers of online clients simultaneously
  2. Stability and high availability
  3. The ability to scale out the server horizontally when performance cannot meet requirements
  4. An intermediate module for message distribution to reduce the cost of message replication
  5. Simplicity and ease of use, so that it can be put into production quickly

Among the options considered, the one that best meets the fourth and fifth requirements is based on the Message Queuing Telemetry Transport (MQTT) protocol. MQTT is a client-server publish/subscribe messaging transport protocol that provides reliable message delivery for devices in low-bandwidth and unstable network environments. The protocol is open, simple, lightweight, and easy to use. It supports one-to-many message distribution, which decouples applications, and its quality of service (QoS) levels help ensure that messages are delivered, which makes it well suited to sending notifications to mobile clients.

Image for post
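
One-to-many distribution in MQTT works through topics: clients subscribe with topic filters, and the broker copies each published message to every matching subscriber. The sketch below illustrates the idea with an in-memory toy broker. The topic names and the `TinyBroker` class are hypothetical, and the matcher is simplified (for example, it ignores the special handling of topics that start with `$`).

```python
from collections import defaultdict


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a topic name.

    '+' matches exactly one level. '#' matches the parent level and any
    number of child levels, and is only valid as the last filter level.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    """In-memory stand-in for a broker; it only demonstrates the fan-out."""

    def __init__(self):
        self.subscriptions = defaultdict(list)  # topic filter -> client ids
        self.inboxes = defaultdict(list)        # client id -> received payloads

    def subscribe(self, client_id, topic_filter):
        self.subscriptions[topic_filter].append(client_id)

    def publish(self, topic, payload):
        """Copy one published message to every matching subscriber."""
        delivered = 0
        for topic_filter, clients in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                for client_id in clients:
                    self.inboxes[client_id].append(payload)
                    delivered += 1
        return delivered


if __name__ == "__main__":
    broker = TinyBroker()
    for student in ("bob", "carol"):
        broker.subscribe(student, "class/42/chat")
    broker.subscribe("monitor", "class/+/chat")
    print(broker.publish("class/42/chat", "hello"))  # 3
```

The publisher sends the message once, and the copying happens inside the broker. This is the "intermediate module" role described in the fourth requirement.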

However, MQTT is just a protocol. Before it can be applied to large-scale commercial scenarios, it needs the support of mature and stable products. Many of the MQTT solutions from the open-source community have not gone through rigorous testing and commercialization. When the number of clients reaches about 1,000, their performance can decline sharply, so they struggle to support tens of thousands of online clients at the same time.

Focusing on interaction scenarios in online classes, the R&D Team of Aixuexi Education Group held in-depth discussions with technical experts from Alibaba Cloud. After several rounds of testing and evaluation, the R&D Team decided to use Alibaba Cloud’s micro-Message Queue (mMQ) for MQTT to build its interaction platform for online classes. Unlike the open-source solutions, Alibaba Cloud’s mMQ for MQTT has been tested and refined within Alibaba Group. It supports tens of millions of online connections, millions of concurrent messages, and millisecond-level notification delivery. It is designed with a distributed architecture that has no Single Point of Failure (SPOF), and every component can scale out horizontally, so capacity scales elastically and transparently to users.

Image for post

In this architecture, clients on the Internet access Alibaba Cloud’s mMQ for MQTT through the standard MQTT protocol. MQTT SDKs cover almost all mainstream development languages and can cope with the unstable networks of mobile clients. The server cluster on the cloud accesses Message Queue (MQ) for RocketMQ through the RocketMQ protocol. Protocol conversion between RocketMQ and MQTT then provides two-way communication between servers and clients.

Why was a new component, MQ for RocketMQ, introduced? Why do server instances access MQ for RocketMQ through the RocketMQ protocol? There are three main reasons:

1. Compared with clients, server instances are far fewer in number, while the message throughput of a single server instance is much larger than that of a single client.

In a typical online class scenario, tens of thousands (or even hundreds of thousands) of clients are connected to the server at the same time. Each client typically sends and receives no more than ten messages per second. In contrast, in a cluster of 100 instances, each server instance may process tens of thousands of messages per second. Because of this difference, the server and the client need to use different communication protocols to maximize their performance and efficiency.
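
The per-instance figure follows from simple arithmetic. As a worked example, assume 100,000 connected clients, each at the upper bound of ten messages per second, and a cluster of 100 server instances. The client count is an assumed value within the range the article mentions, and the function name is hypothetical.

```python
def per_instance_throughput(clients, msgs_per_client_per_sec, server_instances):
    """Average messages per second that each server instance must handle."""
    total_per_sec = clients * msgs_per_client_per_sec
    return total_per_sec / server_instances


if __name__ == "__main__":
    print(per_instance_throughput(100_000, 10, 100))  # 10000.0
```

Under these assumptions, a single server instance handles about a thousand times the message rate of a single client, which is why a lightweight protocol suits the client side and a high-throughput protocol suits the server side.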

2. When the processing capability of the server is insufficient, messages need to be temporarily stored in the queue.

The introduction of RocketMQ provides message storage for MQTT.
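
The buffering effect can be illustrated with a toy simulation in which messages arrive faster than the servers can process them for a few ticks. The arrival pattern, the capacity, and the function name `simulate_backlog` are assumptions for illustration only.

```python
from collections import deque


def simulate_backlog(arrivals_per_tick, capacity_per_tick):
    """Return the queue depth at the end of each tick.

    arrivals_per_tick: number of messages arriving in each tick
    capacity_per_tick: messages the server cluster can process per tick
    """
    queue = deque()
    depths = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            queue.append(next_id)
            next_id += 1
        # Process as many queued messages as capacity allows, oldest first.
        for _ in range(min(capacity_per_tick, len(queue))):
            queue.popleft()
        depths.append(len(queue))
    return depths


if __name__ == "__main__":
    # A burst in ticks 2 and 3 exceeds the capacity of 10 messages per tick.
    print(simulate_backlog([5, 20, 20, 5, 0, 0], 10))  # [0, 10, 20, 15, 5, 0]
```

The queue grows during the burst and drains afterwards, so no message is dropped even though the servers were briefly overloaded.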

3. The instances in the server cluster are peers that share the work among themselves. RocketMQ’s cluster consumption mode provides a native load balancing mechanism for this.
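
One way such load balancing can work is to divide a topic's queues evenly among the instances of a consumer group, so that each message is handled by exactly one instance. The sketch below is a simplified illustration of that idea, not RocketMQ's actual client code; the function name `allocate_queues` and the instance names are hypothetical.

```python
def allocate_queues(queue_ids, consumer_ids):
    """Split queues into contiguous, near-equal blocks, one block per consumer."""
    queues = sorted(queue_ids)
    consumers = sorted(consumer_ids)
    base, extra = divmod(len(queues), len(consumers))
    allocation = {}
    start = 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional queue each.
        size = base + (1 if index < extra else 0)
        allocation[consumer] = queues[start:start + size]
        start += size
    return allocation


if __name__ == "__main__":
    print(allocate_queues(range(8), ["server-a", "server-b", "server-c"]))
    # {'server-a': [0, 1, 2], 'server-b': [3, 4, 5], 'server-c': [6, 7]}
```

When an instance is added or removed, rerunning the allocation redistributes the queues, which is what lets the server cluster scale horizontally without any coordination code in the application.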

Let’s come back to the message sending scenario in online classes. After a student submits a message, one instance in the server cluster receives it through the load balancing mechanism. After review, the instance delivers the message to the MQTT-based cloud service, which distributes it to the other users in the same class. Thus, server instances only need to maintain connections with the MQTT-based cloud service to serve tens of thousands of users at the same time.

Image for post

When server instances hit a performance bottleneck during peak hours, the issue can be solved by adding more server instances. The performance of the MQTT-based cloud service can be improved by upgrading its configuration as needed, which has little impact on the running application.

Based on this architecture, the R&D Team of the Aixuexi Education Group built a complete class interaction system in half a month. The R&D Team does not need to focus on complex technical issues at the application layer, such as bad network environments, reconnection, exception handling, high concurrency, and high system availability. This architecture helps the team greatly reduce development costs and improve user experience.

To support the rapid increase in users brought by the “On-Cloud Learning” program, Aixuexi Education Group has expanded the capacity of this system several times. As a result, it has withstood several traffic peaks and kept the business running stably.

Note: In response to the Chinese Ministry of Education’s call that “classes have stopped, but learning will continue,” and to move from offline teaching to online teaching in time, the Aixuexi Education Group launched the “On-Cloud Learning” program. It has opened up high-quality content and provided livestream capabilities to help local K-12 institutions migrate their business to the cloud. More than 9,000 institutions teach online through the Aixuexi Education Group’s online platform. The Aixuexi Education Group is helping more institutions transform smoothly and is promoting the “On-Cloud Learning” program together with them.

The “On-Cloud Learning” program has been highly recognized in China. At the same time, the R&D Team of Aixuexi Education Group also continues to iterate the system architecture and apply MQTT technology to more two-way communication scenarios between devices and the cloud. By doing so, the Aixuexi Education Group can solve more challenges in the future. Li Chuan, Co-Founder and CEO of Aixuexi Education Group, said, “We would like to do something more valuable, and we have better expectations for the future.”
