How Alibaba Cloud Helped Tmall Expand the Double 11 Shopping Festival over the Past 10 Years
In 2009, two seemingly unrelated events happened: Alibaba Cloud was formed and Tmall launched a sales campaign called Double 11 Shopping Festival.
As early spring drew to a close, Alibaba Cloud engineers wrote the first line of code of the Apsara system in an unheated office building in Beijing. On November 11 of the same year, Tmall launched the Double 11 Shopping Festival. Nobody expected that, years later, the two would become what they are today.
Prequel to Double 11 Shopping Festival
In 2007, Taobao’s gross merchandise value (GMV) reached RMB 40 billion. Although this was a significant achievement, it did not ease the technical team’s concerns about their centralized architecture. The team knew that they needed to do something to keep up with the ever-growing business. They rebuilt the system as a distributed architecture, on which all Taobao businesses were later modularized.
In hindsight, this technology upgrade had many limitations. It aimed to use throttling to cope with the growing volume of images cached in the CDN. Even so, it was this reconstruction that made the first Double 11 Shopping Festival possible in 2009. On that day, the data volume reached a new peak, and the new distributed architecture withstood the pressure, something the old architecture could hardly have managed.
One year later, the first edition of the Apsara system was released. A cluster of only a few dozen machines began to serve the first internal customer, Ant Financial.
These explorations pointed to a trend: if you replace expensive traditional UNIX server hardware and software with distributed clusters of generic x86 servers, and use new technologies such as virtualization, you can consume computing as a Pay-As-You-Go service at any time.
Traffic Rush: Surge Computing
From the birth of computers to the 1990s, computing resources were considered “scheduled” resources. They could be scheduled in advance regardless of whether they were used for exploring the moon or studying the mystery of genes. In the Internet era, however, a single explosive event can strain all available computing resources, which introduces uncertainty.
There is no doubt that the Double 11 Shopping Festival epitomizes this concept.
The technical team will never forget the nightmarish traffic peak during the Double 11 Shopping Festival in 2011. On that day, many merchants’ products were oversold on Tmall because of technical issues in the system. Afterwards, the team realized that the Double 11 Shopping Festival was not just a business campaign. It had become a technical challenge.
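The article does not say what caused the overselling or how it was fixed. As a general illustration only, one common way overselling arises under heavy concurrency is a check-then-decrement race on stock counts. The sketch below, with a hypothetical `Inventory` class and `simulate` helper that are not taken from any Alibaba system, shows the usual remedy of making the check and the decrement a single atomic step.

```python
import threading


class Inventory:
    """Hypothetical in-memory stock counter (illustrative, not Tmall's design)."""

    def __init__(self, stock):
        self._stock = stock
        self._lock = threading.Lock()

    def reserve(self, quantity):
        """Atomically check and decrement stock; return True on success."""
        with self._lock:
            if quantity <= 0 or quantity > self._stock:
                return False
            self._stock -= quantity
            return True

    @property
    def remaining(self):
        with self._lock:
            return self._stock


def simulate(stock, buyers):
    """Let `buyers` threads each try to buy one unit; return (sold, remaining)."""
    inventory = Inventory(stock)
    results = []
    results_lock = threading.Lock()

    def buy():
        ok = inventory.reserve(1)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), inventory.remaining
```

Calling `simulate(10, 50)` lets 50 concurrent buyers compete for 10 units; because the check and the decrement happen under one lock, no more than 10 reservations can succeed. A production system would need the same guarantee from its database or cache layer rather than from an in-process lock.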
In the following year, Tmall, Alibaba Cloud, and HiChina jointly announced the launch of the cloud.tmall.com platform. This platform was the first to take cloud computing as its base. It provided IT infrastructure for merchants and service providers on Tmall and Taobao.
Based on the ECS servers, ApsaraDB for RDS, and SLB network of Alibaba Cloud, merchants’ business orders were continuously pushed to the platform, which ensured the stability and continuity of data. This was the first time cloud computing was involved in the Double 11 Shopping Festival. Cloud computing helped Tmall’s GMV hit RMB 19.1 billion on that day.
The rapid growth of the Apsara system gave everyone great confidence. The Apsara cluster capacity had increased from 1,500 machines to 3,000 machines in five years. In August 2013, the number of machines that one cluster could contain exceeded 5,000. In addition, cross-IDC computation using multiple clusters was supported.
In the next few years, cloud computing gradually became the cornerstone of the shopping festival. By 2014, Alibaba Cloud services had been used for 96% of the deals on the cloud.tmall.com platform.
In 2015, Alibaba built the world’s biggest hybrid cloud, which seamlessly combined the public cloud and Apsara Stack, for the Double 11 Shopping Festival.
These results were eventually put to use for society as a whole. Surge computing has been widely used in many fields and scenarios to handle traffic spikes, such as booking tickets on 12306.cn during the Spring Festival, following the World Cup, and posting on Weibo.
Data Platform: Computing Creates Value
Around 2013, Pony Ma’s “Ticket Theory” was very popular in Internet circles. People in these circles believed that only companies that held a “ticket” had a future, but they did not have a clear idea of how to get one.
Meanwhile, new requirements for processing the massive amounts of data generated on mobile terminals drove the development of Alibaba Cloud’s big data platform.
Few people know that Alibaba already had its first big data warehouse in 2004. At that time, the company predicted future market trends based on the data in the warehouse. By 2008, the business volume and data volume of Taobao had reached thousands of times their 2004 levels. It became very important to rebuild the big data technology from the bottom layer up so that data could be used as production material.
In the spring of 2010, Alibaba launched the first edition of SQL Engine (later called MaxCompute, a big data engine), and ran it in an Apsara cluster consisting of 30 machines.
In 2011, the Apsara team began to explore how to support the company’s internal data warehouse service. They executed the production jobs of DataWorks and MaxCompute in parallel in a cluster consisting of 1,500 machines and found that the performance and stability were on par with those of Hadoop.
After the Apsara 5K project was completed, MaxCompute could run in a cluster of 5,000 machines, be scheduled across IDCs, and sort 100 TB of data in 377 seconds.
Powered by MaxCompute, Alibaba has given shoppers a personalized service experience during the Double 11 Shopping Festival since 2014. MaxCompute has become a major computing platform of Alibaba.
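To put the sort figure in perspective, the arithmetic below converts it into an aggregate throughput. This is a back-of-the-envelope sketch that assumes decimal units (1 TB = 10^12 bytes); the function name `aggregate_throughput` is introduced here for illustration, and the article does not state how many machines took part in the sort, so no per-machine rate is derived.

```python
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 GB = 10**9 bytes.
TB = 10**12
GB = 10**9


def aggregate_throughput(total_bytes, seconds):
    """Return the average throughput in bytes per second."""
    return total_bytes / seconds


# Figures quoted in the text: 100 TB sorted in 377 seconds.
throughput = aggregate_throughput(100 * TB, 377)
throughput_gb_per_s = throughput / GB

print(f"{throughput_gb_per_s:.1f} GB/s")  # roughly 265 GB/s across the cluster
```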
- 99% of Alibaba’s data storage services and 95% of its computing services rely on MaxCompute.
- More than five million jobs are executed on MaxCompute every day.
- On November 11, 2017, MaxCompute processed more than 320 PB of data.
Over the past few years, the “Ticket Theory” has faded, while the value of the big data platform has come to the fore. Through the combination of big data and AI algorithms, MaxCompute has become the most important component of ET Brain. ET City Brain can automatically coordinate traffic lights, and it has been successfully applied to traffic management in Malaysia. ET Industrial Brain has been trying to help manufacturers improve the yield rate by 1%.
In solutions from Digital Alibaba to Digital City, MaxCompute provides EB-class data storage capabilities. It is the first big data computing platform to pass the BigBench test with 100 TB of data. It has been adopted in a dozen countries and regions on the public cloud and has been deployed for more than 100 projects on Apsara Stack.
Largest Human-Machine Collaboration in History
The Double 11 Shopping Festival in 2017 is considered a “Super Project” involving the largest human-machine collaboration in history. Machine intelligence was used in every area, including technical operations and maintenance, item recommendation, customer service, payment, and logistics.
Data, computing, and algorithms are the three core factors in AI. On the basis of general computing services, the Apsara team began to explore AI-oriented heterogeneous computing.
On September 12, 2017, Alibaba Cloud announced the launch of a new heterogeneous computing acceleration platform. It was the first platform in the industry that covered all six mainstream heterogeneous instances (including AMD GPU, NVIDIA GPU, Intel FPGA, and Xilinx FPGA). In addition, it provided computing power of up to 75 TFLOPS.
The new infrastructure made the world’s largest human-machine collaboration possible. During the Double 11 Shopping Festival, large numbers of AI and video transcoding services were deployed in the ECS GPU cluster. AI services, such as video intelligence processing of ApsaraVideo, Alime, Pailitao (snap and buy), and intelligent supply chain management of New Retail, were all accelerated by Haotian (a heterogeneous computing acceleration platform of Alibaba Cloud).
- Alibaba data center robot Halo-Explorer performed inspections in IDCs every day. It took over 30% of the routine tasks from O&M personnel.
- AI scheduler Daling increased the resource allocation rate of IDCs to a value higher than 90%.
- AI assistant Alime took 95% of customer service consultations on the day of the shopping festival.
- Cainiao smart warehouse robots shipped more than 1 million packages in a single day.
- AI designer Luban designed 410 million product posters during the shopping festival.
- On the day of the shopping festival, the Alibaba intelligent recommendation system generated more than 56.7 billion exclusive “shelves” for users, providing them with a personalized service experience.
At the Computing Conference in Wuhan half a year later, Alibaba Cloud and its partners exhibited AI smart ordering equipment for the first time. Without using any wake-up words, customers spoke to a machine at a speed of five words per second to order items and changed their orders frequently. The machine responded to each interaction with precision.
These infrastructures and commercial products are now serving a wide range of industries.
Double 11 Shopping Festival 2018 Powered by Apsara 2.0
Initially, the technology revolution was meant to solve the platform’s peak traffic problem. Now, new technologies are leading the business transformation.
During the Alibaba 2018 Double 11 Shopping Festival, Alibaba Cloud set a new record in “surge computing.” The accumulated number of ECS cores dynamically allocated by Alibaba Cloud exceeded 10 million, which is equal to the capacity of ten large IDCs. Apsara 2.0 also provided the following capabilities to fully support the shopping festival:
- ECS Bare Metal Instance of Alibaba Cloud played a significant role in the core system. ECS Bare Metal Instance was developed based on X-Dragon, a next-generation virtualization architecture with hardware and software combined. The instance combined the advantages of physical and virtual machines and resolved all QPS performance limitations during peak hours in the core system.
- ESSD, the first product in the industry capable of supporting 1 million IOPS and storing tens of petabytes of data, was used to handle the highest I/O concurrency with ease.
- Safeguarded by the highly reliable ApsaraVideo Live solution, the webcasting of Tmall Double 11 Festival Night covered more than 25 million viewers on Youku and set a new record in maximum bandwidth.
- CDN provided the acceleration service for more than one-third of Internet traffic in China, and ApsaraVideo Lazada provided webcasting services for users outside China.
- It was the first time IPv6 was put into large-scale commercial use in the Chinese market. IPv6 is fully supported by the cloud, networks, devices, and applications.
- Blink processed a maximum of 1.718 billion data records (equal to the data volume of 1.2 million Xinhua dictionaries) per second.
- MaxCompute processed more than 500 PB of data in a single day, smoothly supporting 120,000 deals per second at peak hours.
- Alibaba Cloud Security provided tens of millions of risk identification services for customers on the cloud and exported Anti-DDoS Premium to the world to safeguard global business.
These technologies made it easier for the entire system to deal with the traffic peak. The GMV hit RMB 213.5 billion on that day and set a new record.
The emergence of IoT opened up more room for imagination for the Double 11 Shopping Festival.
On the consumer side, the IoT technology is creating a new tracking economy while serving New Retail. The technology ensures that Tmall imported goods are traceable. Consumers can view tracking information about imported goods and ocean shipping information in real time.
On the manufacturer side, the IoT technology helps Tmall brand clothing manufacturers receive orders, place orders, and stock goods digitally, and produce goods in a personalized and flexible way. The delivery accuracy rate is nearly 100%. The full links of agricultural production, transportation, and sales are upgraded.
From online to offline, from manufacturing to logistics and distribution, from China to other countries or regions, Alibaba Cloud’s technical capabilities have been extended to all walks of life.
With only 27 brands involved, the first Double 11 Shopping Festival in 2009 brought in only several million US dollars. Within ten years, that figure had exploded to over USD 30 billion. The shopping festival has become a model for future business practice and the largest testing ground for new technologies. These new technologies will gradually become basic capabilities of the whole society, thereby promoting global social collaboration.