The Power of AI: Why Taobao Knows Online Shoppers Better Than They Know Themselves
By Ou Wenwu, nicknamed Santong at Alibaba.
Today, recommendation systems, also known as recommender systems, have easily become one of the most important traffic portals of modern e-commerce platforms and really any platform with a high volume of content and users. In short, a recommendation system is a kind of sophisticated filtering system that can predict the preferences of users on a digital platform.
These AI-driven systems build on the idea that only those who know customers better than the customers know themselves can take the initiative in the era of new retail. Taobao Mobile’s recommendation system is one of Taobao’s largest traffic portals and transaction channels and is easily one of the most sophisticated systems of its kind. With a quick scroll through Taobao, it’s not hard to see that this system involves some of the most complex business formats and scenario-oriented technologies ever seen in e-commerce.
In this post, we take a deep dive into how Taobao Mobile’s recommendation system was developed from the ground up. This article is based on the content that Ou Wenwu, a senior algorithm expert from Alibaba’s search and recommendation division, presented at a conference attended by some of the biggest names in big data in China.
Taobao Mobile’s Recommendation System
The rapid development of Taobao Mobile Recommendation System originated from Alibaba’s “All in Wireless” strategy, proposed back in 2014. In the wireless age, smart phones have come to more or less conquer the Internet, and with this revolution in hardware came many changes. For one, smart phone screens are much smaller than their desktop and laptop counterparts, and users cannot open multiple browser windows at the same time on a smart phone, which changes how users interact with web applications. To adapt to these changes, Taobao’s mobile team developed a system of personalized recommendations to improve each user’s browsing efficiency on mobile devices. After years of research and development, this system has become the largest traffic portal on Taobao’s mobile app. The app now serves hundreds of millions of users every day, and recommendations are a huge part of the app experience, second only to search as the most commonly used way to navigate the mobile app.
Today, in the Taobao app, recommendations are not only for items to shop for, but also for platform live broadcasts, stores, brands, user-generated content (UGC) and professionally generated content (PGC), among other things. Taobao Mobile provides a wide range of recommendations for hundreds of specific scenarios. Unlike search, where users proactively express what they are looking for, the recommendation system rarely interacts with users directly. Rather, it interacts with users only through backend algorithm models. Therefore, since its inception, recommendation has been built on big data and AI.
Compared with other recommendation products, Taobao Mobile boasts the following advantages:
- Involved in a Shopper’s Decision-Making Cycle: The main value of Taobao Mobile’s Recommendation System is its uncanny ability to identify the potential wants and needs of users and help them make purchase decisions even before they may know what they want themselves. A user’s decision-making cycle during an online shopping experience can be a relatively long and complex process, which involves the discovery of new needs, the acquisition of new and related information, as well as things like item comparison and order placement decisions. Therefore, e-commerce recommendation systems must work with this process, making recommendation decisions based on the user’s behavior.
- Integrates a Feature of Timeliness: Timeliness is also built into the underlying system, which relies heavily on Flink. Purchases on Taobao often satisfy needs that arise infrequently and last for only a short window of time. For example, people generally only buy a smart phone once every two to three years, and the decision-making cycle may span just a few hours, or several days at most. Therefore, a recommendation system must be extremely time-sensitive when it comes to providing for these needs. It needs to quickly understand and capture the real-time interests of users and explore their unknown needs as well.
- Developed around a Complicated User Structure: Users on Taobao Mobile are not only users with registered accounts. App users also include users that aren’t logged in, as well as new users, users that are rarely active on the platform, and users that are no longer on the platform. Given all of these situations, the Taobao team had to formulate differentiated recommendation policies and optimize the recommendation model accordingly.
- Covers Multiple Scenarios: Taobao Mobile Recommendation System covers hundreds of specific scenarios on the platform. Optimizing each scenario independently is far from realistic, as each scenario has different conditions that require different hyperparameters, and manual optimization is equally unrealistic. Therefore, on Taobao, transfer learning and automatic hyperparameter learning have been paramount.
- Involves Multiple Targets and Types:
The following slide shows the technical framework of Taobao Mobile’s Recommendation System. During the 2019 Double 11 Shopping Festival, Alibaba migrated all of its business operations to the cloud. Therefore, the technical architecture of Taobao Mobile Recommendation System also took root on the cloud. The basic components of recommendation include recommendation algorithms and models, raw logs and features obtained by processing log data, and offline computing and service capabilities, such as vector search, machine learning platforms, and online sorting. This year, in addition to our models and systems on the cloud we also deployed end-computing deep learning models to achieve a collaborative system of computing between the cloud and end devices.
Now, let’s talk about data, infrastructure, and algorithm models.
Taobao Mobile uses several types of recommendation data, including descriptive data such as user profiles, relational data such as bipartite graphs or sparse matrices, behavior sequences, and graph data. The user behavior sequence-based recommendation model is the model most widely used for item recommendations in Taobao Mobile. The graph model has developed rapidly over the past two years because the sequence model is usually only suitable for homogeneous data. In Taobao Mobile, there are many types of user behaviors, such as video views and keyword searches. Therefore, we can use graph embedding and other technologies to align heterogeneous graph data or merge features.
A data sample contains two elements: a label and a feature. There are several types of labels used in Taobao Mobile’s Recommendation System, including exposures, clicks, transactions, and additional purchases. Recommendations use many features, such as a user’s own features, user context features, item features, and combinations of two features. We create sample tables by joining user features and behavior logs. These tables are stored as sparse matrices. Generally, tables are created by day or by a specific time segment, and sample generation occupies a large portion of offline computing resources.
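The join described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the feature names, the log layout, and the `build_samples` helper are hypothetical, and a production pipeline would run this as a distributed offline job rather than in memory. Each sample keeps only its non-zero features, which mirrors the sparse-matrix storage mentioned in the text.

```python
# Hypothetical user features keyed by user ID (names are illustrative only).
USER_FEATURES = {
    "u1": {"age_band=25-34": 1.0, "city_tier=1": 1.0},
    "u2": {"age_band=35-44": 1.0},
}

# Hypothetical behavior log rows: (user_id, item_id, event).
BEHAVIOR_LOG = [
    ("u1", "i9", "exposure"),
    ("u1", "i9", "click"),
    ("u2", "i9", "exposure"),
    ("u1", "i7", "exposure"),
]


def build_samples(user_features, behavior_log):
    """Join user features with the log: one sample per exposed (user, item)."""
    clicked = {(u, i) for u, i, event in behavior_log if event == "click"}
    samples = []
    seen = set()
    for user, item, event in behavior_log:
        if event != "exposure" or (user, item) in seen:
            continue
        seen.add((user, item))
        base = user_features.get(user, {})
        features = dict(base)                      # user features
        features[f"item_id={item}"] = 1.0          # item feature
        for name in base:                          # user x item cross features
            features[f"{name}&item_id={item}"] = 1.0
        label = 1 if (user, item) in clicked else 0
        samples.append((label, features))          # sparse row: non-zeros only
    return samples


samples = build_samples(USER_FEATURES, BEHAVIOR_LOG)
for label, features in samples:
    print(label, sorted(features))
```

Here an exposure that is followed by a click becomes a positive sample, and an exposure without a click becomes a negative one. Other label types from the text, such as transactions, would be joined in the same way.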
There are three main offline computing modes: batch data processing, streaming data processing, and interactive query. MapReduce is a typical example of batch data processing. Featuring high latency but strong parallelism, MapReduce is suitable for offline data processing, such as hourly or daily feature computing, sample processing, and offline reporting. Stream computing features low data latency. Therefore, it is suitable for event processing, such as user clicks, preference prediction, sample processing for online learning, and reports, the key being that all of these are processed in real time. Interactive query is mainly used for data visualization and report analysis.
Model training has three main modes: batch learning, incremental learning, and online learning. Here, batch learning means that the model initialization starts from scratch. If the log volume is small and the model is simple and does not need to be frequently updated, you can regularly train and update the model with logs. However, when you are faced with a large volume of logs and many model parameters, batch learning can consume a large amount of computing resources and may even take several days to process, which is not cost effective. In contrast, incremental learning is usually performed based on historical model parameters. Given this, we can incrementally train models and deploy them online by using hourly or daily logs to reduce resource consumption and support a higher model update frequency. If the model is relatively more time-sensitive, online learning must be used to update the model in real time with samples in a matter of just a few seconds. The main difference between online learning and incremental learning is that the two rely on different data streams. Online learning, in particular, usually requires samples generated in real time by a stream-computing framework.
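To make the difference between the three modes concrete, here is a minimal sketch that uses a tiny logistic regression trained with stochastic gradient descent. The model, the feature names, and the helper functions are assumptions made for illustration and are not Taobao’s training code. The only thing that changes between modes is where the starting parameters come from and how the samples arrive.

```python
import math


def predict(weights, features):
    """Logistic regression over a sparse feature dict."""
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    z = max(-35.0, min(35.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def sgd_update(weights, features, label, lr=0.1):
    """Apply one SGD step in place for a single sample."""
    error = predict(weights, features) - label
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) - lr * error * value


def train(samples, weights=None, epochs=1):
    """Train on a list of samples, optionally continuing from old weights."""
    weights = {} if weights is None else dict(weights)
    for _ in range(epochs):
        for features, label in samples:
            sgd_update(weights, features, label)
    return weights


on_sale = {"bias": 1.0, "is_sale": 1.0}
history = [(on_sale, 1), ({"bias": 1.0}, 0)]

# Batch learning: parameters start from scratch and see the full history.
batch_model = train(history, epochs=5)

# Incremental learning: start from the previous parameters, use only new logs.
new_hour = [(on_sale, 1)]
incremental_model = train(new_hour, weights=batch_model)

# Online learning: update on each sample as soon as the stream delivers it.
online_model = dict(incremental_model)
sgd_update(online_model, on_sale, 1)
```

In practice the online case would read its samples from a stream-computing job instead of a Python list, which is the data-stream difference the paragraph above points out.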
Host resources are always insufficient for training. Therefore, we need to implement training optimization to train better models faster and in a way that requires less processing and less data. Here are some ways to accelerate the training process:
- Hot start: When a model is continuously upgraded and optimized by adding features or modifying the network structure, the changed model parameters start from initial values, and the model would otherwise need to be retrained from scratch. With a hot start (also called a warm start), the unchanged parameters are loaded from the existing model, so only a small number of samples are needed for the model to converge.
- Transfer learning: Taobao Mobile Recommendation System has many recommendation scenarios. Some only have a few logs and cannot implement large-scale model training. To resolve this issue, we can use transfer learning based on larger scenarios with many samples.
- Knowledge distillation: Taobao Mobile uses knowledge distillation for cascade model learning. This can increase the features and precision of refined ranking models. That is, by distilling between the refined and coarse ranking stages, we can increase the precision of coarse ranking models. This approach can also be used to optimize model performance.
- Low precision, quantization, and pruning: As models become more complex, their online storage and prediction costs increase sharply. These methods reduce the model storage space and the time needed for prediction. In addition, on-device models usually must meet strict size requirements.
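As a rough illustration of the last item in this list, the sketch below shows symmetric 8-bit quantization and magnitude-based pruning on a plain list of weights. Both helpers are hypothetical and simplified: real frameworks quantize per layer or per channel and usually fine-tune after pruning, and the source does not say which variants Taobao uses.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, keep_ratio):
    """Zero out all but the largest-magnitude fraction of the weights."""
    keep = int(round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


weights = [0.5, -1.27, 0.02, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
pruned = prune_by_magnitude(weights, keep_ratio=0.5)
```

Storing one byte per weight instead of a 32-bit float cuts the storage for those weights to roughly a quarter, at the cost of a rounding error of at most half the scale per weight. Pruned weights can be skipped entirely in a sparse format.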
An End-to-end Closed Loop
Taobao Mobile’s Recommendation System involves large log volumes and complex feature sources. Therefore, minor offline and online variations may cause sample errors or inconsistencies between online and offline features and models. This affects iteration efficiency and can lead to errors down the line. To resolve this issue, we built an end-to-end development framework that abstracts logs, features, and samples to reduce manual development costs and the likelihood of errors. In addition, debugging and data visualization tools are built into the framework to improve troubleshooting efficiency. At present, we have formed a closed loop for Taobao Mobile search recommendations, which covers raw log collection, feature extraction and model training verification, model release, online deployment, and real-time log collection. This improves the efficiency of the overall model iteration process.
Cloud and End Computing
With the development of 5G and IoT, data volumes will rapidly increase. This raises many questions. For instance, should we continue to centrally store and compute data on the cloud? Or can we move some storage and computation to end devices? Compared with the cloud, computing on end devices such as modern smart phones has several major operational advantages. The two most important are low latency and a high level of security. These matter all the more as countries around the world are pushing for more stringent requirements for the protection and security of personal data and digital privacy. Therefore, it’s equally important that we consider how we can output personalized recommendations without sending personal data to the cloud.
Cloud-Device Collaborative Computing
Alibaba has tried and implemented many different approaches to cloud-device collaborative computing. These approaches include implementing collaborative inference between the cloud and end devices. This particular approach is beneficial for a number of reasons. Modern smart phones store a lot of user behavior information, including the user’s scroll speed and window exposure durations. We can use an on-device user behavior pattern perception model, and implement on-device decision-making. For instance, we can use this information to predict the time the user will most likely close the app. We can also make changes to the app experience or relevant policies to increase the user’s browsing depth before the user exits the app.
Besides this sort of approach, Taobao Mobile also implements an on-device recommendation system. With cloud-only recommendation, each request returns 20 items, but a smart phone can display only four to six recommendation results at a time. Until all 20 results have been viewed, the smart phone does not initiate a new request to the cloud, regardless of the operations performed by the user. As a result, the recommendation results remain unchanged, which significantly reduces the timeliness of personalized recommendations. Our current approach is to put 100 results on the smart phone at a time and have the smart phone constantly update the results based on on-device inference. In this way, the recommendations change at a much faster pace. In contrast, if all of these tasks were performed on the cloud, we would need thousands more servers.
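The idea of re-ranking a locally cached candidate pool can be sketched as follows. Everything here is an assumption made for illustration: the `Candidate` fields, the `next_page` helper, and especially the scoring rule, which is a hand-written category boost standing in for the on-device deep learning model mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    item_id: str
    category: str
    cloud_score: float  # score assigned by the cloud when the pool was fetched


def next_page(pool, shown_ids, recent_clicks, page_size=4, boost=0.3):
    """Pick the next page from the cached pool without a new cloud request.

    recent_clicks is a list of categories the user clicked in this session.
    """
    click_counts = {}
    for category in recent_clicks:
        click_counts[category] = click_counts.get(category, 0) + 1

    def local_score(candidate):
        # Stand-in for on-device inference: boost categories clicked recently.
        return candidate.cloud_score + boost * click_counts.get(candidate.category, 0)

    remaining = [c for c in pool if c.item_id not in shown_ids]
    remaining.sort(key=local_score, reverse=True)
    return remaining[:page_size]


pool = [
    Candidate("a", "shoes", 0.9),
    Candidate("b", "phones", 0.8),
    Candidate("c", "shoes", 0.7),
    Candidate("d", "phones", 0.6),
    Candidate("e", "books", 0.5),
]
first_page = next_page(pool, shown_ids=set(), recent_clicks=[], page_size=2)
# The user clicks a phone on the first page, so the second page adapts locally.
second_page = next_page(pool, shown_ids={"a", "b"}, recent_clicks=["phones"], page_size=2)
```

With a fixed cloud ordering the second page would be items c and d. With the local signal, the phone item d moves ahead of c without any round trip to the cloud.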
In addition to inference, cloud-device collaborative training is another possibility. Cloud-device collaborative training is a very important means of safeguarding a user’s privacy. Moreover, this is the only way we can avoid sending all raw user data to the cloud. With most of the training completed on smart phones, the cloud only has to process some user vectors, which cannot be used to get the original user data. As such, we can better ensure the privacy of our end users.
Recall Technology: Multi-Interest Network with Dynamic Routing (MIND)
A few years ago, everyone used Item2Vec-based recall and tag-based recall for collaborative filtering recommendations. Item2Vec-based recall is relatively simple and produces results in real time. For a long time, it was the leading recommendation technology. Then, matrix decomposition was introduced, and with that we began to recognize that Item2Vec-based recall still had some major problems. It has a hard time recommending items with a small number of exposures and clicks, meaning that recommended items were essentially the most popular items. In addition, Item2Vec-based recall views each click as an independent event and is unable to create a global perception of a user. Recall needs to perceive a user’s behaviors and tags globally. Advancing from this starting point, we proposed a recall model based on behavior sequences. However, the problem with this method is that the interest of a real user is rarely focused on a single thing. In this model, single-vector recall can usually only recall one category or point of interest, so it is difficult to express the varied needs of a user. Alibaba addressed this problem with the Multi-Interest Network with Dynamic Routing (MIND) and published the solution in a paper presented at CIKM 2019. Currently, Taobao uses multi-vector parallel recall online.
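The retrieval side of multi-vector recall can be sketched as below. The sketch assumes the user’s interest vectors have already been produced by the model (in MIND, by the dynamic routing step, which is not reproduced here), and it uses a brute-force dot product where a production system would use approximate vector search. The function and item names are hypothetical.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def multi_interest_recall(interest_vectors, item_vectors, per_interest=2, total=4):
    """Recall top items for each interest vector, then merge by best score."""
    best = {}
    for interest in interest_vectors:
        top = heapq.nlargest(
            per_interest,
            item_vectors.items(),
            key=lambda pair: dot(interest, pair[1]),
        )
        for item_id, vector in top:
            score = dot(interest, vector)
            best[item_id] = max(score, best.get(item_id, float("-inf")))
    return sorted(best, key=best.get, reverse=True)[:total]


# Toy 2-d embeddings: first axis is "footwear", second is "electronics".
item_vectors = {
    "sneaker": [1.0, 0.0],
    "boot": [0.9, 0.1],
    "phone": [0.0, 1.0],
    "phone_case": [0.1, 0.8],
    "lamp": [0.3, 0.3],
}
interest_vectors = [[1.0, 0.0], [0.0, 1.0]]  # one vector per user interest
recalled = multi_interest_recall(interest_vectors, item_vectors)
```

Each interest vector is searched independently, so the searches can run in parallel, and the merged result covers both footwear and electronics instead of collapsing onto one category.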
The click-through rate (CTR) model used for Taobao Mobile’s Recommendation System has undergone several important changes. The first model was an FTRL- and LR-based model. This was a simple model that could support hundreds of billions of features. The second model was an explainable neural network (XNN) model, which performed embedding for discrete logistic regression (LR) features and introduced a multi-layer neural network. In addition, the model’s learning capabilities were enhanced by the introduction of new parameters. The third model is a self-attention CTR model, which is implemented based on graphs and user behavior sequences.
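For readers unfamiliar with the first-generation model, the following is a minimal sketch of logistic regression trained with the per-coordinate FTRL-Proximal update. It is written from the commonly published form of the algorithm, and the class name, hyperparameter defaults, and feature names are assumptions for illustration. The source does not state which exact variant Taobao used.

```python
import math


class FtrlLogisticRegression:
    """Sparse logistic regression with per-coordinate FTRL-Proximal updates."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # accumulated adjusted gradients per feature
        self.n = {}  # accumulated squared gradients per feature

    def weight(self, name):
        """Weights are derived lazily from z and n; L1 keeps small ones at 0."""
        z = self.z.get(name, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(name, 0.0)
        sign = 1.0 if z > 0 else -1.0
        return -(z - sign * self.l1) / ((self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def predict(self, features):
        score = sum(self.weight(name) * value for name, value in features.items())
        score = max(-35.0, min(35.0, score))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        p = self.predict(features)
        for name, value in features.items():
            g = (p - label) * value
            n = self.n.get(name, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[name] = self.z.get(name, 0.0) + g - sigma * self.weight(name)
            self.n[name] = n + g * g


model = FtrlLogisticRegression(l1=0.0)
clicked = {"bias": 1.0, "category=shoes": 1.0}
for _ in range(10):
    model.update(clicked, 1)
```

Only features that have actually been seen occupy memory, and with a non-zero `l1` many of them keep a weight of exactly zero, which is one reason this family of models copes with very large sparse feature spaces.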
Recommendation Sequence Optimization: Generative Recommendations
Recommendations are generally based on scores. After scoring, a greedy sorting and discretization algorithm is performed. However, because this method does not consider the dependencies between results, the results returned by the greedy algorithm are not optimal. Essentially, recommendation should be a set optimization process, rather than a sequence optimization process. Therefore, Taobao Mobile Recommendation System uses a generative sorting model. For more information, consider checking out our paper presented at KDD 2019.
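A toy comparison shows why ignoring dependencies between results hurts. The sketch below contrasts ranking by independent scores with building the list one slot at a time, where each candidate is rescored against the items already chosen. The redundancy penalty is a hand-written assumption standing in for a learned generative model, and it is not the model from the KDD 2019 paper.

```python
def greedy_top_k(items, k):
    """Rank by independent scores only."""
    return sorted(items, key=lambda item: item["score"], reverse=True)[:k]


def generate_list(items, k, redundancy_penalty=0.5):
    """Fill the list slot by slot, conditioning each pick on earlier picks."""
    chosen = []
    remaining = list(items)

    def conditional_score(item):
        # Each already-chosen item of the same category halves the value.
        same = sum(1 for c in chosen if c["category"] == item["category"])
        return item["score"] * (1.0 - redundancy_penalty) ** same

    while remaining and len(chosen) < k:
        best = max(remaining, key=conditional_score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


items = [
    {"id": "A", "category": "dress", "score": 0.90},
    {"id": "B", "category": "dress", "score": 0.85},
    {"id": "C", "category": "dress", "score": 0.80},
    {"id": "D", "category": "shoes", "score": 0.60},
]
independent = [item["id"] for item in greedy_top_k(items, 3)]
list_aware = [item["id"] for item in generate_list(items, 3)]
```

The independent ranking fills the page with three dresses, while the list-aware version places the shoes second because a second dress adds less value once one is already shown. A learned generative model would estimate these conditional values from data instead of using a fixed penalty.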
Multi-target Equilibrium and Optimization
During recommendation, we often need to strike a balance among several competing goals. For example, we need to balance the browsing depth, clicks, and transaction metrics connected with item recommendations. The target dimensions are inconsistent, and therefore a globally unique solution cannot be easily realized. To optimize multiple targets at the same time or make reasonable trade-offs between multiple targets, we proposed a multi-target optimized sorting model based on Pareto efficiency. For more information about this, see our paper presented at RecSys 2019.
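Pareto efficiency itself is easy to illustrate. The sketch below filters a set of candidate ranking policies down to those that are not dominated on any metric. The policy names and metric values are made up for illustration, and the sketch does not reproduce the optimization algorithm from the RecSys 2019 paper.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better once."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return {
        name: metrics
        for name, metrics in candidates.items()
        if not any(
            dominates(other, metrics)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


# Hypothetical policies scored on (click-through rate, relative transaction value).
policies = {
    "click_heavy": (0.080, 1.00),
    "balanced": (0.075, 1.20),
    "transaction_heavy": (0.060, 1.35),
    "stale": (0.058, 1.10),
}
front = pareto_front(policies)
```

The "stale" policy is dropped because "balanced" beats it on both metrics. The three remaining policies are all Pareto efficient, so choosing among them is a business trade-off rather than a question with a single best answer.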