How Does the Recommendation System Work on Tmall?

Types of Homepages

There are two types of Tmall homepages, or homepage scenarios, that a user may come across, and they differ strikingly in structure. One is the entrance homepage for promotion venues on Tmall, which the user may see during major shopping events such as the Double 11 and Double 12 shopping festivals. The other is the daily channels homepage, which is what the user typically sees when opening the mobile app. Examples of these two types are shown in Figure 1. The figure on the left shows entrances to the main and industrial venues. The main venue entrance recommends seven products to users, with a dynamic carousel of three products in the middle, and distributes traffic to the main venue. Tens of millions of unique visitors (UVs) have been redirected to this kind of main venue page. The industrial venue entrance (shown below it) recommends four personalized venues to distribute traffic to tens of thousands of venues. The figure on the right shows daily channels, which include Flash Sales, Recommended Tmall Products, Juhuasuan (聚划算, Group Buys), Flash Discounts, and Recommended Channels. Through personalized product recommendations, the homepage distributes traffic to various channels to increase user loyalty and encourage users to shop more on Tmall.

Figure 1. Tmall Homepage Types.

The Recommendation System Framework

The personalized recommendation system for the Tmall homepage consists of (1) the recall module, (2) the sorting module, and (3) the mechanism module, each of which is discussed in detail in the following sections. In short, the recall module retrieves the top-k candidate products that a user may be interested in from all products. The sorting module focuses on estimating each product’s click-through rate (CTR). The mechanism module controls traffic, optimizes user experience, adjusts the strategies of the system algorithm, and sorts products. The recommendation system also relies on a number of newer technologies, which the following sections describe in detail.

Figure 2. The Framework of the Recommendation System for Tmall Homepages.

The Recall Module

Ranki2i

Item-CF is the most widely used recall algorithm. It calculates the similarity (simScore) between products based on how often two products are clicked together, which produces an i2i table. It then queries the i2i table with the user’s trigger items to expand the set of products the user may be interested in. Although the Item-CF algorithm is simple, it must be tuned to the actual business scenario to work well. The following optimizations can significantly improve it: clearing noise data (such as crawler traffic and fake orders), choosing a reasonable time window for calculating similarity between products, introducing time decay, considering only product pairs of the same category, normalization, truncation, and discretization.
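As a rough illustration of the co-click idea, the following sketch builds a small i2i table and queries it with trigger items. The log format, the half-life decay, the cosine-style normalization, and the names build_i2i and recall are all assumptions made for this example; the article does not give the exact simScore formula used on Tmall.

```python
import math
from collections import defaultdict
from itertools import combinations

def build_i2i(sessions, category, half_life=7.0, top_n=2):
    """Build an item-to-item table from co-click data.

    sessions: list of (age_in_days, [clicked item ids]) pairs (hypothetical format).
    category: dict mapping item id -> category id.
    """
    co = defaultdict(float)    # decayed co-click weight per ordered item pair
    freq = defaultdict(float)  # decayed click weight per item
    for age_days, items in sessions:
        w = 0.5 ** (age_days / half_life)  # time decay: older sessions count less
        unique = sorted(set(items))
        for i in unique:
            freq[i] += w
        for a, b in combinations(unique, 2):
            if category[a] != category[b]:  # keep only same-category pairs
                continue
            co[(a, b)] += w
            co[(b, a)] += w
    neighbors = defaultdict(list)
    for (a, b), c in co.items():
        sim = c / math.sqrt(freq[a] * freq[b])  # normalization by item popularity
        neighbors[a].append((b, sim))
    # Truncation: keep only the top_n most similar items per source item.
    return {a: sorted(n, key=lambda x: (-x[1], x[0]))[:top_n]
            for a, n in neighbors.items()}

def recall(i2i, triggers, k=3):
    """Expand trigger items into candidates by summing similarity scores."""
    trigger_set = set(triggers)
    scores = defaultdict(float)
    for t in triggers:
        for item, sim in i2i.get(t, []):
            if item not in trigger_set:
                scores[item] += sim
    ranked = sorted(scores.items(), key=lambda x: (-x[1], x[0]))
    return [item for item, _ in ranked[:k]]
```

Noise cleaning and discretization are left out here; in practice they would run on the logs before this step and on the scores after it.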

Graph Embedding

Graph embedding is a machine learning technology that projects complex networks into a low-dimensional space. Typically, this technology represents network nodes as vectors and ensures that the vector similarity between nodes stays close to the multi-dimensional similarity between the original nodes in terms of network structure, neighbor relationships, and metadata.

Figure 3. Graph Embedding.

MIND

Multi-Interest Network with Dynamic Routing (MIND) is a vector recall method proposed by our team. It constructs multiple user interest vectors in the same vector space as the product vectors to represent a user’s interests. It then retrieves the top K neighboring product vectors for these interest vectors to obtain the top K products that the user is interested in.
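The retrieval step described above can be sketched as follows. This example assumes the interest vectors are already given (the dynamic routing network that produces them is not shown), uses a brute-force inner product search instead of a nearest-neighbor index, and merges results by each item's best score across interests. All of these choices, and the function names, are assumptions for illustration.

```python
import heapq

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def multi_interest_recall(interest_vectors, item_vectors, k):
    """Return k item ids, scoring each item by its best match to any interest.

    interest_vectors: list of vectors, one per user interest.
    item_vectors: dict mapping item id -> vector in the same space.
    """
    best = {}
    for interest in interest_vectors:
        # Top-k neighbors for this single interest (brute force for clarity).
        top = heapq.nlargest(k, item_vectors.items(),
                             key=lambda kv: dot(interest, kv[1]))
        for item_id, vec in top:
            score = dot(interest, vec)
            best[item_id] = max(best.get(item_id, float("-inf")), score)
    # Merge the per-interest results into one top-k list.
    return heapq.nlargest(k, best, key=best.get)
```

Because each interest vector retrieves its own neighbors, a user who likes both shoes and phones can get candidates from both clusters, which a single averaged user vector might miss.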

Figure 4. The MIND Model.

Retargeting

Retargeting is a strategy that recommends products that users have clicked, added to favorites, or purchased before. In e-commerce recommendation systems, user behaviors include browsing, clicking, adding to favorites, adding to cart, and placing orders. Even though we hope that users’ behaviors eventually convert to transactions, the reality is a different story. A user who performs these upstream behaviors may not complete the transaction for various reasons, but this does not mean that the user is not interested in the product. When the user visits Tmall again, we identify the user’s real intention based on the user’s prior behaviors, recommend products that match this intention again, and finally prompt the user to place an order.

Crowd-Based Filtering

The preceding recall strategies recall products that users are interested in based on their historical behavior. However, the number of recalled products is limited for offline and cold-start users. Crowd-based filtering is a fallback recall strategy that groups users by coarse-grained attributes, such as gender, age, and recipient city. For each group, it then selects the top K products with the highest CTR, based on the behavioral data of that group, as the products that the group is interested in.
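A minimal sketch of this fallback follows. The record layout, the ten-year age buckets, the minimum impression threshold, and the function names are assumptions for the example and are not taken from the Tmall system.

```python
from collections import defaultdict

def group_key(user):
    """Map a user profile to a coarse group (age bucketed by decade, an assumption)."""
    return (user["gender"], user["age"] // 10 * 10, user["city"])

def top_k_by_group(logs, k=2, min_impressions=2):
    """Pick the k highest-CTR items for each user group.

    logs: iterable of (group_key, item_id, clicked) exposure records.
    """
    shows = defaultdict(lambda: defaultdict(int))
    clicks = defaultdict(lambda: defaultdict(int))
    for group, item, clicked in logs:
        shows[group][item] += 1
        clicks[group][item] += int(clicked)
    table = {}
    for group, items in shows.items():
        # Skip items with too few impressions to give a meaningful CTR.
        ctr = {i: clicks[group][i] / n for i, n in items.items()
               if n >= min_impressions}
        table[group] = sorted(ctr, key=lambda i: (-ctr[i], i))[:k]
    return table
```

At serving time, a cold-start user would be mapped to a group with group_key and served the precomputed list for that group.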

Convergence and Modulation

To combine the advantages of different recall strategies and improve the diversity and coverage of candidate sets, we converged products that are recalled by the preceding recall strategies. During convergence, we properly modulated the recall ratios of different recall algorithms based on their historical recall results and throttling requirements.
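One simple way to picture this convergence step is a quota-based merge, sketched below. The strategy names, the ratio values, and the backfill rule are assumptions for illustration; the article does not describe how the ratios are actually computed.

```python
def merge_recalls(recalls, ratios, total):
    """Merge ranked recall lists into one candidate list of up to `total` items.

    recalls: dict mapping strategy name -> ranked list of item ids.
    ratios: dict mapping strategy name -> target share of the merged set.
    """
    merged, seen = [], set()
    # First pass: give each strategy its quota, skipping duplicates.
    for name, items in recalls.items():
        quota = round(total * ratios.get(name, 0.0))
        taken = 0
        for item in items:
            if taken >= quota:
                break
            if item not in seen:
                seen.add(item)
                merged.append(item)
                taken += 1
    # Second pass: backfill with leftovers if some strategy fell short.
    for items in recalls.values():
        for item in items:
            if len(merged) >= total:
                return merged
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged
```

Modulating the ratios then amounts to changing the values in the ratios dictionary, for example by lowering the share of a strategy whose recalled items have historically performed poorly.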

Sorting Module

Sorting Features

Sorting features greatly affect the sorting effect. The sorting module of the recommendation system for the Tmall homepage has the following types of features:

  • User profile features: indicate users’ basic features, such as the gender, age, city, and purchasing power.
  • Item features: indicate product features, such as the product ID, category ID, store ID, and tag.
  • Context features: indicate contextual features, such as the matching type, location, and page number.
  • Cross features: indicate crossing features, such as the intersection of the user and product features.
  • Sequence item features: indicate users’ behavioral features for products, such as the list of clicked products, list of clicked categories, and positional biases.

Sorting Samples

Sorting samples also affect the sorting effect. Sorting samples come from exposure and click logs generated in scenarios. The following methods are used to improve the sorting effect:

  • Clean scenario logs and remove noise data from logs.
  • Accurately calculate scenario-specific active and blacklisted users in quasi real time and retain users who are interested in the scenarios.
  • Filter out data related to cheating behaviors, such as crawling and scalping.
  • Filter out abnormal behavior logs in special time segments, for example, ordering at 00:00:00 and red packet rain periods.
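The cleaning rules above can be pictured as a simple filter over log rows. The field names (user_id, item_id, clicked, ts, is_bot) and the representation of abnormal periods as time windows are hypothetical and chosen only for this sketch.

```python
from datetime import datetime

def clean_samples(logs, blacklisted_users, abnormal_windows):
    """Drop log rows that should not become training samples.

    logs: list of dicts with keys user_id, item_id, clicked, ts, is_bot.
    blacklisted_users: set of user ids to exclude.
    abnormal_windows: list of (start, end) datetime pairs to exclude.
    """
    kept = []
    for row in logs:
        if row["user_id"] in blacklisted_users:
            continue  # blacklisted user for this scenario
        if row["is_bot"]:
            continue  # crawler or other cheating traffic
        if any(start <= row["ts"] < end for start, end in abnormal_windows):
            continue  # special time segment, e.g. the first minutes of a sale
        kept.append(row)
    return kept
```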

Sorting Models

Classic Deep Sorting Model

The Wide and Deep Learning (WDL) model proposed by Google builds the basic framework of the deep sorting model.

Behavior Sequence Transformer (BST) Model

Models such as DeepFM, PNN, DCN, and Deep ResNet focus on making better use of ID and bias features to approach the upper limit that such models can reach, but they seldom explore how to use sequence features effectively. DIN and similar models go further into sequence feature modeling. They use the target item to attend over the sequence features and then perform weighted sum pooling. Although this captures the correlation between the item being scored and the user behavior sequence, the correlations among the behaviors within the sequence itself are not modeled. The BST model addresses this gap by applying a Transformer to the user behavior sequence.
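To make "attend, then weighted sum pooling" concrete, here is a deliberately simplified sketch. It scores each behavior by its dot product with the target item and normalizes with a softmax, which is an assumption made for brevity; DIN itself learns the attention weights with a small network, and BST adds self-attention within the sequence, neither of which is shown here.

```python
import math

def attention_pool(target, sequence):
    """Pool a behavior sequence into one vector, weighted by relevance to the target.

    target: embedding of the item being scored.
    sequence: list of embeddings of items the user interacted with.
    Returns (weights, pooled_vector).
    """
    scores = [sum(t * s for t, s in zip(target, item)) for item in sequence]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    pooled = [sum(w * item[d] for w, item in zip(weights, sequence))
              for d in range(len(target))]
    return weights, pooled
```

Note that each weight depends only on the target item and one behavior at a time, so relationships between behaviors in the sequence play no role. That is the limitation the paragraph above points out.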

Figure 5. The Behavior Sequence Transformer (BST) Model.

Mechanism Module

Visualized Experience Optimization

Standard Category Expansion Based on the Knowledge Graph

Restricted by various factors, product categories on Taobao and Tmall are very fine-grained and do not match how users intuitively classify products in recommendation scenarios. In collaboration with the knowledge graph team, we have established a standard category system that aggregates similar leaf categories based on semantics and scenario features. We apply it to filter purchased categories and to expand categories when the categories are scattered.

Figure 6. Standard category system.

Similar Image Detection System Based on Image Fingerprints

Taobao has a huge volume of product materials, and a great many of these products have similar images. This similarity is not a product attribute and cannot be identified through semantic information, such as the title and category of the products. We have therefore developed a similar image detection system to detect the similarity between product images.
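The article does not say which fingerprint algorithm the system uses. Purely as a stand-in, the sketch below uses a simple average hash over a tiny grayscale grid and compares fingerprints by Hamming distance. The function names and the distance threshold are assumptions.

```python
def average_hash(pixels):
    """Fingerprint an image as one bit per pixel: 1 if brighter than the mean.

    pixels: 2D list of grayscale values, assumed already downscaled (e.g. 8x8).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))

def is_similar(img_a, img_b, max_distance=1):
    """Treat two images as similar if their fingerprints differ in few bits."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance
```

A fingerprint of this kind tolerates small brightness or compression differences, which is why two listings that reuse the same photo can be matched even when their titles and categories differ.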

Figure 7. Similar-Image Detection System.

Multidimensional Discretization

The Tmall homepage consists of venue entrances and daily channels. Venue entrances include entrances to the main and industrial venues, while daily channels include Flash Sales, Recommended Tmall Products, Juhuasuan, Flash Discounts, and Recommended Channels. Each channel has its own pool of product materials, but the pools are similar and overlap to some extent. If similarity is not restricted, these channels may provide similar recommendation results. As a result, the limited and precious homepage space is not fully utilized, the user experience is degraded, and the cultivation of users’ interest in each scenario is negatively affected. We have designed a variety of discretization solutions that jointly spread out the materials recommended by the various channels on the homepage along multiple dimensions (such as products, standard categories, brands, venues, and similar images) to ensure diverse recommendation results.
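One common way to spread results across several dimensions is a greedy pass with a cap per dimension value, sketched below. The dimension names, cap values, and the greedy rule are assumptions for illustration and are not necessarily the solutions used on the Tmall homepage.

```python
from collections import Counter

def diversify(ranked_items, caps, limit):
    """Greedily pick items in rank order while respecting per-dimension caps.

    ranked_items: list of dicts sorted by score, each holding a value per dimension.
    caps: dict mapping dimension name -> max items that may share one value.
    limit: number of items to return.
    """
    counts = {dim: Counter() for dim in caps}
    chosen = []
    for item in ranked_items:
        if len(chosen) == limit:
            break
        # Skip the item if any of its dimension values is already at its cap.
        if any(counts[dim][item[dim]] >= cap for dim, cap in caps.items()):
            continue
        chosen.append(item)
        for dim in caps:
            counts[dim][item[dim]] += 1
    return chosen
```

Running the same pass over the combined output of all channels, rather than per channel, is what makes the result jointly diverse across the page.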

Template-Based Exposure Filtering in Real Time

Because the Tmall homepage is the first screen in the Tmall app, it is exposed each time a user opens the app. However, many of these exposures are invalid. For example, users may go directly to the search channel or shopping cart, or grab red packets or coupons during promotional periods. In these invalid exposures, users are not interested in the content of the Tmall homepage. The common method of recording these fake-exposed products and using them to filter real-time exposures is far too strict when the invalid exposure rate on the homepage is high, and it greatly weakens the recommendation effect. To address this issue, we have designed a template-based, real-time exposure filtering method. This method recommends multiple templates to a user at a time, records the index i of the template the user last viewed, and displays the (i+1)th template the next time. If the user has new behaviors, the recommended content in the templates is also updated.
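The rotation logic can be sketched as a small stateful helper. The class name, the wrap-around once the last template has been shown, and the reset of the index when templates are rebuilt are all assumptions that the article does not specify.

```python
class TemplateRotator:
    """Serve the next precomputed template on each homepage visit."""

    def __init__(self, build_templates):
        # build_templates: hypothetical callable, user_id -> list of templates.
        self.build_templates = build_templates
        self.templates = {}   # user_id -> current list of templates
        self.last_index = {}  # user_id -> index of the template shown last

    def refresh(self, user_id):
        """Rebuild the user's templates and restart from the first one."""
        self.templates[user_id] = self.build_templates(user_id)
        self.last_index[user_id] = -1

    def on_new_behavior(self, user_id):
        """New clicks or purchases invalidate the old recommendations."""
        self.refresh(user_id)

    def next_template(self, user_id):
        """Return template i+1, where i is the one shown on the previous visit."""
        if user_id not in self.templates:
            self.refresh(user_id)
        templates = self.templates[user_id]
        i = (self.last_index[user_id] + 1) % len(templates)
        self.last_index[user_id] = i
        return templates[i]
```

Compared with filtering every exposed product, this only advances one template per visit, so a visit where the user ignored the homepage costs a single template rather than permanently hiding those products.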

Personalized User Purchase Category Filtering

In the past, users often complained that the recommendation system kept recommending products they had already purchased. To resolve this problem, the user’s purchased categories need to be filtered properly. Because leaf categories have different repurchase cycles, and different users repurchase the same category at different intervals, filtering must account for each user’s personalized needs per category. Purchase filtering is a common problem in all recommendation scenarios. Together with the engineering team, we have launched a unified global purchase filtering service that customizes a purchase blocking period for each category. Based on the user’s recent purchase behavior, a real-time list of filtered purchase categories is maintained for each user. If a user clicks a category multiple times during the purchase blocking period, this indicates that the user is still interested in that category and may purchase a product from it. In this case, the category is unblocked. By applying the purchase filtering service to the Tmall homepage, the problem of recommending already purchased products is greatly mitigated.
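The blocking and unblocking rules described above can be sketched as follows. The class name, the use of integer day numbers, and the click threshold are assumptions, and this simplified version keeps one blocking period per category rather than personalizing it per user.

```python
class PurchaseFilter:
    """Track which categories are temporarily blocked for each user."""

    def __init__(self, block_days, unblock_clicks=3):
        self.block_days = block_days          # category -> blocking period in days
        self.unblock_clicks = unblock_clicks  # clicks needed to lift a block early
        self.blocked = {}                     # (user, category) -> [expiry_day, clicks]

    def on_purchase(self, user, category, day):
        """Start (or restart) the blocking period after a purchase."""
        period = self.block_days.get(category, 0)
        if period > 0:
            self.blocked[(user, category)] = [day + period, 0]

    def on_click(self, user, category, day):
        """Count clicks during the block; enough clicks unblock the category."""
        entry = self.blocked.get((user, category))
        if entry and day < entry[0]:
            entry[1] += 1
            if entry[1] >= self.unblock_clicks:
                del self.blocked[(user, category)]

    def is_blocked(self, user, category, day):
        """True while the category should be filtered out for this user."""
        entry = self.blocked.get((user, category))
        return entry is not None and day < entry[0]
```

A durable category such as refrigerators would get a long period, while a frequently repurchased category such as snacks would get a short one or none at all.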

Conclusion

In this article, we have looked at the algorithms behind the Tmall homepage recommendation system. Specifically, we have covered how Tmall uses graph embedding, Transformer, deep learning, knowledge graph, and user experience modeling technologies to build an advanced recommendation system across recall, sorting, and the recommendation mechanism. A complete recommendation system is complex. Building a Tmall homepage tailored to each user requires cooperation with product, engineering, and operations personnel. At Alibaba, we will continue to accumulate experience, improve our solutions, and deepen our technologies to deliver better personalized recommendations in the future.

Alibaba Cloud
