By Wang Jianshuang, nicknamed Jianxin at Alibaba.
In the Internet world, the only real constant is change. As engineers working in the e-commerce space, we often receive requirements like these: notify a shopper when the price drops on an item that the shopper has favorited, to encourage a transaction; prompt a new shopper, or one who has not made a transaction within the past 90 days, to chat with a seller after the shopper has browsed multiple items; or send such a shopper a red packet to encourage a transaction.
The essential logic of these requirements is as follows: collect and analyze shopper behavior in real time, perform rule-based computing, and accurately reach the shoppers, or users, that meet the configured rules. These requirements are difficult to meet with a conventional development approach, so at Alibaba we developed the Omega system. The Omega system is composed of three subsystems: the user behavior collection center, the CEP rule computing center, and the user outreach center.
We have described the behavior collection center and CEP rule center in previous articles. In this article, we will go over how the user outreach subsystem was designed, how policies can be flexibly configured, and how users can be reached accurately.
System Design: Logical Architecture
For ease of understanding, let’s briefly review the logical architecture of the Omega system. The Omega system is split based on the principle of high cohesion and low coupling. Each part is an independent and complete system and all the parts can be assembled to provide services.
- The first layer is the user behavior collection center. This layer collects MTOP (application gateway) API requests and user behavior tracking data from the user terminal (UT), and cleans them into structured user behavior data.
- The second layer is the CEP rule computing center. This layer parses rules written in a domain-specific language (DSL), generates Blink (Flink) stream processing tasks from them, and outputs the users who meet the rules.
- The third layer is the user outreach center. This layer defines outreach policies and channels and delivers the policies to users in real time.
These three layers are interconnected. Each can provide services independently, or they can work together to serve external businesses. At present, they serve businesses related to user growth, games, and security.
Let’s take user growth as an example. To improve the user experience, operations personnel combine policies to guide users toward completing transactions and to help them reach an “Aha moment” with the product. These policies may be rights disclosure, point of presence, and real-time push in the UT, or push notifications, SMS messages, and outbound calls outside of the UT. The Omega system provides a technical solution for policy orchestration that supports long-term operations. It integrates the active and passive outreach channels inside and outside the UT, and treats the real-time status of users as the most important input.
The outreach process itself is relatively straightforward. We divide the process into multiple small nodes and combine the nodes through configuration to ensure that each node can be plugged and swapped. The overall process of the user outreach system is as follows:
- Receive CEP rule computing results, including the rule name and users that meet the rule.
- The action routing layer queries the list of all actions that subscribe to a rule by the rule name.
- The action filtering layer narrows the list down to valid actions based on configured policies. The filtering policies include blacklists and whitelists, phased release, user groups, and fatigue control, which limits how often a user is contacted.
- The action delivery layer executes actions based on the configured policies. This layer can reach users through common methods such as sending push notifications and SMS messages, calling other business systems such as the security system to impose penalties, or delivering actions to the UT for execution.
- After actions are executed, record tracking data according to a common protocol to support later statistics.
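The steps above can be sketched in a few lines of Python. This is only an illustration of the flow (route, filter, deliver, track), not the Omega implementation: the names `Action`, `OutreachPipeline`, and `handle`, the channel labels, and the shape of the tracking record are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Action:
    name: str
    channel: str  # hypothetical channel label, e.g. "push" or "sms"


@dataclass
class OutreachPipeline:
    # Routing table: rule name -> actions that subscribe to the rule.
    subscriptions: Dict[str, List[Action]]
    # Each filter returns True to keep the action for this user.
    filters: List[Callable[[str, Action], bool]]
    # Delivery functions keyed by channel label.
    senders: Dict[str, Callable[[str, Action], None]]
    track_log: List[dict] = field(default_factory=list)

    def handle(self, rule_name: str, user_id: str) -> List[str]:
        """Process one CEP result (step 1) and return the delivered action names."""
        # Step 2: route by rule name.
        candidates = self.subscriptions.get(rule_name, [])
        # Step 3: keep only the actions that every filter accepts.
        valid = [a for a in candidates if all(f(user_id, a) for f in self.filters)]
        for action in valid:
            # Step 4: deliver through the configured channel.
            self.senders[action.channel](user_id, action)
            # Step 5: record a tracking entry (field names are assumptions).
            self.track_log.append(
                {"rule": rule_name, "user": user_id, "action": action.name}
            )
        return [a.name for a in valid]
```

Because each stage is passed in as data (a routing table, a list of filters, a map of senders), a stage can be replaced without touching the others, which is the pluggability the process relies on.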
Reaching users is the last step of the Omega system process. It requires the encapsulation of sufficient common outreach capabilities to ensure real-time and effective outreach. Otherwise, the user experience may be compromised. The following section describes the detailed design of the user outreach system, in which outreach policies are assembled and plugged in through configuration to achieve real-time outreach.
Note that MetaQ is an internal Message Queue (MQ) framework of Alibaba. The High-speed Service Framework (HSF) is a Remote Procedure Call (RPC) framework.
The user outreach center aims to provide independent services and to support flexible, pluggable configurations and accurate outreach policies. Therefore, its design focuses on reducing external dependencies. It uses MQ to reduce direct dependencies on, and coupling with, external systems, defines the functional boundaries of each internal sub-module, and combines the sub-modules through configuration.
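To illustrate the decoupling, here is a minimal sketch in which Python's in-process `queue.Queue` stands in for the message queue. It does not use or imitate the MetaQ API. The function names and the JSON message shape (`rule`, `users`) are assumptions made for this example.

```python
import json
import queue
from typing import Callable, List


def publish_rule_result(mq: queue.Queue, rule_name: str, user_ids: List[str]) -> None:
    """Upstream side: serialize a rule hit and hand it to the queue."""
    mq.put(json.dumps({"rule": rule_name, "users": user_ids}))


def drain_rule_results(mq: queue.Queue, handler: Callable[[str, str], None]) -> int:
    """Outreach side: consume pending messages, calling handler(rule, user) per user.

    Returns the number of (rule, user) pairs handled.
    """
    handled = 0
    while True:
        try:
            raw = mq.get_nowait()
        except queue.Empty:
            return handled
        message = json.loads(raw)
        for user_id in message["users"]:
            handler(message["rule"], user_id)
            handled += 1
```

The producer and the consumer share only the queue and the message format, so either side can be changed or redeployed without the other knowing.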
The user outreach center is mainly used to maintain outreach policies and encapsulate standard outreach capabilities. It comprises the following parts:
- Input data source: The user outreach center can receive the computing results from the upper-layer rule center, or be triggered by external business systems.
- Outreach materials: These materials include text and images that are maintained in the cloud cast system, which is Xianyu’s material management system. In the future, offline data will be used to supplement more fine-grained information, such as user profiles and item data.
- Action routing layer: This layer maintains the subscription relationship between actions and rules, including the validity and priority of the subscriptions.
- Action filtering layer: This layer is designed as a responsibility chain. The filters are mutually independent and can be dynamically plugged and flexibly configured.
- Action implementation layer: This layer encapsulates the implementation of all common outreach capabilities, which currently run either in the cloud or on the client. In the future, Function-as-a-Service (FaaS) can be used to launch new actions flexibly and quickly. To ensure that actions execute in real time on a client, we maintain a dedicated persistent (long-lived) connection with the client. We optimized this channel specifically to improve its data transmission speed and reach rate, with a focus on ensuring that users can be reached on the client.
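The responsibility-chain design of the filtering layer might look like the following sketch. The class names, the `FILTER_REGISTRY` mapping, and the configuration format are hypothetical. For brevity, the fatigue filter counts an action as soon as it is allowed, whereas a real system would more likely count confirmed deliveries.

```python
import zlib
from abc import ABC, abstractmethod
from typing import Dict, List, Set, Tuple


class ActionFilter(ABC):
    @abstractmethod
    def allows(self, user_id: str, action: str) -> bool:
        """Return True if the action may proceed for this user."""


class BlacklistFilter(ActionFilter):
    def __init__(self, blocked: Set[str]):
        self.blocked = blocked

    def allows(self, user_id: str, action: str) -> bool:
        return user_id not in self.blocked


class PhasedReleaseFilter(ActionFilter):
    def __init__(self, percent: int):
        self.percent = percent

    def allows(self, user_id: str, action: str) -> bool:
        # A stable hash keeps each user in the same bucket (0-99) across calls.
        return zlib.crc32(user_id.encode("utf-8")) % 100 < self.percent


class FatigueFilter(ActionFilter):
    def __init__(self, max_per_user: int):
        self.max_per_user = max_per_user
        self.sent: Dict[Tuple[str, str], int] = {}

    def allows(self, user_id: str, action: str) -> bool:
        key = (user_id, action)
        if self.sent.get(key, 0) >= self.max_per_user:
            return False
        self.sent[key] = self.sent.get(key, 0) + 1
        return True


# Hypothetical registry: configuration refers to filters by these names.
FILTER_REGISTRY = {
    "blacklist": BlacklistFilter,
    "phased_release": PhasedReleaseFilter,
    "fatigue": FatigueFilter,
}


def build_chain(config: List[dict]) -> List[ActionFilter]:
    """Assemble the chain, in order, from a list of {"type", "params"} entries."""
    return [FILTER_REGISTRY[item["type"]](**item.get("params", {})) for item in config]


def passes(chain: List[ActionFilter], user_id: str, action: str) -> bool:
    # all() stops at the first rejecting filter, so later filters never run.
    return all(f.allows(user_id, action) for f in chain)
```

Adding, removing, or reordering filters then only requires a change to the configuration list passed to `build_chain`, for example placing the fatigue filter last so that it only counts actions that the earlier filters accepted.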
After an action reaches a user, data is recorded based on the unified tracking protocol. In the future, we will streamline the tracking data reporting and data development process to reduce data development costs and make it easier for the business team to check the experimental results of actions and attribute those results.
Since the user outreach center was launched, it has served multiple businesses through configuration, such as Xianyu’s Golden Scale (JinLin) Game for the last Double 11 Shopping Festival, user growth, and house rental and leasing. After operations personnel flexibly configured policies and benefits and delivered them to users accurately and in real time, the following results were observed:
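The article does not spell out the fields of the unified tracking protocol. Purely as an illustration of what a uniform record could look like, the sketch below defines a hypothetical `TrackEvent` whose field names and status values are all assumptions, and serializes it as one JSON line per executed action.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackEvent:
    # All field names are hypothetical, not the actual protocol.
    rule_name: str
    action_name: str
    user_id: str
    channel: str
    status: str  # assumed values such as "delivered" or "failed"
    timestamp_ms: int


def make_track_event(
    rule_name: str,
    action_name: str,
    user_id: str,
    channel: str,
    status: str,
    timestamp_ms: Optional[int] = None,
) -> TrackEvent:
    """Build an event, stamping the current time unless one is supplied."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return TrackEvent(rule_name, action_name, user_id, channel, status, timestamp_ms)


def to_log_line(event: TrackEvent) -> str:
    """Serialize with sorted keys so every record has the same layout."""
    return json.dumps(asdict(event), sort_keys=True)
```

When every action emits the same record shape, downstream statistics and attribution can group by rule, action, or channel without per-business parsing code.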
- Outreach accuracy to target groups was significantly improved.
- The Golden Scale Game achieved a latency of less than one second.
- Operations tools were created that completely free up development resources.
As the biggest event on Alibaba’s e-commerce platforms, the Double 11 Shopping Festival demands strikingly high performance and a very high number of queries per second (QPS) to deliver the best user experience. The repeated success of Double 11 has verified the performance and real-time outreach capabilities of the Omega system, and especially of the user outreach center. When shoppers on the platform were pushed deals on items they had browsed, the final click-through rate (CTR) was significantly higher than in offline scenarios.
The Omega system is a highly abstract solution for scenarios that have high real-time requirements, are directed by operations, and require fast experiments. Adhering to this concept, the user outreach center encapsulates various common outreach capabilities and supports filters that can be flexibly plugged. In addition, standard tracking protocols are designed to support fast business experiments and data attribution analysis. In the future, we will support standard access for offline profile data and standard analysis of data loading to integrate upstream and downstream business data and implement a closed-loop process. We welcome any questions or comments you may have.