By Jing Shaoqiang, nicknamed Shaoqiang at Alibaba.
Smartphones replaced feature phones, or “dumb phones,” almost a decade ago, but we are still feeling the effects today. This is in large part because the transition ushered in the mobile Internet era and the rise of the “mobile app.” In the US, some major apps are Facebook, Instagram, Snapchat and Twitter. In China, some of the most notable apps are the social feed and posting app Weibo, the social messaging “super” app WeChat, the algorithm-powered news app TopBuzz, and the video sharing app Kuaishou. These products have grown rapidly over the last half decade as smartphones exploded in popularity and saturated the market.
One thing that all of the apps mentioned above have in common is that they depend on feed streams, which flow from the top to the bottom of your smartphone screen. These feed streams are usually, but not always, ordered by time, and they are well suited to browsing content on mobile devices. These products have largely replaced their previous-generation counterparts and quickly taken over much of the market.
Feed streaming, fundamentally speaking, consists of feeds and streams, and it works by continuously delivering feeds to your smartphone app. A feed is an information unit, such as a post on WeChat Moments or Weibo, a review post, or a short video. A feed stream is a constantly updated flow of such information units: publishers continuously push new feeds, and users browse them on mobile devices.
Currently, the most popular feed streams in China are the information flows of Weibo, WeChat Moments, and TopBuzz (Toutiao), and the video flows of Kuaishou and TikTok (known in China as Douyin). These products also come with various private message or notification systems. The systems behind these feed streams are called feed stream systems. The remaining sections of this article describe how you can design the architecture of a feed stream system.
The Features of a Feed Stream System
A feed stream is essentially a data stream that delivers information units from N publishers to M recipients based on the following relationships detailed in the infographic below.
A feed stream system is a data stream system with data at its core. At the data layer, data is divided into three types:
- Publisher Data: A publisher produces, organizes and displays data, such as a personal page on Weibo and a photo album on WeChat Moments.
- Following Relationship: This indicates the relationship between individuals in the system. A user may follow another user on Weibo, which is a one-way stream. A user may add another user as a friend on WeChat Moments, which is typically a two-way stream. The information unit published by a publisher is always a one-way stream.
- Receiver Data: Data from different publishers is organized in a certain order (typically by time), such as the homepage of Weibo or WeChat Moments. This data is ranked by popularity in descending order, with the most popular data on top, and its value is proportional to its popularity.
The three data types are stored as follows:
- Repository: Permanently stores the data of publishers.
- Following Relationship Table: Permanently stores user relationships.
- Synchronization Database: Stores the time-based popularity data of receivers. Only the data from the most recent period is retained.
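To make the three stores concrete, here is a minimal in-memory sketch. The class name `FeedStores`, its field names, and the `max_recent` retention limit are assumptions made for illustration; the article only says that the synchronization database keeps data from the most recent period.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FeedStores:
    """Toy model of the three stores; all names here are hypothetical."""

    # Repository: publisher -> list of (message_id, content), kept permanently.
    repository: dict = field(default_factory=lambda: defaultdict(list))
    # Following relationship table: publisher -> set of follower IDs.
    followers: dict = field(default_factory=lambda: defaultdict(set))
    # Synchronization database: receiver -> recent (sender_id, message_id) pairs.
    sync_db: dict = field(default_factory=lambda: defaultdict(list))
    # Assumed retention limit (must be positive); not a value from the article.
    max_recent: int = 3

    def follow(self, follower: str, publisher: str) -> None:
        self.followers[publisher].add(follower)

    def publish(self, publisher: str, message_id: int, content: str) -> None:
        # The full message is stored once, in the repository.
        self.repository[publisher].append((message_id, content))
        for receiver in self.followers[publisher]:
            inbox = self.sync_db[receiver]
            inbox.append((publisher, message_id))
            # Keep only the newest entries in the synchronization database.
            del inbox[:-self.max_recent]
```

Note that the repository keeps every message, while each receiver's entry in the synchronization database is trimmed to the newest few references.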
Consider the following product-related factors while designing a feed stream system.
- The Number of Users: The difficulty and focus of design vary depending on the number of users, which ranges from 100 thousand to tens of millions and even billions.
- Following Relationship (One-way or Two-way): VIP accounts only exist in one-way relationships.
Also, consider the following factors while selecting a data storage system.
- How do I implement meta and feed content search?
- A feed stream system may not require the search feature, but a feed stream product must provide a search feature to facilitate information discovery and improve the user retention rate.
- Are feed streams sorted by time or other scores, such as personal preference?
- In a two-way relationship that closely binds users, feed streams are almost always sorted by time. Even an empty or valueless message that is posted by a followed account is delivered.
- A one-way relationship may include VIP accounts, whose number of followers may cover, at most, all system users. Some products automatically add product owners to users’ follow lists. In this case, the product owner is the largest VIP account whose number of followers covers all users of the product.
Feed Stream System Design
Now let’s dive right into how one would go about designing a feed stream system from the top down.
The first step is identifying the type of target product, which may be one of the following:
- Microblogging like Weibo or Twitter
- Friend’s sharing like WeChat Moments, Facebook and Instagram
- Short video sharing like TikTok, IGTV, and Snapchat
- Private messages like WeChat, WhatsApp, KaKaoTalk, and Line
Refer to the following table to compare the product types.
| Product type | Following relationship | Availability of VIP accounts | Timeliness | Sort by |
| --- | --- | --- | --- | --- |
| Microblogging | One-way | Yes | Seconds to minutes | Time |
| Short video sharing, such as TikTok | One-way or N/A | Yes | Seconds to minutes | Recommendation |
| Friend’s sharing, such as WeChat Moments | Two-way | N/A | Seconds | Time |
| Private message | Two-way | N/A | Seconds | Time |
This comparison involves the core features of the product types. For example, a two-way relationship is formed between two users who follow each other, but it is only a supplement feature of microblogging.
The product types may be distinguished on the basis of the following two criteria.
- One-Way or Two-Way Following Relationship
- A one-way relationship may include VIP accounts and requires lower timeliness, such as every minute updates.
- In a two-way relationship, users add each other as friends. The friend list of each user is limited as it is unlikely that the user will add tens of millions of other users as friends. The two-way relationship is more targeted and requires a higher level of timeliness, with updates at the second-level.
- Sort by Time or Recommendation
- Most feed streams are sorted by time, which is the most acceptable sorting criterion for users.
- Feed streams are sorted by recommendation when data is pushed to users based on their preferences. This situation ignores users’ follow lists and considers that users follow all other users of the product, such as TikTok and TopBuzz.
After determining the product type, determine the system design target, which refers to the maximum number of users that the system supports: 100 thousand, 1 million, 10 million, or 100 million.
System design is simple when the number of users is small. The following section describes how to design a feed stream system that supports hundreds of millions of users. To support such a massive user scale, you need to select subsystems with sufficient scalability, availability, and reliability. In a large system, an unreliable subsystem may affect the entire system.
Storage is the most critical component and is the same for all synchronization modes. User messages are stored in a repository. A repository must provide the following three functions:
- Reliable storage of user-sent messages without loss. Otherwise, users would not be able to find the posts they published to their friend feeds.
- Reading of all messages published by a user, for example, on the user’s personal page.
- Permanent storage of data.
Therefore, the repository has the following important features:
- Reliable storage of data without loss.
- Auto-scaling support to permanently store growing data.
The following table lists the two types of repositories available.
| Feature | Distributed NoSQL | Relational database (database splitting and table sharding) |
| --- | --- | --- |
| Reliability | Extremely high | High |
| Scalability | Linear | To be restructured |
| Scale-out speed | Millisecond | N/A |
| Common system | Table Store and Bigtable | MySQL and PostgreSQL |
- Reliability: Distributed unstructured (NoSQL) databases can be more reliable than relational databases. It is commonly believed that data is safer in relational databases because they have a long development history and high maturity, whereas NoSQL has not gained widespread trust because of its shorter history and more limited usage. However, distributed NoSQL systems store more data with higher reliability, which is guaranteed by triple-replica storage. Some cloud vendors now offer relational databases with improved reliability that are implemented in a similar way to NoSQL databases.
- Scalability: In a distributed NoSQL database, data is naturally distributed on multiple servers. When data on a server increases, data is automatically split in half, with half of the data being migrated to another server. This ensures the scalability of NoSQL databases. A relational database supports database splitting and table sharding during scale out.
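The automatic split described above can be sketched as follows. This is a toy model, not how any particular database implements it: each list stands in for one server, and the `max_keys_per_partition` threshold is a hypothetical parameter.

```python
from bisect import insort


class RangePartitionedTable:
    """Toy range-partitioned key set; each partition stands in for a server."""

    def __init__(self, max_keys_per_partition: int = 4):
        self.max_keys = max_keys_per_partition  # hypothetical split threshold
        self.partitions = [[]]  # sorted key lists, ordered by key range

    def put(self, key: int) -> None:
        # Pick the first partition whose largest key is >= key, else the last.
        index = len(self.partitions) - 1
        for i, part in enumerate(self.partitions):
            if part and key <= part[-1]:
                index = i
                break
        part = self.partitions[index]
        if key not in part:
            insort(part, key)
        if len(part) > self.max_keys:
            # Split in half; the upper half "migrates" to a new partition.
            mid = len(part) // 2
            self.partitions[index:index + 1] = [part[:mid], part[mid:]]
```

Because only the overfull partition is split, the other partitions are untouched, which is why this kind of scale-out does not require restructuring the whole table.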
Therefore, the following pointers summarize when to use each type of repository.
- Relational databases such as MySQL are applicable to self-built systems where the team lacks NoSQL O&M capability and the data volume is moderate.
- Distributed unstructured databases (or NoSQL databases), such as Alibaba Cloud Tablestore, are applicable to systems that are built on cloud services.
- Distributed NoSQL databases are also required to handle larger volumes of data.
Refer to the following table to understand the structural design of the repository table when an Alibaba Cloud Tablestore database is used.
| Primary key column | First primary key column | Second primary key column | Property column | Property column |
| --- | --- | --- | --- | --- |
| Column name | user_id | message_id | content | other |
| Description | The ID of the message sender. | The message ID that might be a timestamp. | Content | Other content |
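As a rough illustration of why this key layout works, the sketch below keeps rows sorted by the composite key `(user_id, message_id)`, so that reading one user's personal page becomes a range scan. It is an in-memory stand-in, not the Tablestore API; the class and method names are hypothetical.

```python
from bisect import bisect_left, insort


class RepositoryTable:
    """In-memory stand-in for a table keyed by (user_id, message_id)."""

    def __init__(self):
        self._keys = []  # sorted (user_id, message_id) primary keys
        self._rows = {}  # primary key -> property columns

    def put(self, user_id: str, message_id: int, content: str, **other) -> None:
        key = (user_id, message_id)
        if key not in self._rows:
            insort(self._keys, key)
        self._rows[key] = {"content": content, **other}

    def get(self, user_id: str, message_id: int):
        return self._rows.get((user_id, message_id))

    def messages_of(self, user_id: str, limit: int = 10) -> list:
        """Range scan over one user's rows, newest message_id first."""
        keys = []
        i = bisect_left(self._keys, (user_id,))
        while i < len(self._keys) and self._keys[i][0] == user_id:
            keys.append(self._keys[i])
            i += 1
        return [(k[1], self._rows[k]["content"]) for k in reversed(keys)][:limit]
```

Putting `user_id` first keeps all of one sender's messages adjacent, which matches the requirement to read all messages published by a user.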
The following figure shows the system architecture based on the selected repository type.
After determining the system scale, product type, and storage system type, select a synchronization mode from the following three choices:
- Push Mode: This mode immediately pushes a message to the receiver. If the receiver is offline, the message is stored in a database called the synchronization database. If a message is sent to multiple followers, it is replicated once per receiver, so the synchronization database must provide a robust and reliable write capability. After a message is delivered to the receiver’s inbox, the receiver reads it from the inbox only once, so the read path handles relatively few read requests and queries per second. In summary, the synchronization database only needs to provide a robust write capability in push mode.
- Pull Mode: This mode writes a sent message to the sender’s outbox rather than sending it to followers immediately. Once followers go online, they read the message from the sender’s outbox. A message is written only once but read as many times as there are followers. The read/write ratio in pull mode is opposite to the push mode and requires a robust read capability. The pull mode is often the primary choice for designing a feed stream system due to its adaptability, providing a better user experience, but it also has many pain points. The biggest pain point is that it requires recording the last read message for each follower. If a user has 1,000 followers, 1,000 read position records are maintained for this user after the user sends a message. The number of records is proportional to the number of followers and therefore will far exceed the number of users. As such, the pull mode is only applicable to scenarios with a small data volume.
- Push-Pull Mode: In a one-way relationship with VIP accounts, a message may be sent to millions of followers in push mode. However, more than half of these followers may be zombie users who never go online, so pushing to them wastes resources. Pure pull mode, in turn, makes the system architecture complex, with a large number of read position records that may become the primary point of failure as users increase. In push-pull mode, the messages of most users are delivered in push mode, and only the messages of VIP accounts are delivered in pull mode. This controls resource waste and simplifies the overall system design.
However, the push-pull mode requires a system design that is more complex than that in push mode.
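The sketch below shows one way the three modes can fit together. It assumes the common arrangement in which ordinary accounts fan out on write (push) and only VIP accounts are read on demand (pull). The `VIP_THRESHOLD` cutoff, the class name, and the `writes` counter are illustrative assumptions, not part of any real system.

```python
from collections import defaultdict

VIP_THRESHOLD = 3  # hypothetical follower count above which an account is a VIP


class FeedSync:
    def __init__(self):
        self.followers = defaultdict(set)  # sender -> followers
        self.following = defaultdict(set)  # follower -> senders
        self.inbox = defaultdict(list)     # receiver -> pushed (seq, sender, msg)
        self.outbox = defaultdict(list)    # sender -> own (seq, sender, msg)
        self.seq = 0
        self.writes = 0                    # storage writes, to compare the modes

    def follow(self, follower: str, sender: str) -> None:
        self.followers[sender].add(follower)
        self.following[follower].add(sender)

    def is_vip(self, sender: str) -> bool:
        return len(self.followers[sender]) >= VIP_THRESHOLD

    def publish(self, sender: str, msg: str) -> None:
        self.seq += 1
        entry = (self.seq, sender, msg)
        self.outbox[sender].append(entry)  # written once in every mode
        self.writes += 1
        if not self.is_vip(sender):
            # Push: replicate the entry into each follower's inbox.
            for receiver in self.followers[sender]:
                self.inbox[receiver].append(entry)
                self.writes += 1

    def read(self, user: str) -> list:
        # Pull: fetch VIP outboxes at read time and merge with the inbox.
        pulled = [e for s in self.following[user] if self.is_vip(s)
                  for e in self.outbox[s]]
        # The set union removes entries that were pushed before an account
        # crossed the VIP threshold and would otherwise appear twice.
        return sorted(set(self.inbox[user]) | set(pulled))
```

In this sketch a VIP post costs one write regardless of follower count, while an ordinary post costs one write per follower plus one for the outbox.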
The following table compares the three modes:
| Type | Push mode | Pull mode | Push-pull mode |
| --- | --- | --- | --- |
| Push mode (write fan-out) | High | N/A | Medium |
| Pull mode (read fan-out) | N/A | High | Medium |
| User read latency | Millisecond | Seconds | Seconds |
| Read/Write ratio | 1:99 | 99:1 | About 50:50 |
| System requirement | Robust write capability | Robust read capability | Moderate read and write capabilities |
| Common system | Distributed NoSQL with the LSM architecture, such as Table Store and Bigtable | Cache systems or search systems, such as Redis and Memcached (applicable to sorting by recommendation) | Combination of the two system types |
| Architecture complexity | Simple | Complex | More complex |
Now here’s a quick summary of the scenarios and modes of synchronization that we discussed above:
- The push mode is applicable to two-way relationships.
- The push mode is also applicable to one-way relationships with fewer than 10 million users.
- The push-pull mode is applicable to one-way relationships with more than 10 million users. The push-pull mode evolves directly from the push mode, with no additional rework.
- Do not rely on the pull mode alone.
- If you are building a startup business, apply the push mode to system design, and then perform product verification and iteration. Consider upgrading to the push-pull mode when the number of your customers increases to 10 million.
- The system architecture is totally different in scenarios that use sorting by recommendation. We will describe this in subsequent articles.
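The rules of thumb in this summary can be written down as a small decision function. This is only a sketch of the guidance above: the function name and the string labels are hypothetical, and the 10 million threshold is the figure quoted in this article rather than a hard limit.

```python
def choose_sync_mode(relationship: str, users: int, sort_by: str = "time") -> str:
    """Return a suggested synchronization mode for a feed stream product."""
    if sort_by != "time":
        # Recommendation-based feeds use a different architecture entirely.
        return "recommendation architecture (out of scope here)"
    if relationship == "two-way":
        return "push"
    if relationship == "one-way":
        # Start with push; move to push-pull once the user base is very large.
        return "push" if users < 10_000_000 else "push-pull"
    raise ValueError(f"unknown relationship type: {relationship!r}")
```

For example, `choose_sync_mode("one-way", 500_000)` suggests starting with push mode, which matches the advice for startups.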
If you are using an Alibaba Cloud Tablestore database, refer to the following structural design of the synchronization database table:
| Primary key column | First primary key column | Second primary key column | Property column | Property column | Property column |
| --- | --- | --- | --- | --- | --- |
| Column name | user_id | sequence_id | sender_id | message_id | other |
| Description | The ID of the message receiver. | The ID of the message, which may be in the format of timestamp+send_user_id, or use the auto-increment column of Table Store. | The ID of the message sender. | The value of the message_id column in store_table, that is, the message ID. You can query the message content in store_table based on the sender_id and message_id. | Other message content that is not included in the synchronization database. |
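The following sketch shows how a receiver's rows might be read incrementally by `sequence_id` and joined with `store_table` for the content. It is an in-memory model that uses a per-receiver counter in place of an auto-increment column; the class and method names are hypothetical.

```python
from collections import defaultdict
from itertools import count


class SyncTable:
    """In-memory stand-in for the synchronization table."""

    def __init__(self, store_table: dict):
        # store_table maps (sender_id, message_id) -> message content.
        self.store_table = store_table
        # user_id -> [(sequence_id, sender_id, message_id)], in arrival order.
        self.rows = defaultdict(list)
        # One increasing counter per receiver (stands in for auto-increment).
        self._next = defaultdict(lambda: count(1))

    def push(self, user_id: str, sender_id: str, message_id: int) -> int:
        sequence_id = next(self._next[user_id])
        self.rows[user_id].append((sequence_id, sender_id, message_id))
        return sequence_id

    def read_after(self, user_id: str, last_sequence_id: int = 0) -> list:
        """Return (sequence_id, sender_id, content) newer than the read position."""
        result = []
        for sequence_id, sender_id, message_id in self.rows[user_id]:
            if sequence_id > last_sequence_id:
                content = self.store_table.get((sender_id, message_id))
                result.append((sequence_id, sender_id, content))
        return result
```

Only IDs are stored per receiver, so the message content lives in one place and is looked up at read time.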
A complete feed stream system includes not only the basic synchronization and storage functions but also metadata. The following sections describe how metadata is processed.
The metadata in a feed stream system includes:
- User details and user list
- Following relationships or friendships
- Session push pool
Next, we will introduce the three types of metadata individually.
User Details and User List
User details include custom user properties and attached system properties, which are queried by the user ID. User details can be stored in a distributed NoSQL system or a relational database.
If you are using Alibaba Cloud’s NoSQL database product Tablestore, consider the structural design of the user details given below:
| Primary key sequence | First primary key column | Property column-1 | Property column-2 | …… |
| --- | --- | --- | --- | --- |
| Field | user_id | nick_name | gender | other |
| Description | The primary key column used to uniquely identify a user. | User nickname, which is a custom user property. | User gender, which is a custom user property. | Other properties, including the custom user property column and the attached system property column. Table Store is free of schema, in which a new column can be added to any row without impact on the existing data. |
Following Relationship or Friendship
This section describes how to store relationships. Design the system to support queries of following lists, follower lists, and friend lists through an indexing capability. Such queries involve multiple property columns. The storage system might be a relational database or an unstructured database.
- If the data volume is small, select an available relational database, such as MySQL.
- If the data volume is large, consider the following two options:
- Use a distributed relational database to support distributed transactions.
- Use a system with the indexing feature, such as Alibaba Cloud Tablestore, for easy operations, higher throughput, and improved scalability.
Refer to the following structural design of the relationship table when Tablestore is used.

| Primary key sequence | First primary key column | Second primary key column | Property column | Property column |
| --- | --- | --- | --- | --- |
| Table field | user_id | follow_user_id | timestamp | other |
| Description | The user ID. | The follower ID. | The follow time. | Other property columns. |
Consider the following search index structure:
Take note of the following critical points during a query:
- To query the follower list of a user, match a fixed user_id value by using TermQuery, and sort the followers by timestamp.
- To query the following list of a user, match a fixed follow_user_id value by using TermQuery, and sort the followed accounts by timestamp.
- Data written to a table becomes queryable through search indexes after a latency of five to ten seconds. The latency will be optimized to be less than two seconds. Also, you may use Global indexes for queries.
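The two queries above can be illustrated with a small in-memory model. The `_term_query` helper below only imitates an exact-match query followed by a sort; it is not the Tablestore SDK, and the class and method names are hypothetical.

```python
class RelationTable:
    """Rows of (user_id, follow_user_id, timestamp); follow_user_id is the follower."""

    def __init__(self):
        self.rows = []

    def add(self, user_id: str, follow_user_id: str, timestamp: int) -> None:
        self.rows.append({"user_id": user_id,
                          "follow_user_id": follow_user_id,
                          "timestamp": timestamp})

    def _term_query(self, column: str, value: str) -> list:
        # Exact match on one column, then sort the matches by follow time.
        matches = [r for r in self.rows if r[column] == value]
        return sorted(matches, key=lambda r: r["timestamp"])

    def follower_list(self, user_id: str) -> list:
        # Who follows this user: fix user_id, return the follower IDs.
        return [r["follow_user_id"] for r in self._term_query("user_id", user_id)]

    def following_list(self, user_id: str) -> list:
        # Whom this user follows: fix follow_user_id, return the followed IDs.
        return [r["user_id"] for r in self._term_query("follow_user_id", user_id)]
```

A friend list for a two-way product could be derived by intersecting the two lists, although this sketch does not include that step.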
Session Push Pool
Without a session push pool, receivers perceive incoming new messages from senders only through periodic refreshing on the client side. In this case, the load of read requests grows as the number of clients increases. A query storm can occur when a platform publishes breaking news and many hibernating devices come online, pushing the device online rate far above the usual 20% to 30%. As a result, the system may stop responding and become unavailable to all users.
Therefore, one solution is the maintenance of a session push pool on the server to record a list of online users. After User A sends a message to User B, the server writes the message to the repository and synchronization database and notifies User B of the new message in User B’s session that is stored in the session push pool. The message is pushed to the client when User B accesses the session to read the message.
Alternatively, a notification is pushed to the client to instruct it to pull the new message. The session push pool is used during synchronization, and its data is essentially metadata. The pool data should be stored persistently rather than only in memory, and it only needs to support single-key queries. Therefore, you can store the pool data in a distributed NoSQL database, a relational database, or an existing system.
Refer to the structural design of the session table below while using Tablestore to design your own.
| Primary key column sequence | First primary key column | Second primary key column | Property column |
| --- | --- | --- | --- |
| Column name | user_id | device_id | last_sequence_id |
| Description | The ID of the receiver. | The ID of a device. One user may have multiple devices, which may have different read positions. The device ID is used to differentiate these read positions. This column can be ignored if the multi-client feature is not required. | The sequence ID of the latest message that has been pushed to the receiver’s client. |
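A minimal sketch of such a session push pool follows. It tracks online `(user_id, device_id)` sessions with their `last_sequence_id` and decides which devices still need a notification. The class and method names are hypothetical, and persistence is left out for brevity.

```python
class SessionPushPool:
    """Tracks online sessions and the last sequence ID pushed to each device."""

    def __init__(self):
        self.sessions = {}  # (user_id, device_id) -> last_sequence_id

    def connect(self, user_id: str, device_id: str, last_sequence_id: int = 0) -> None:
        self.sessions[(user_id, device_id)] = last_sequence_id

    def disconnect(self, user_id: str, device_id: str) -> None:
        self.sessions.pop((user_id, device_id), None)

    def devices_to_notify(self, user_id: str, new_sequence_id: int) -> list:
        """Online devices of this user that have not yet received new_sequence_id."""
        return sorted(device for (user, device), last in self.sessions.items()
                      if user == user_id and last < new_sequence_id)

    def mark_pushed(self, user_id: str, device_id: str, sequence_id: int) -> None:
        key = (user_id, device_id)
        if key in self.sessions:
            # Read positions only move forward.
            self.sessions[key] = max(self.sessions[key], sequence_id)
```

Because only online sessions are in the pool, offline users cost nothing at publish time and simply catch up from the synchronization database when they reconnect.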
All types of feed streams, typically except private messages, support the comment feature. In essence, the comment feature is similar to a repository but has an additional relationship with the commented message. Therefore, comments are grouped by the message they comment on, and comments are queried within a specific range. The query method is simple and does not require the complex transactions and Join feature of a relational database. A distributed NoSQL database is suitable for storing comments.
Follow the steps below to select a storage method:
- Reuse the existing distributed NoSQL database, such as Tablestore.
- Select a relational database, such as ApsaraDB for MySQL, if no distributed NoSQL database is available.
Consider the following structural design of the comment table while using Tablestore.

| Primary key column sequence | First primary key column | Second primary key column | Property column | Property column | Property column |
| --- | --- | --- | --- | --- | --- |
| Field | message_id | comment_id | comment_content | reply_to | other |
| Description | The ID of the message that is posted to Weibo or WeChat Moments. | The ID of a comment. | The content of a comment. | The user to which the response is returned. | Other properties. |
To search for comments, create search indexes for the table.
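Because comments are grouped by `message_id` and read within a range, paging through them is a simple range query. The sketch below models that with sorted lists; the class name, the `page` method, and its default page size are assumptions for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class CommentTable:
    """Comments keyed by (message_id, comment_id), read in comment_id order."""

    def __init__(self):
        # message_id -> [(comment_id, comment_content, reply_to)], sorted by ID.
        self._by_message = defaultdict(list)

    def add(self, message_id: int, comment_id: int, content: str, reply_to=None) -> None:
        rows = self._by_message[message_id]
        rows.append((comment_id, content, reply_to))
        rows.sort(key=lambda row: row[0])

    def page(self, message_id: int, after_comment_id: int = 0, limit: int = 2) -> list:
        """Return up to `limit` comments with comment_id > after_comment_id."""
        rows = self._by_message.get(message_id, [])
        ids = [row[0] for row in rows]
        start = bisect_right(ids, after_comment_id)
        return rows[start:start + limit]
```

A client would pass the last `comment_id` it received as `after_comment_id` to fetch the next page, and no join is needed at any point.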
The likes feature has gained popularity in recent years and is implemented similarly to the comment feature. The only difference is that a like lacks one item that a comment has (the comment content), so the two features support the same storage method.
The structural design of the likes table is the same as that of the comment table when Tablestore is used.
The following figure shows the system architecture that includes the metadata system:
The feed stream products require the search capability in the following scenarios:
- Searching for users in Weibo
- Searching for content in Weibo
- Searching for friends in WeChat
Searching for such content is based on a string match and only requires the word segmentation retrieval function rather than complex correlation algorithms.
Use the following two methods to implement content searching.
- Use a search engine to push content from the repository and user information table to the search system. The search engine is directly accessible to serve query requests.
- Use a database that supports full-text retrieval, such as the latest version of Alibaba Cloud ApsaraDB for MySQL, ApsaraDB for MongoDB, and Tablestore. Select a database on the basis of the following principles:
- If the repository is based on ApsaraDB for MySQL or Tablestore, reuse that system for search.
- If you already run a separate search system and the repository is not based on ApsaraDB for MySQL or Tablestore, reuse the existing search system. In other scenarios, do not add another search system, because doing so makes the overall system more complex.
Also, create search indexes for the corresponding table while using Tablestore.
- To support username search, create search indexes for nick_name and set it to the text type with single word segmentation.
- To support searching for feed stream content, create search indexes for the repository table store_table. This enables complex searches for feed stream content, such as conditional filtering and full-text retrieval.
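To show what string matching with word segmentation means in practice, here is a toy inverted index over `nick_name`. The tokenizer is a simplified assumption (lowercased ASCII letter and digit runs become one token each, and every non-ASCII character becomes its own token); it is not the tokenizer of any real search engine or of Tablestore.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Simplified single-word segmentation (an assumption for this sketch)."""
    return re.findall(r"[a-z0-9]+|[^\x00-\x7f]", text.lower())


class NickNameIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # token -> user IDs containing it

    def add(self, user_id: str, nick_name: str) -> None:
        for token in tokenize(nick_name):
            self.postings[token].add(user_id)

    def search(self, query: str) -> set:
        """Return the users whose nick_name contains every query token."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in tokens))
```

This sketch matches tokens without checking their order or adjacency, which is enough to illustrate retrieval without any relevance ranking.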
The following figure shows the system architecture that supports the search function.
Currently, a feed stream system supports sorting either by time or by score. Common feed stream products, such as Weibo, WeChat Moments, and private messages, belong to the timeline type and prefer real-time performance over the quality of published content. You need to follow a user before viewing the content published by that user. The published content may include useless messages. Sorting by timeline is applicable to these products. The architecture described here belongs to the timeline type.
In other types of feed stream products, the system pushes differentiated content to users based on their preferences. This system architecture is different from the timeline-based architecture and will be described in a subsequent article about recommendations.
Delete Feed Content
A feed stream system must provide a way to handle content that a user publishes and then deletes. In push mode, the message has already been replicated to followers’ inboxes, so deleting every copy one by one cannot be done promptly enough to meet legal and regulatory requirements.
The synchronization table includes only message IDs instead of message content, and the message content is read from the repository when users read messages. Therefore, once a message is deleted directly from the repository, users can no longer read any data for that message ID, which is equivalent to deleting the content quickly. Besides direct deletion, message content may also be deleted through tombstones: deleted feed content is labeled, and when the system queries labeled data, it treats that data as deleted.
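Both deletion approaches can be sketched in a few lines. Here the repository is a plain dictionary keyed by `(sender_id, message_id)` and the `deleted` flag plays the role of a tombstone; the function names and the flag name are hypothetical.

```python
def delete_message(repository: dict, sender_id: str, message_id: int,
                   use_tombstone: bool = True) -> None:
    """Delete one message from the repository, directly or via a tombstone."""
    key = (sender_id, message_id)
    if key not in repository:
        return
    if use_tombstone:
        repository[key]["deleted"] = True  # label the row instead of removing it
    else:
        del repository[key]


def read_feed(inbox: list, repository: dict) -> list:
    """Resolve inbox entries to content, skipping missing or tombstoned rows."""
    feed = []
    for sender_id, message_id in inbox:
        row = repository.get((sender_id, message_id))
        if row is None or row.get("deleted"):
            continue
        feed.append(row["content"])
    return feed
```

The inbox entries are never touched, so a single write to the repository hides the message from every receiver at once.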
Update Feed Content
The logic used to update feed content is similar to the one used to delete feed content. If you use a storage system with multi-version support, such as Table Store, you may edit versions, just as in the case of Weibo.
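As a rough illustration of editing with multiple versions, the sketch below keeps a bounded history for one message. The class name and the `max_versions` limit are assumptions; a real multi-version store would manage versions per column and usually by timestamp.

```python
class VersionedMessage:
    """Keeps the most recent versions of one message's content."""

    def __init__(self, max_versions: int = 3):
        self.max_versions = max_versions  # hypothetical retention limit
        self.versions = []                # (version_number, content), oldest first

    def put(self, content: str) -> int:
        version = self.versions[-1][0] + 1 if self.versions else 1
        self.versions.append((version, content))
        # Drop the oldest versions beyond the retention limit.
        self.versions = self.versions[-self.max_versions:]
        return version

    def latest(self):
        return self.versions[-1][1] if self.versions else None

    def history(self) -> list:
        return [content for _, content in self.versions]
```

Readers always see `latest()`, while `history()` makes an edit log available, and the synchronization table needs no change because it stores only message IDs.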
The above sections describe the features and system requirements of different sub-functions. Two types of systems meet requirements: a single system based on Alibaba Cloud Tablestore and a combined system based on open-source components.
- A combined system is built from open-source components such as MySQL, Redis, and HBase. These individual systems must be combined to form a feed stream system. This type of system suits large teams that focus on open-source system development and O&M.
- A single system based on Tablestore also forms a feed stream system. Tablestore is specially designed for feed stream systems and has been applied to feed stream systems for three years. Tablestore suits the following scenarios of feed stream system development:
- The product is intended for a large user base, reaching tens of or hundreds of millions.
- Emphasis is on development rather than O&M.
- The team hopes to implement the product efficiently and quickly.
- The system automatically scales out to serve more users.
- Users are billed in pay-as-you-go mode, with the tariff in proportion to the number of users.
Architecture Case Study
The design focus of a feed stream system varies depending on different product types. Let’s take a quick look at the various product types.
WeChat Moments is a typical feed stream system that uses the push mode and has a limited number of two-way write relationships. Feed content is sorted by time. Users who keep producing useless content are added to a blacklist.
The design of feed stream systems similar to WeChat Moments is detailed in the article “System Architecture Design for WeChat Moments and Similar Systems.”
Weibo is a typical feed stream system with one-way relationships and VIP accounts, so it requires both push and pull modes. On Weibo, users actively follow other users, so feed content is sorted by time. Sorting by recommendation is relatively ineffective.
The design of feed stream systems similar to Weibo is detailed in the article “System Architecture Design for Weibo and Similar Systems.”
TopBuzz is an app that evolves from the feed stream system of Weibo and has quickly gained popularity in recent years. The system pushes content to users based on their preferences that are determined based on user-browsed content. The pushed content approximates user preferences after a period of training. Users do not need to follow other users to receive content pushes.
The design of feed stream systems similar to TopBuzz is detailed in the article “System Architecture Design for Toutiao and Similar Systems.”
Private messages may be regarded as a simple feed stream system or a variant of instant messaging (IM). Such messages are one-to-one and do not support group chats.
This article describes the general framework of a feed stream system, including product definitions, synchronization, storage, metadata, comments, likes, sorting, and search features. We hope that this article has helped you to design a feed stream system with hundreds of millions of users.