How We Developed DingTalk: Implementing the Message System Architecture

Function Modules

The function modules include message storage, relationship maintenance, instant awareness, and multi-client synchronization. Let’s quickly take a closer look at each function.

Message Storage

Message storage is the basic function of a message system. Message storage supports message read, write, and persistence to prevent message loss and ensure fast and efficient queries. In the IM scenario, data is written according to rows in batches and is read within message queues. Fuzzy search for historical messages is implemented through multidimensional retrieval and full-text retrieval.

Repository

Function: Message Display in Session Windows

The repository is a table that stores messages from chat sessions according to session type. Each session refers to a message queue. A single message queue, also called TimelineQueue, is uniquely identified by a TimelineID, and all messages are ordered by SequenceID.

public List<AppMessage> fetchConversationMessage(String timelineId, long sequenceId) {
TimelineStore store = timelineV2.getTimelineStoreTableInstance();
TimelineIdentifier identifier = new TimelineIdentifier.Builder()
.addField("timeline_id", timelineId)
.build();
ScanParameter parameter = new ScanParameter()
.scanBackward(sequenceId)
.maxCount(30);
Iterator<TimelineEntry> iterator = store.createTimelineQueue(identifier).scan(parameter); List<AppMessage> appMessages = new LinkedList<AppMessage>();
while (iterator.hasNext() && counter++ <= 30) {
TimelineEntry timelineEntry = iterator.next();
AppMessage appMessage = new AppMessage(timelineId, timelineEntry);
appMessages.add(appMessage);
}
return appMessages;
}

Functions: Multidimensional Combination and Full-text Retrieval

The full-text retrieval capability supports message fuzzy query in the repository based on the search indexes of stored data. You can design search indexes based on your specific requirements. For example, enable the fuzzy query of messages in a DingTalk group by creating search indexes for the group ID, message sender, message type, message content, and time. The message content must pertain to the segmented string type.

public List<AppMessage> fetchConversationMessage(String timelineId, long sequenceId) {
TimelineStore store = timelineV2.getTimelineStoreTableInstance();
TimelineIdentifier identifier = new TimelineIdentifier.Builder()
.addField("timeline_id", timelineId)
.build();
ScanParameter parameter = new ScanParameter()
.scanBackward(sequenceId)
.maxCount(30);
Iterator<TimelineEntry> iterator = store.createTimelineQueue(identifier).scan(parameter); List<AppMessage> appMessages = new LinkedList<AppMessage>();
int counter = 0;
while (iterator.hasNext() && counter++ <= 30) {
TimelineEntry timelineEntry = iterator.next();
AppMessage appMessage = new AppMessage(timelineId, timelineEntry);
appMessages.add(appMessage);
}
return appMessages;
}

Synchronization Database

Function: Real-time Statistics on New Messages

The application system service can detect the online state of a client by maintaining a persistent connection with the client. The application will send a signal to notify the client when a new message is being written to the synchronization database of the user. Then, the client will pull all new messages after the SequenceID from the synchronization database based on the checkpoint of the database, count the number of new messages in each session, and update the checkpoint.

public List<AppMessage> fetchSyncMessage(String userId, long lastSequenceId) {
TimelineStore sync = timelineV2.getTimelineSyncTableInstance();
TimelineIdentifier identifier = new TimelineIdentifier.Builder()
.addField("timeline_id", userId)
.build();
ScanParameter parameter = new ScanParameter()
.scanForward(lastSequenceId)
.maxCount(30);
Iterator<TimelineEntry> iterator = sync.createTimelineQueue(identifier).scan(parameter); List<AppMessage> appMessages = new LinkedList<AppMessage>();
int counter = 0;
while (iterator.hasNext() && counter++ <= 30) {
AppMessage appMessage = new AppMessage(userId, iterator.next());
appMessages.add(appMessage);
}
return appMessages;
}

Function: Asynchronous Write Divergence

This example includes writing the messages of a one-on-one chat session to the repository and synchronization database simultaneously, with a small overhead that occupies two lines. After writing the messages in a group chat to the repository, the system obtains the group member list and writes the messages of each member in sequence to the synchronization database. As a best pratice, only apply this method to scenarios with a small number of groups and users.

Metadata Management

Metadata is a type of data description that’s divided into user metadata and session metadata. Group metadata, including the group ID (that is, the TimelineID of the group), group name, and creation time, is based on the management table of TimelineMeta. Map all group-type TimelineMeta to a group. User metadata is based on a separate table because it does not reuse TimelineMeta.

User Metadata

User metadata describes user properties. A user ID identifies a specific user. User properties in a user relationship, such as gender, signature, and display picture, are maintained separately.

Function: User Retrieval

Just maintain a user table and create search indexes to allow users to add friends. The example does not show how to implement this function. Set different index fields as needed. Refer to the following requirement analysis:

  • QR code (including the user_id): The primary key query
  • User name: The search index, segmented string setting for the user name field
  • User tag: The search index, tag retrieval for array string indexes, multi-tag score retrieval and sorting for nested indexes
  • People nearby: The search index, a query of people nearby and people in specific geo-fences through GEO

Session Metadata

Session metadata is the same as session properties, including the session category (which may be a group chat, one-on-one chat, or official account), group name, announcement, and creation time. A unique ID is an ID that identifies a session. Fuzzy group search by group name is an important capability required by session metadata.

Function: Group Retrieval

A user who wants to join a group must first search for the group. Implement a group search in the same way as a user search and customize different index fields as needed.

  • QR code (including the user_id): The primary key query
  • Group name: The search index, segmented string setting for the user name field
  • Group tag: The search index, tag retrieval for array string indexes, multi-tag score retrieval and sorting for nested indexes

Relationship Maintenance

You can implement the functions of adding friends and joining group chats after you have managed metadata and retrieving users and groups. This part is related to an important function of the IM system, maintaining relationships such as interpersonal relationships, user-group relationships, and user-session relationships. The following sections describe how to maintain these relationships based on Alibaba Cloud Tablestore.

One-on-one Chat Relationship

Function: Relationship Between Users and One-on-One Chat Sessions

A single-chat session has two participants who are sorted in no particular order. Only one session initiates for contact from A to B or from B to A. The following design is recommended for maintaining the relationship between users and single-chat sessions based on Tablestore.

Function: Interpersonal Relationship

The interpersonal relationship is easy to implement based on the im_user_relation_table table. For example, you can check whether A and B are friends through a single-row query. If they are not friends, the Add Friend button is available. When A adds B as a friend or vice versa, the system writes two rows with different primary key orders and creates a unique TimelineID.

public void establishFriendship(String userA, String userB, String timelineId) {
PrimaryKey primaryKeyA = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userA))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.fromString(userB))
.build();
RowPutChange rowPutChangeA = new RowPutChange(userRelationTable, primaryKeyA);
rowPutChangeA.addColumn("timeline_id", ColumnValue.fromString(timelineId));
PrimaryKey primaryKeyB = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userB))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.fromString(userA))
.build();
RowPutChange rowPutChangeB = new RowPutChange(userRelationTable, primaryKeyB);
rowPutChangeB.addColumn("timeline_id", ColumnValue.fromString(timelineId));
BatchWriteRowRequest request = new BatchWriteRowRequest();
request.addRowChange(rowPutChangeA);
request.addRowChange(rowPutChangeB);
syncClient.batchWriteRow(request);
}
public void breakupFriendship(String userA, String userB) {
PrimaryKey primaryKeyA = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userA))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.fromString(userB))
.build();
RowDeleteChange rowPutChangeA = new RowDeleteChange(userRelationTable, primaryKeyA); PrimaryKey primaryKeyB = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userB))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.fromString(userA))
.build();
RowDeleteChange rowPutChangeB = new RowDeleteChange(userRelationTable, primaryKeyB); BatchWriteRowRequest request = new BatchWriteRowRequest();
request.addRowChange(rowPutChangeA);
request.addRowChange(rowPutChangeB);
syncClient.batchWriteRow(request);
}

Group Chat Relationship

Function: Relationship Between Group Chat Sessions and Group Members

The frequent query operation during a group chat helps to obtain the list of the current group members. After obtaining the group member list, the application displays the group properties and queries the recipient list for write divergence.

public List<Conversation> listMySingleConversations(String userId) {
PrimaryKey start = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userId))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.INF_MIN)
.build();
PrimaryKey end = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("main_user", PrimaryKeyValue.fromString(userId))
.addPrimaryKeyColumn("sub_user", PrimaryKeyValue.INF_MAX)
.build();
RangeRowQueryCriteria criteria = new RangeRowQueryCriteria(userRelationTable);
criteria.setInclusiveStartPrimaryKey(start);
criteria.setExclusiveEndPrimaryKey(end);
criteria.setMaxVersions(1);
criteria.setLimit(100);
criteria.setDirection(Direction.FORWARD);
criteria.addColumnsToGet(new String[] {"timeline_id"});
GetRangeRequest request = new GetRangeRequest(criteria);
GetRangeResponse response = syncClient.getRange(request);
List<Conversation> singleConversations = new ArrayList<Conversation>(response.getRows().size()); for (Row row : response.getRows()) {
String timelineId = row.getColumn("timeline_id").get(0).getValue().asString();
String subUserId = row.getPrimaryKey().getPrimaryKeyColumn("sub_user").getValue().asString();
User friend = describeUser(subUserId);
Conversation conversation = new Conversation(timelineId, friend); singleConversations.add(conversation);
}
return singleConversations;
}
public List<Conversation> listMyGroupConversations(String userId) {
PrimaryKey start = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("user_id", PrimaryKeyValue.fromString(userId))
.addPrimaryKeyColumn("group_id", PrimaryKeyValue.INF_MIN)
.build();
PrimaryKey end = PrimaryKeyBuilder.createPrimaryKeyBuilder()
.addPrimaryKeyColumn("user_id", PrimaryKeyValue.fromString(userId))
.addPrimaryKeyColumn("group_id", PrimaryKeyValue.INF_MAX)
.build();
RangeRowQueryCriteria criteria = new RangeRowQueryCriteria(groupRelationGlobalIndex);
criteria.setInclusiveStartPrimaryKey(start);
criteria.setExclusiveEndPrimaryKey(end);
criteria.setMaxVersions(1);
criteria.setLimit(100);
criteria.setDirection(Direction.FORWARD);
criteria.addColumnsToGet(new String[] {"group_id"});
GetRangeRequest request = new GetRangeRequest(criteria);
GetRangeResponse response = syncClient.getRange(request);
List<Conversation> groupConversations = new ArrayList<Conversation>(response.getRows().size()); for (Row row : response.getRows()) {
String timelineId = row.getPrimaryKey().getPrimaryKeyColumn("group_id").getValue().asString();
Group group = describeGroup(timelineId);
Conversation conversation = new Conversation(timelineId, group); groupConversations.add(conversation);
}
return groupConversations;
}

Instant Awareness

Session Pool Solution

The instant awareness of new messages is the core of IM scenarios. It allows clients to promptly perceive new messages and pull the latest messages from the synchronization database after receiving reminders. In this way, users can read new messages promptly. However, as a part of this, we need to solve the problem of how to promptly notify recipients of new messages.

More Information

Multi-client Synchronization

The above functions meet the basic requirements of an IM system. But, before we get too ahead of ourselves, it’s important to note the following two points about multi-client data synchronization.

  1. In the multi-client scenario, when a user sends a new message from a client, other clients do not perceive or refresh their sent messages if they have no other new messages. So, this challenge must be solved for multi-client synchronization. One simple solution is to write the sent message to the corresponding synchronization database. When counting unread messages, the client does not count its messages but updates the messages to the latest message digest. Therefore, this solution also solves the preceding problem in multi-client synchronization.

Add Friends and Apply for Joining Groups

You need to review the requests for adding friends or joining groups before serving such requests. Therefore, as you can see, only the reviewer is authorized to initiate relationship creation.

Practice

This article has described the functions of the IM system and how to implement these functions based on Tablestore. The complete sample code is open-source. Therefore, you may go through the code along with this article.

  • Obtain the AccessKey ID and AccessKey secret
  • Set the sample profile
  • The instance supports the secondary index

Open-source Code URL

The functions of the IM system may implement on the basis of Tablestore. The complete sample code is open-source. You can find it here.

Sample Configuration

Create the tablestoreCong.json file in the home directory and set parameters as follows:

# mac 或 linux系统下:/home/userhome/tablestoreCong.json
# windows系统下: C:\Documents and Settings\%用户名%\tablestoreCong.json
{
"endpoint": "http://instanceName.cn-hangzhou.ots.aliyuncs.com",
"accessId": "***********",
"accessKey": "***********************",
"instanceName": "instanceName"
}

Sample Portal

The sample provides three portals, which must execute in sequence. Be sure to release resources after using the portals to avoid unnecessary fees.

Original Source:

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store