Implementation of Message Push and Storage Architectures of Modern IM Systems

Preface

  • Message synchronization: Delivering complete messages from the sender to the recipient quickly. The most important metrics of a message synchronization system are the timeliness and integrity of delivered messages and the message size it can support. In terms of functionality, an IM system must at least support online and offline message push. Advanced IM systems also support “multi-terminal synchronization”.
  • Message storage: The persistent storage of messages. This means storage in the cloud rather than local storage on the client, and it underpins the so-called “message roaming” function. The advantage of “message roaming” is that you can log on to your account from any terminal and view all historical messages. This is also one of the distinguishing features of advanced IM systems.

Architecture Design

Comparison between Conventional and Modern Architectures

Timeline Model

  • Each message has a sequence ID (SeqId), and a message nearer the rear of the queue always has a greater SeqId than any message nearer the front. The SeqId therefore increases strictly over time, but it does not have to increase consecutively: gaps between adjacent SeqIds are allowed.
  • New messages are always added to the end of a queue, ensuring that the SeqId of the new message is always greater than that of existing messages.
  • We can either read a specific message based on the SeqId, or read all messages within a given range.
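
As a minimal sketch of the three properties above, the following Python class models a timeline in memory. The class name `Timeline`, its method names, and the `step` parameter are assumptions made for illustration; they are not part of any real storage API.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Timeline:
    """Hypothetical in-memory timeline: an append-only queue ordered by SeqId."""
    seq_ids: List[int] = field(default_factory=list)
    bodies: List[str] = field(default_factory=list)

    def append(self, body: str, step: int = 1) -> int:
        # New messages always go to the end of the queue. The new SeqId is
        # strictly greater than every existing one; a step above 1 leaves a
        # gap, which the model allows.
        if step < 1:
            raise ValueError("step must be at least 1 so that SeqIds keep increasing")
        last = self.seq_ids[-1] if self.seq_ids else 0
        seq_id = last + step
        self.seq_ids.append(seq_id)
        self.bodies.append(body)
        return seq_id

    def get(self, seq_id: int) -> Optional[str]:
        # Point read: binary search works because seq_ids is sorted.
        i = bisect.bisect_left(self.seq_ids, seq_id)
        if i < len(self.seq_ids) and self.seq_ids[i] == seq_id:
            return self.bodies[i]
        return None

    def scan(self, start: int, end: int) -> List[Tuple[int, str]]:
        # Range read: every message with start <= SeqId < end, in order.
        lo = bisect.bisect_left(self.seq_ids, start)
        hi = bisect.bisect_left(self.seq_ids, end)
        return list(zip(self.seq_ids[lo:hi], self.bodies[lo:hi]))
```

Because SeqIds are sorted, both the point read and the range read reduce to binary searches over the same list, which is why ordering by SeqId matters more than keeping the IDs consecutive.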

Message Storage Model

Message Synchronization Model

  • Pull mode: In the message storage model, all messages of a session are saved in the timeline of the session. In pull mode message synchronization, a new message in a session is written once, to the storage timeline, and the receiving terminal pulls it from there. The advantage is that a message only needs to be written once. This greatly reduces the number of message writes in comparison with push mode, especially for group chat messages. The drawback is equally clear: the logic for a receiving terminal to pull messages can be relatively complicated and inefficient. The receiving terminal must pull from every session it belongs to in order to get all messages, so reads are amplified. Many of these reads are wasted, because not every session has new messages.
  • Push mode: In push mode, message synchronization uses additional timelines on top of the storage timeline of each session. Usually, each receiving terminal has an independent synchronization timeline that stores all messages to be synchronized to that terminal. Messages in each session are therefore written to both the message storage timeline and the message synchronization timelines. In a one-to-one chat, a message is written two additional times: apart from being written to the storage timeline of the session, it must be written to the synchronization timelines of both participants. In a group chat, the writes are amplified further. If a group has N participants, each message must be written N+1 times in total (once to the storage timeline and once to each of the N synchronization timelines). The outstanding advantage of push mode is that the synchronization logic at the receiving terminal is very simple: the terminal only needs to pull messages from its own synchronization timeline once. This greatly reduces the read pressure of message synchronization. The drawback is that message writes are amplified, especially for group chats.
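
The trade-off between the two modes can be sketched by counting writes and reads. The code below is an illustrative in-memory model only: the class names `PullModeStore` and `PushModeStore`, their methods, and the use of list offsets as cursors are assumptions, and one synchronization timeline per member stands in for one per receiving terminal.

```python
from collections import defaultdict


class PullModeStore:
    """Pull mode: one write per message, one read per session on sync."""

    def __init__(self):
        self.session_timelines = defaultdict(list)
        self.writes = 0

    def send(self, session_id, message):
        # The message is written once, to the session's storage timeline.
        self.session_timelines[session_id].append(message)
        self.writes += 1

    def sync(self, session_ids, cursors):
        # The receiver reads every session it belongs to, even those that
        # have no new messages, so reads grow with the number of sessions.
        reads, new_messages = 0, []
        for sid in session_ids:
            reads += 1
            timeline = self.session_timelines[sid]
            new_messages.extend(timeline[cursors.get(sid, 0):])
            cursors[sid] = len(timeline)
        return new_messages, reads


class PushModeStore:
    """Push mode: N+1 writes per message, one read per receiver on sync."""

    def __init__(self):
        self.session_timelines = defaultdict(list)
        self.sync_timelines = defaultdict(list)
        self.writes = 0

    def send(self, session_id, members, message):
        # One write to the storage timeline of the session...
        self.session_timelines[session_id].append(message)
        self.writes += 1
        # ...plus one write to the synchronization timeline of each member.
        for member in members:
            self.sync_timelines[member].append((session_id, message))
            self.writes += 1

    def sync(self, member, cursor):
        # A single read of the member's own synchronization timeline.
        timeline = self.sync_timelines[member]
        return timeline[cursor:], len(timeline)
```

For a three-member group, one message costs one write in pull mode and four writes (N+1) in push mode. A receiver that belongs to five sessions performs five reads per sync in pull mode, against a single read in push mode.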

Message Database Design

  • Message synchronization database: The message synchronization database stores all message synchronization timelines. Each timeline corresponds to one receiving terminal and is mainly used for push mode message synchronization. This database does not have to keep messages permanently, because the lifecycle of a message ends once it has been synchronized to all terminals, and it could then be deleted immediately. However, as mentioned before, a simple multi-terminal message synchronization system does not store the synchronization status of each receiving terminal on the server; synchronization is driven by the terminals themselves. In this case, the server does not know when a message can be deleted. The common practice is to set a fixed lifecycle, for example one week or one month, for messages stored in this database. A message is deleted when its lifecycle ends.
  • Message storage database: The message storage database stores timelines of all sessions, and each timeline holds all messages of a session. This database is mainly used to pull all historical messages of a session during message roaming. It can also be used in the pull mode message synchronization.
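
The fixed-lifecycle practice for the synchronization database can be sketched as follows. This is a simplified in-memory model, not the API of any real database: `SyncTimeline`, `pull_after`, and `purge_expired` are hypothetical names, timestamps are passed in explicitly as seconds, and entries are assumed to be appended in non-decreasing time order.

```python
from collections import deque


class SyncTimeline:
    """Hypothetical synchronization timeline with a fixed message lifecycle."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.entries = deque()  # (written_at, seq_id, body), oldest first
        self.next_seq = 1

    def append(self, body, now):
        seq_id = self.next_seq
        self.next_seq += 1
        self.entries.append((now, seq_id, body))
        return seq_id

    def purge_expired(self, now):
        # The server does not track what each terminal has already pulled,
        # so it deletes purely by age: anything at least ttl_seconds old.
        removed = 0
        while self.entries and now - self.entries[0][0] >= self.ttl_seconds:
            self.entries.popleft()
            removed += 1
        return removed

    def pull_after(self, last_seq, now):
        # The terminal drives synchronization by sending the last SeqId it
        # has seen; the server returns every unexpired message after it.
        self.purge_expired(now)
        return [(seq, body) for (_, seq, body) in self.entries if seq > last_seq]
```

With this design a terminal that stays offline longer than the lifecycle can no longer catch up from its synchronization timeline and would have to fall back to the message storage database.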

Database Selection

Architecture Implementation

Postscript

  1. How does Table Store ensure high reliability and high availability?
  2. How does Table Store implement cross-region disaster tolerance?

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com
