In-depth Analysis of Redis Cluster Gossip Protocol

Protocol Analysis

The cluster gossip protocol is defined by the clusterMsg structure. Its source code is as follows:

typedef struct {
    char sig[4];        /* Signature "RCmb" (Redis Cluster message bus). */
    uint32_t totlen;    /* Total length of this message */
    uint16_t ver;       /* Protocol version, currently set to 1. */
    uint16_t port;      /* TCP base port number. */
    uint16_t type;      /* Message type */
    uint16_t count;     /* Only used for some kind of messages. */
    uint64_t currentEpoch;  /* The epoch accordingly to the sending node. */
    uint64_t configEpoch;   /* The config epoch if it's a master, or the last
                               epoch advertised by its master if it is a
                               slave. */
    uint64_t offset;    /* Master replication offset if node is a master or
                           processed replication offset if node is a slave. */
    char sender[CLUSTER_NAMELEN]; /* Name of the sender node */
    unsigned char myslots[CLUSTER_SLOTS/8];
    char slaveof[CLUSTER_NAMELEN];
    char myip[NET_IP_STR_LEN];    /* Sender IP, if not all zeroed. */
    char notused1[34];  /* 34 bytes reserved for future usage. */
    uint16_t cport;      /* Sender TCP cluster bus port */
    uint16_t flags;      /* Sender node flags */
    unsigned char state; /* Cluster state from the POV of the sender */
    unsigned char mflags[3]; /* Message flags: CLUSTERMSG_FLAG[012]_... */
    union clusterMsgData data;
} clusterMsg;
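
Two fields of this header lend themselves to a small illustration: sig, which must hold the four bytes "RCmb", and myslots, a bitmap with one bit per hash slot. The following is a minimal sketch rather than the Redis implementation. The helper names sig_is_valid, slot_is_set and slot_set are hypothetical, and CLUSTER_SLOTS is assumed to be 16384.

```c
#include <string.h>

#define CLUSTER_SLOTS 16384  /* assumed number of hash slots */

/* Hypothetical helper: return 1 if the 4-byte signature is "RCmb". */
static int sig_is_valid(const char sig[4]) {
    return memcmp(sig, "RCmb", 4) == 0;
}

/* Hypothetical helper: return 1 if `slot` is set in a myslots-style
 * bitmap (one bit per slot, least significant bit first in each byte). */
static int slot_is_set(const unsigned char *bitmap, int slot) {
    return (bitmap[slot >> 3] >> (slot & 7)) & 1;
}

/* Hypothetical helper: mark `slot` as owned in the bitmap. */
static void slot_set(unsigned char *bitmap, int slot) {
    bitmap[slot >> 3] |= (unsigned char)(1 << (slot & 7));
}
```

With a zeroed array of CLUSTER_SLOTS/8 bytes, calling slot_set(slots, 16383) sets the top bit of the last byte, which is why the bitmap needs exactly 2048 bytes to cover every slot.
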
  • nodename:
  • ping_sent: the time at which the sender node most recently sent a ping to this node. Once the pong response is received, ping_sent is reset to 0
  • pong_received: the time at which the sender node most recently received a pong from this node
  • ip:
  • port:
  • cport:
  • flags: the same kind of flags as the flags field of clusterMsg, except that they describe the other node rather than the sender
  • configEpoch: the sender node configEpoch that is saved in the receiver node
  • nodename: the sender node nodename that is saved in the receiver node
  • slots: the sender node slots list that is saved in the receiver node
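
The relationship between ping_sent and pong_received described above can be sketched as follows. This is an illustrative model only, assuming timestamps are plain integers; the peer_times type and the three helper functions are hypothetical names, not Redis APIs.

```c
#include <stdint.h>

/* Hypothetical, simplified record one node keeps about a peer. */
typedef struct {
    uint32_t ping_sent;      /* time of the pending ping, 0 if none */
    uint32_t pong_received;  /* time of the most recent pong */
} peer_times;

/* Record that a ping was sent to the peer at time `now`. */
static void on_ping_sent(peer_times *p, uint32_t now) {
    p->ping_sent = now;
}

/* Record a pong from the peer: store its time and clear ping_sent. */
static void on_pong_received(peer_times *p, uint32_t now) {
    p->pong_received = now;
    p->ping_sent = 0;
}

/* How long a ping has been waiting for its pong, 0 if none is pending. */
static uint32_t ping_pending_for(const peer_times *p, uint32_t now) {
    return p->ping_sent == 0 ? 0 : now - p->ping_sent;
}
```

Because ping_sent is cleared on every pong, a non-zero value always means that a ping is still unanswered, and its age can be compared against a timeout.
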

Operative Mechanism

The cluster relies on the gossip protocol for important functions such as synchronizing state updates across the cluster and electing a new master for automatic failover.

Handshake Coupling

After a client sends the "cluster meet node Y" request to node X, node X actively tries to establish a connection with node Y. At this time, the state of node Y that is saved in node X is:

  • CLUSTER_NODE_HANDSHAKE: indicates that node Y is in the handshake state. This flag is cleared only after node X receives a ping, pong or meet message from node Y
  • CLUSTER_NODE_MEET: indicates that a meet message has not yet been sent to node Y. This flag is cleared once the message has been sent, whether or not the send succeeds
  • The node names are the same:
  • The node names are different:
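
The life cycle of the two handshake flags described above can be sketched as simple bit operations. This is a minimal sketch under stated assumptions: the NODE_HANDSHAKE and NODE_MEET bit values are illustrative, and the three function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative bit values for the two flags discussed above. */
#define NODE_HANDSHAKE (1 << 5)
#define NODE_MEET      (1 << 7)

/* Node X learns about node Y through "cluster meet": both flags are set. */
static uint16_t start_handshake(uint16_t flags) {
    return (uint16_t)(flags | NODE_HANDSHAKE | NODE_MEET);
}

/* A meet message was sent to Y: MEET is cleared whether or not it succeeded. */
static uint16_t on_meet_sent(uint16_t flags) {
    return (uint16_t)(flags & ~NODE_MEET);
}

/* A ping, pong or meet arrived from Y: the handshake is over. */
static uint16_t on_message_from_peer(uint16_t flags) {
    return (uint16_t)(flags & ~NODE_HANDSHAKE);
}
```

Calling the three functions in this order on a flags value of 0 first sets both bits, then leaves only NODE_HANDSHAKE, and finally returns to 0.
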

Health Detection and Failover

For the specifics, refer to the article “Understanding the Failover Mechanism of Redis Cluster”.

Status Update and Conflict Resolution

When two masters appear, how does the gossip protocol handle the conflict?

  • configEpoch: each shard has a unique epoch value, and the master and its slaves within a shard should have matching epochs
  • currentEpoch: the current epoch of the cluster, equal to the largest shard epoch in the cluster
  • If the sender claims to be a master but the receiver has it marked as a slave, the receiver re-marks the sender as a master in its cluster view.
  • If the sender claims to be a slave but the receiver has it marked as a master, the receiver re-marks the sender as a slave in its cluster view, attaches it to the master reported by the sender, and deletes the sender's slot information from its cluster view.
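
The two role-reconciliation rules above can be sketched as follows. This is a simplified model, not the Redis source: the node_view type, the reconcile_role function and the NAMELEN and SLOT_BYTES constants are hypothetical, and the sender's claim is reduced to a single argument (NULL when it claims to be a master, otherwise the name of the master it reports).

```c
#include <string.h>

#define NAMELEN    40            /* assumed node name length */
#define SLOT_BYTES (16384 / 8)   /* assumed slot bitmap size */

typedef enum { ROLE_MASTER, ROLE_SLAVE } role_t;

/* Hypothetical record: what the receiver believes about the sender. */
typedef struct {
    role_t role;
    char master_name[NAMELEN + 1];    /* empty when role is master */
    unsigned char slots[SLOT_BYTES];  /* slots attributed to the sender */
} node_view;

/* Bring the receiver's view in line with the role the sender claims. */
static void reconcile_role(node_view *v, const char *claimed_master) {
    if (claimed_master == NULL) {
        /* Sender claims to be a master. */
        if (v->role == ROLE_SLAVE) {
            v->role = ROLE_MASTER;
            v->master_name[0] = '\0';
        }
    } else {
        /* Sender claims to be a slave of claimed_master. */
        if (v->role == ROLE_MASTER) {
            v->role = ROLE_SLAVE;
            memset(v->slots, 0, sizeof v->slots);  /* drop its slots */
        }
        strncpy(v->master_name, claimed_master, NAMELEN);
        v->master_name[NAMELEN] = '\0';
    }
}
```

In both directions the receiver adopts the role claimed by the sender; the slot information is dropped only when a node previously seen as a master is demoted to a slave.
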

Concluding Remarks

ApsaraDB for Redis is a stable, reliable and scalable database service with superb performance. It is structured on Apsara Distributed File System and full SSD high-performance storage, and supports master-slave and cluster-based high-availability architectures. It offers a full range of database solutions including disaster recovery switchover, failover, online expansion, and performance optimization. We welcome everyone to purchase and use ApsaraDB for Redis.

Reference

https://www.alibabacloud.com/blog/in-depth-analysis-of-redis-cluster-gossip-protocol_594706?spm=a2c41.12785329.0.0

Alibaba Cloud
