MongoDB Data Synchronization

A MongoDB replica set (as of version 3.0) synchronizes member state through heartbeats. Each node periodically sends heartbeats to the other members of the replica set, and this is how the member state reported by rs.status() is kept up to date.

The node that initiates a heartbeat request is called the source, and the member that receives it is called the target. Each heartbeat proceeds in three phases.

  1. The source sends a heartbeat request to the target.
  2. The target handles the heartbeat request and sends a response to the source.
  3. The source receives a heartbeat response and updates the status of the target node.

Let us examine the main state synchronization logic in these three phases.

Phase 1

    command: replSetHeartbeat
    database: admin
    metadata: { $replData: 1 }
    commandArgs: { replSetHeartbeat: "mongo-9552", pv: 1, v: 22, from: "", fromId: 3, checkEmpty: false }
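To make the request layout easier to read, here is a minimal Python sketch that assembles the same commandArgs document. The function name and parameter names are hypothetical, and the field meanings given in the comments (set name, protocol version, configuration version, sender) are my reading of the field names in the log line above, not a statement of the server's implementation.

```python
def build_heartbeat_request(set_name, config_version, from_host, from_id,
                            protocol_version=1, check_empty=False):
    """Assemble a commandArgs document shaped like the logged request.

    Hypothetical helper for illustration only; it does not talk to a server.
    """
    return {
        "replSetHeartbeat": set_name,   # replica set name the source belongs to
        "pv": protocol_version,         # assumed: heartbeat protocol version
        "v": config_version,            # assumed: source's config version
        "from": from_host,              # assumed: source host (empty in the log)
        "fromId": from_id,              # assumed: source member id
        "checkEmpty": check_empty,
    }


# Rebuild the request shown in the log line above.
request = build_heartbeat_request("mongo-9552", 22, "", 3)
```

The configuration version ("v") is the field that matters most in the next phase, because the target compares it with its own version.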

Phase 2

If the target is not running in replica set mode, or the replica set name in the request does not match its own, it returns an error response. If the replica set configuration version of the source (the version in rs.conf()) is lower than that of the target, the target adds its own configuration to the heartbeat response. The target also adds its oplog position (opTime) and other status information to the response. If the target is uninitialized, it immediately sends a heartbeat request back to the source so that it can update its own replica set configuration.
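The target-side checks described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not MongoDB's code: the TargetNode class, the error messages, and the "heartbeat_back" flag are all hypothetical names introduced for illustration, and the response carries only a few of the real fields.

```python
from dataclasses import dataclass


@dataclass
class TargetNode:
    """Hypothetical, simplified view of the node receiving the heartbeat."""
    repl_set_mode: bool
    set_name: str
    config_version: int
    config: dict
    state: int
    op_time: int
    initialized: bool = True


def handle_heartbeat(target: TargetNode, request: dict) -> dict:
    # Reject requests when not in replica set mode or when the set name differs.
    if not target.repl_set_mode:
        return {"ok": 0, "errmsg": "not running in replica set mode"}
    if request["replSetHeartbeat"] != target.set_name:
        return {"ok": 0, "errmsg": "replica set name does not match"}

    # Status information, including the oplog position, goes into the response.
    response = {
        "ok": 1,
        "set": target.set_name,
        "state": target.state,
        "v": target.config_version,
        "opTime": target.op_time,
    }
    # The source is behind: attach the target's newer configuration.
    if request["v"] < target.config_version:
        response["config"] = target.config
    # Hypothetical flag: an uninitialized target heartbeats the source right away.
    response["heartbeat_back"] = not target.initialized
    return response


target = TargetNode(True, "mongo-9552", 22, {"_id": "mongo-9552", "version": 22},
                    state=1, op_time=100)
stale_reply = handle_heartbeat(target, {"replSetHeartbeat": "mongo-9552", "v": 21})
current_reply = handle_heartbeat(target, {"replSetHeartbeat": "mongo-9552", "v": 22})
wrong_set_reply = handle_heartbeat(target, {"replSetHeartbeat": "other", "v": 22})
```

In this sketch only stale_reply carries a "config" entry, because only that request reports a configuration version lower than the target's.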

    commandReply: { ok: 1.0, time: 1460705698, electionTime: new Date(6273289095791771649), e: true, rs: true, state: 1, v: 22, hbmsg: "", set: "mongo-9552", opTime: new Date(6272251740930703361) }
    metadata: { $replData: { term: -1, lastOpCommitted: { ts: Timestamp 1460372410000|1, t: -1 }, lastOpVisible: { ts: Timestamp 0|0, t: -1 }, configVersion: 22, primaryIndex: 2, syncSourceIndex: -1 } }

Phase 3

When the source receives an error response to the heartbeat request (a response timeout also counts as an error), it checks two limits. If the number of retries so far is at most kMaxHeartbeatRetries (2 by default) and the last heartbeat request was sent within kDefaultHeartbeatTimeoutPeriod (10 seconds by default), the source sends the next heartbeat request immediately. Once the number of retries exceeds kMaxHeartbeatRetries, or kDefaultHeartbeatTimeoutPeriod has elapsed since the last heartbeat, the source considers the target down.

If the replica set configuration version of the target is higher than the source's own, the source updates its configuration and persists it in the local database. The source then updates its view of the target's state based on the response message.

If the source is the primary (master) and finds that another node with a higher priority has been elected primary, it steps down to secondary (slave) on its own initiative. If the source is a secondary but finds that it has a higher priority and is eligible for election, it asks the current primary to step down. (This logic still contains some bugs, so the primary's voluntary step-down takes precedence, to ensure that the node with the highest priority ends up as primary.) If there is currently no primary, the source triggers an election. A new primary is elected once a majority of nodes agree on the election result.
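The decisions in this phase can be condensed into two small Python functions. This is a sketch under stated assumptions rather than the server's logic: the function names and the returned action strings are hypothetical, the timeout is assumed to be in seconds, and "within the timeout" is read here as strictly less than the timeout. The roles are named primary (master) and secondary (slave).

```python
from typing import Optional

K_MAX_HEARTBEAT_RETRIES = 2            # default named in the text
K_DEFAULT_HEARTBEAT_TIMEOUT_SECS = 10  # default named in the text, assumed seconds


def on_heartbeat_error(retries: int, secs_since_last_heartbeat: float) -> str:
    """Decide what the source does after an error or timed-out response."""
    if (retries <= K_MAX_HEARTBEAT_RETRIES
            and secs_since_last_heartbeat < K_DEFAULT_HEARTBEAT_TIMEOUT_SECS):
        return "retry_now"
    return "mark_down"


def role_action(self_is_primary: bool, self_priority: float,
                self_electable: bool,
                other_primary_priority: Optional[float]) -> str:
    """Decide the role change, if any, after a successful response.

    other_primary_priority is the priority of another node seen as primary,
    or None when no other primary is known.
    """
    if self_is_primary:
        # A higher-priority node has been elected: step down voluntarily.
        if other_primary_priority is not None and other_primary_priority > self_priority:
            return "step_down_self"
        return "none"
    if other_primary_priority is None:
        # No primary at the moment: trigger an election.
        return "start_election"
    if self_electable and self_priority > other_primary_priority:
        # This secondary outranks the current primary: ask it to step down.
        return "request_primary_step_down"
    return "none"
```

For example, on_heartbeat_error(2, 9.9) still retries immediately, while on_heartbeat_error(3, 1.0) and on_heartbeat_error(1, 10.0) both mark the target as down.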


However, there is no theoretical basis that guarantees the correctness of this election process. MongoDB 3.2 introduces a new version of the replica set communication protocol in which elections are conducted through Raft, which further shortens the time needed to detect faults and restore the instance.

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.
