MongoShake — A MongoDB-based Cross-Data Center Data Replication Platform

Overview of MongoShake

Application Scenario of MongoShake

  1. Asynchronous data replication between MongoDB clusters, which spares the application the overhead of double writes.
  2. Mirror backup of data between MongoDB clusters (support in the 1.0 open source version is currently limited).
  3. Offline log analysis
  4. Log subscription
  5. Data routing. Using MongoShake's log subscription and filtering mechanism, you can receive only the data you care about, according to your business needs; this is a typical application of MongoShake's data routing function.
  6. Cache synchronization. The log analysis results show which cache entries can be evicted and which can be preloaded, so MongoShake can drive cache updates.
  7. Log-based cluster monitoring
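
The data routing scenario above can be illustrated with a minimal sketch. It assumes oplog entries are available as plain dictionaries with an "ns" (namespace) field, as in MongoDB's oplog, and that the followed namespaces are given as shell-style patterns. The function name route and the pattern syntax are hypothetical and are not MongoShake's actual filter configuration.

```python
from fnmatch import fnmatchcase


def route(entries, patterns):
    """Yield only the oplog entries whose namespace matches a followed pattern.

    `entries` is an iterable of dict-like oplog entries; `patterns` is a list
    of shell-style namespace patterns (an assumption made for this sketch).
    """
    for entry in entries:
        ns = entry.get("ns", "")
        if any(fnmatchcase(ns, pattern) for pattern in patterns):
            yield entry


# Hypothetical sample entries; only "op", "ns" and "o" are shown.
entries = [
    {"op": "i", "ns": "benchmark.sbtest10", "o": {"_id": 1}},
    {"op": "u", "ns": "orders.items", "o": {"_id": 2}},
    {"op": "d", "ns": "benchmark.sbtest11", "o": {"_id": 3}},
]

# A subscriber that follows every collection in the "benchmark" database.
followed = list(route(entries, ["benchmark.*"]))
```

Here the subscriber receives the two entries from the benchmark database and never sees the orders.items update.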

MongoShake Features

  MongoShake supports the following tunnel types:

  1. Direct: Writes directly into the target MongoDB
  2. RPC: Connects through net/rpc
  3. TCP: Connects over TCP
  4. File: Connects through a file
  5. Kafka: Connects through Kafka
  6. Mock: Used for testing; nothing is written to any tunnel and all data is discarded
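
To make the difference between tunnel types concrete, here is a minimal sketch of two of them behind one common interface. The Tunnel base class, its send method, and the JSON-lines file layout are assumptions made for illustration; they do not reflect MongoShake's real implementation or wire format.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class Tunnel(ABC):
    """Hypothetical tunnel interface: accepts a batch of oplog entries."""

    @abstractmethod
    def send(self, oplogs):
        """Deliver a batch and return how many entries were written."""


class MockTunnel(Tunnel):
    """Discards every entry, which is useful for testing the collector side."""

    def __init__(self):
        self.discarded = 0

    def send(self, oplogs):
        self.discarded += len(oplogs)
        return 0  # nothing is ever written


class FileTunnel(Tunnel):
    """Appends each entry to a file as one JSON document per line (assumed layout)."""

    def __init__(self, path):
        self.path = path

    def send(self, oplogs):
        with open(self.path, "a", encoding="utf-8") as f:
            for entry in oplogs:
                f.write(json.dumps(entry) + "\n")
        return len(oplogs)


batch = [{"op": "i", "ns": "benchmark.sbtest10", "o": {"_id": 1}}]

mock = MockTunnel()
mock.send(batch)  # the batch is counted, then dropped

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "oplog.jsonl")
    FileTunnel(path).send(batch)
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
```

Because both classes share the same send method, the code that reads the oplog does not need to know which tunnel sits behind it.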

Parallel Replication

HA solution

Filtering

Compression

Gid

Checkpoint

Troubleshooting and Traffic Restriction

Conflict Detection

  1. MongoShake assumes that the schemas of the synchronized MongoDB instances are consistent, and does not listen for modifications to the system.indexes collection in the oplog.
  2. In the case of an index conflict, the index recorded in the oplog is used and the index in the current MongoDB is not consulted.

  1. An index is being created. If the index is built in the background, it is not visible to write requests during the build, but it is visible to internal requests. This may also cause high memory usage. If the index is built in the foreground, all user requests are blocked. If the blocking time is too long, retransmission is triggered.
  2. If the target database has a unique index that does not exist in the source database, data inconsistency results and the affected data is not processed.
  3. If a unique index is added or deleted in the source database after an oplog is generated, retransmission may cause problems with index addition or deletion; MongoShake cannot resolve this case.

{
  "ts" : Timestamp(1484805725, 2),
  "t" : NumberLong(3),
  "h" : NumberLong("-6270930433887838315"),
  "v" : 2,
  "op" : "u",
  "ns" : "benchmark.sbtest10",
  "o" : { "_id" : 1, "uid" : 1111, "other.sid" : "22222", "mid" : 8907298448, "bid" : 123 },
  "o2" : { "_id" : 1 },
  "uk" : {
    "uid" : "1110",
    "mid^bid" : [8907298448, 123],
    "other.sid_1" : "22221"
  }
}

  1. If oplogs M and N operate on the same value of the same unique index, and M is smaller than N (that is, M comes earlier in the oplog order), a directed edge from M to N is created.
  2. If M and N have the same document ID, and M is smaller than N, a directed edge from M to N is created.
  3. Because every edge points from an earlier oplog to a later one, the dependency graph contains no cycle.
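
The three rules above can be sketched in a few lines. This is an illustration under stated assumptions, not MongoShake's implementation: each oplog is reduced to a dictionary with a hypothetical "id" field (the document ID) and a "uk" field mapping unique index names to values, and the position in the list stands for the oplog order. The helper names are invented for this sketch, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from itertools import combinations


def unique_keys(oplog):
    """Return the (index name, value) pairs an oplog touches, in hashable form."""
    return {
        (name, tuple(value) if isinstance(value, list) else value)
        for name, value in oplog.get("uk", {}).items()
    }


def build_edges(oplogs):
    """Return the set of (m, n) dependency edges, where m and n are list positions.

    combinations() only yields pairs with m < n, so every edge points from an
    earlier oplog to a later one and the graph cannot contain a cycle.
    """
    edges = set()
    for (m, a), (n, b) in combinations(enumerate(oplogs), 2):
        same_document = a["id"] == b["id"]
        same_unique_value = bool(unique_keys(a) & unique_keys(b))
        if same_document or same_unique_value:
            edges.add((m, n))
    return edges


# Hypothetical batch, already in oplog order.
oplogs = [
    {"id": 1, "uk": {"uid": "1110"}},  # 0
    {"id": 2, "uk": {"uid": "1110"}},  # 1: same unique index value as 0
    {"id": 1, "uk": {"uid": "2000"}},  # 2: same document as 0
    {"id": 3, "uk": {"uid": "3000"}},  # 3: independent of the others
]

edges = build_edges(oplogs)

# A topological order is one valid execution order that respects every edge.
sorter = TopologicalSorter()
for node in range(len(oplogs)):
    sorter.add(node)
for m, n in edges:
    sorter.add(n, m)  # n depends on m
order = list(sorter.static_order())
```

The pairwise comparison is quadratic in the batch size, which keeps the sketch close to the wording of the rules; a real implementation would more likely index oplogs by document ID and unique key value.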

Architecture and Dataflow

Customer Case Study: Amap

Concluding Remarks

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com