A Small Victory — Acceleration of BITFIELD Commands for ApsaraDB for Redis

1) Problems

An Alibaba Cloud customer discovered that, on their read/write splitting instance, CPU utilization on the main node was high while the standby nodes that carry the read traffic were relatively idle. Whenever the main node's CPU ran at full capacity, the online services depending on it were significantly affected.

1.1) Principles of Read/Write Splitting

The principles ApsaraDB for Redis follows in read/write splitting instances are as follows:

  • The proxy inspects each user request and determines whether it is a read or a write.
  • Write requests are forwarded to the main node while read requests are distributed to the standby nodes.
Figure 1: Example of forwarding read and write commands for ApsaraDB for Redis read/write splitting
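
The routing rule above can be sketched in a few lines of Python. This is only an illustration, not the actual proxy: the `Router` class, the abbreviated `READ_COMMANDS` set, the node names, and the round-robin policy are all assumptions made for the example.

```python
from itertools import cycle

# Hypothetical, heavily abbreviated list of read commands.
# A real proxy would classify every command it supports.
READ_COMMANDS = {"GET", "MGET", "LRANGE", "GETBIT"}


class Router:
    """Toy read/write splitting router (illustrative only)."""

    def __init__(self, main, standbys):
        self.main = main
        # Assumption: reads are spread round-robin over the standby nodes.
        self._standbys = cycle(standbys)

    def route(self, command):
        """Return the node that should receive `command` (a list of strings)."""
        name = command[0].upper()
        if name in READ_COMMANDS:
            return next(self._standbys)
        # Writes and anything not known to be a read go to the main node.
        return self.main


router = Router("main", ["standby-1", "standby-2"])
for cmd in (["SET", "k", "v"], ["GET", "k"], ["GET", "k"]):
    print(cmd[0], "->", router.route(cmd))
```

Sending anything that is not known to be a read to the main node is the conservative default: a misclassified read only costs performance, while a misclassified write would fail on a read-only standby node.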

1.2) BITFIELD Commands

After talking with the customer, we found that they used a large number of BITFIELD commands to read data. BITFIELD commands operate on bitmaps. A bitmap is typically used to record and test status flags with minimal space consumption, based on bitwise operations (AND, OR, XOR, and NOT). Common scenarios include:

  • Determine whether users have read the same articles or watched the same videos.
  • Use bitmaps to easily determine whether a user correctly answered all the questions in a live Q&A activity.
Figure 2: A live Q&A system designed by using Redis bitmaps
The BITFIELD command accepts the following sub-commands:

[GET type offset] // Read the bit field of the given type at the given offset
[SET type offset value] // Set the bit field to the given value
[INCRBY type offset increment] // Increment the bit field by the given amount
[OVERFLOW WRAP|SAT|FAIL] // Control how subsequent INCRBY operations behave on overflow
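
To make the sub-command semantics concrete, here is a minimal pure-Python emulation on a `bytearray`. It is a sketch under stated assumptions rather than the Redis implementation: the function names are hypothetical, only unsigned types (such as u4) are handled, and bit 0 is taken to be the most significant bit of the first byte, which is how Redis numbers bits.

```python
def get_unsigned(buf, width, offset):
    """Read `width` bits starting at bit `offset` (bit 0 = MSB of byte 0)."""
    value = 0
    for i in range(offset, offset + width):
        byte_index, bit_index = divmod(i, 8)
        # Bytes past the end of the buffer read as zero.
        byte = buf[byte_index] if byte_index < len(buf) else 0
        value = (value << 1) | ((byte >> (7 - bit_index)) & 1)
    return value


def set_unsigned(buf, width, offset, value):
    """Write the low `width` bits of `value`; return the previous field value."""
    old = get_unsigned(buf, width, offset)
    needed = (offset + width + 7) // 8
    buf.extend(b"\x00" * (needed - len(buf)))  # grow with zero bytes if needed
    for k in range(width):
        byte_index, bit_index = divmod(offset + k, 8)
        bit = (value >> (width - 1 - k)) & 1
        mask = 1 << (7 - bit_index)
        buf[byte_index] = (buf[byte_index] & ~mask & 0xFF) | (mask if bit else 0)
    return old


def incrby_unsigned(buf, width, offset, increment, overflow="WRAP"):
    """Increment the field; return the new value, or None when FAIL blocks it."""
    limit = (1 << width) - 1
    result = get_unsigned(buf, width, offset) + increment
    if result > limit or result < 0:
        if overflow == "FAIL":
            return None  # the stored value is left unchanged
        if overflow == "SAT":
            result = limit if result > limit else 0  # clamp to the range
        else:
            result &= limit  # WRAP: modulo 2**width
    set_unsigned(buf, width, offset, result)
    return result


data = bytearray()
set_unsigned(data, 4, 0, 9)                     # like SET u4 0 9
print(get_unsigned(data, 4, 0))                 # 9
print(incrby_unsigned(data, 4, 0, 10))          # 19 wraps to 3
print(incrby_unsigned(data, 4, 0, 20, "SAT"))   # clamps to 15
print(incrby_unsigned(data, 4, 0, 1, "FAIL"))   # None, value stays 15
```

Only `get_unsigned` leaves the buffer untouched; the other two helpers modify it. That difference between sub-commands is what matters for routing.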

1.3) BITFIELD Problems in Read/Write Splitting Instances

Among the BITFIELD sub-commands described above, GET is a read operation while SET and INCRBY are write operations. Because a single BITFIELD command may contain write sub-commands, ApsaraDB for Redis classifies the whole command as a write command, so it can only be forwarded to the main node, even when it contains nothing but GET sub-commands. The following figure shows the BITFIELD command routing.

2) Ideas and Problem Resolution

2.1) Solutions

  • Solution 1: Mark the BITFIELD command as a read command in the ApsaraDB for Redis kernel. When it contains write sub-commands, such as SET and INCRBY, the command is still synchronized to the standby nodes. This solution requires no changes to the external components (proxy and client), but BITFIELD must be handled as a special case, which breaks the uniform way the engine processes commands.
  • Solution 2: Add a BITFIELD_RO command, similar to the GEORADIUS_RO command, that supports only the GET sub-command. Because every operation it accepts is a read, standby nodes can safely process it. This solution is clear and reliable but requires adapting the proxy and client.

2.2) Adding BITFIELD_RO

The new command is registered with the read-only command flags shown on the first line below. As the session that follows shows, a read-only replica rejects BITFIELD but serves BITFIELD_RO:
"read-only fast @bitmap",
tair-redis > SLAVEOF 6379
tair-redis > set k v
(error) READONLY You can't write against a read only replica.
tair-redis > BITFIELD mykey GET u4 0
(error) READONLY You can't write against a read only replica.
tair-redis > BITFIELD_RO mykey GET u4 0
1) (integer) 0

2.3) Forwarding the Proxy

To spare users from modifying their code, we made the proxy compatible with existing BITFIELD commands. If a BITFIELD command contains only GET sub-commands, the proxy converts it to BITFIELD_RO and distributes it across multiple backend nodes to accelerate processing, as shown in Figure 4.

Figure 4: BITFIELD processing logic after the BITFIELD_RO command is added
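
The conversion rule can be sketched as follows. The source does not show the proxy's implementation, so the function name `rewrite_bitfield` and the list-of-strings command representation are assumptions for illustration. The check relies on each GET sub-command occupying exactly three tokens (GET type offset).

```python
def rewrite_bitfield(command):
    """Return the command to forward.

    A BITFIELD call is renamed to BITFIELD_RO only when every sub-command
    is GET; anything else is returned unchanged for the main node.
    """
    command = list(command)
    # Need at least: BITFIELD key GET type offset
    if len(command) < 5 or command[0].upper() != "BITFIELD":
        return command
    subargs = command[2:]  # skip the command name and the key
    # GET-only calls consist of whole (GET, type, offset) triples.
    if len(subargs) % 3 != 0:
        return command
    if all(subargs[i].upper() == "GET" for i in range(0, len(subargs), 3)):
        return ["BITFIELD_RO"] + command[1:]
    return command


print(rewrite_bitfield(["BITFIELD", "mykey", "GET", "u4", "0"]))
print(rewrite_bitfield(["BITFIELD", "mykey", "SET", "u4", "0", "1"]))
```

In this sketch the first call is renamed and can be served by a standby node, while the second keeps its original name and is treated as a write. Any call that contains SET, INCRBY, or OVERFLOW, or that does not parse cleanly, is deliberately left alone so the main node handles it.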

2.4) Contribution to the Community

We contributed our modification to the Redis community, which officially accepted it.

Figure 5: Contribution Rankings for Redis 6.0 RC

3) Extension and Discussion

3.1) Summary

ApsaraDB for Redis introduces the BITFIELD_RO command so that BITFIELD calls containing only GET sub-commands can be served by the standby nodes instead of always being routed to the main node.

3.2) Discussion Questions

Why do we need read/write splitting? Why can’t we just use the cluster edition? To answer these questions, assume the service capability of the community version is K. The following table shows a comparison of different versions. Here, we only compare ApsaraDB for Redis Enhanced Edition (Tair). For the cluster edition, the service capability can be multiplied by the number of shards.

  • A big key is very likely a hot spot.
  • If you accidentally perform range operations on a big key, slow queries may occur and the bandwidth is likely to burst.
