5-year Evolution of Ele.me’s Transaction System — Part 3

Creation of the Test Team

The transaction team had never had any full-time test developers; R&D engineers tested all of their own work. At that time, the company did not have strong automated testing capabilities, and almost all testing was manual. I believed dedicated test resources were essential at that point, so I pushed hard to set up a test team to safeguard the release quality of the order system.

WeBot

In addition to developing test process specifications and standards, I started to build a platform to manage test cases, test statuses, and test reports.

Performance Testing

The second thing I did was to set up performance testing.

Random Fault Drills

The third thing I did was to organize random fault drills.

Version 1.0 of the Random Fault Drill

At first, the random fault drills were actually very simple. The general idea was as follows:

  1. Build a client and simulate user behavior to create data. (The experience we gained from automated integration testing really came in handy here.)
  2. Provide a tool to build a Mock Server for each dependent service, so that the long chain of service dependencies does not have to be deployed. The Mock Server returns preset output according to the input.
  3. Tag the traffic according to the client. This feature was enabled by a special version released by the framework team. Based on traffic tags, the Mock Server can simulate abnormal behaviors, such as blocking and timeout, and send feedback to the tested server.
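
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the tool we actually built: the names `MockServer` and `MockTimeout`, the tag values, and the in-process call are all hypothetical, whereas the real Mock Server sat behind the network and relied on the framework team's traffic tagging.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


class MockTimeout(Exception):
    """Raised when the mock simulates a dependency that times out."""


@dataclass
class MockServer:
    """Hypothetical stand-in for one dependent service."""

    # Preset responses, keyed by the request input.
    presets: Dict[str, dict] = field(default_factory=dict)
    # Fault to inject per traffic tag: "block" or "timeout".
    faults: Dict[str, str] = field(default_factory=dict)
    # How long a "block" fault stalls before answering.
    block_seconds: float = 0.01
    # Injectable so that tests do not need to really sleep.
    sleep: Callable[[float], None] = time.sleep

    def handle(self, request_key: str, traffic_tag: Optional[str] = None) -> dict:
        # Untagged traffic never triggers a fault.
        fault = self.faults.get(traffic_tag) if traffic_tag else None
        if fault == "timeout":
            raise MockTimeout(f"simulated timeout for tag {traffic_tag!r}")
        if fault == "block":
            self.sleep(self.block_seconds)
        # Return the preset output for this input, or a default miss.
        return self.presets.get(request_key, {"status": "not_found"})


if __name__ == "__main__":
    server = MockServer(
        presets={"order:1": {"status": "ok"}},
        faults={"drill-timeout": "timeout", "drill-block": "block"},
    )
    print(server.handle("order:1"))                 # normal traffic
    print(server.handle("order:1", "drill-block"))  # delayed response
    try:
        server.handle("order:1", "drill-timeout")
    except MockTimeout as exc:
        print("caller saw:", exc)
```

Keying the fault on the traffic tag rather than on the request itself means the same preset data can serve both normal test traffic and drill traffic, which is the point of tagging at the client.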

Version 2.0 of the Random Fault Drill

JN gathered some colleagues to build an in-house tool modeled on Netflix’s Chaos Monkey. We named it “Kennel.”

  1. An API exception causes a whole service to crash.
  2. When a node or machine in a cluster restarts, API callers are severely affected.
  3. The CPU load on a node in a cluster increases, causing imbalanced load distribution in the cluster.
  4. A service takes effect on only a single server in a cluster, causing inconsistent behavior among the servers in the cluster.
