Unveiling the Secrets Behind Alibaba’s Full-scale Stress Testing for Double 11

  • Preparation of the stress testing environment: The real online environment must be reused so that the stress testing results and the problems exposed reflect production conditions as closely as possible. Stress testing data can be identified and passed through globally.
  • Basic data preparation: Taking an e-commerce scenario as an example, construct the core basic data, such as buyer, seller, and commodity information, that the promotion scenario needs. Online data is used as the data source: it is sampled, filtered, and masked while being kept at the same order of magnitude.
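
The sampling, filtering, and masking steps for basic data can be sketched as follows. This is a minimal illustration, not Alibaba's actual tooling: the field names (`buyer_id`, `phone`, `active`), the salt, and the masking rules are all assumptions made for the example.

```python
import hashlib
import random


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 2 characters and hide the rest."""
    if len(phone) <= 5:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 5) + phone[-2:]


def pseudonymize(value: str, salt: str = "stress-test") -> str:
    """Replace an identifier with a stable token derived from a salted hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def prepare_buyers(rows, sample_rate: float, seed: int = 0):
    """Filter, sample, and mask buyer rows taken from an online data source.

    `rows` is an iterable of dicts with the hypothetical keys
    `buyer_id`, `phone`, and `active`.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    prepared = []
    for row in rows:
        if not row.get("active"):        # filtering: drop inactive buyers
            continue
        if rng.random() >= sample_rate:  # sampling: keep roughly sample_rate of rows
            continue
        prepared.append({
            "buyer_id": pseudonymize(row["buyer_id"]),  # masking
            "phone": mask_phone(row["phone"]),          # masking
        })
    return prepared
```

A `sample_rate` of 1.0 keeps every row that passes the filter, which matches the goal of holding the data at the same magnitude as the online source.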

Process and Management

Reconstruction of the Stress Testing Environment

Business Reconstruction

  • Traffic differentiation and identification: Distinguish stress testing traffic from business traffic, and make both identifiable throughout the full-link system.
  • Traffic uniqueness: Some operations can succeed only once. For example, if a user places an order and then repeats it, the second attempt fails.
  • Traffic limiting and interception: If traffic limiting is required, degradation of inbound traffic must be configurable so that the settings can be adjusted in real time.
  • Remove the impact of stress testing data on statistics.
  • Dynamic verification.
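
One way to picture the identification and statistics points above is a stress testing flag that is read at the entry point, carried with the request, and copied onto every outbound call. The sketch below is an assumption-laden illustration: the header name `x-stress-test`, the function names, and the metric keys are hypothetical and do not come from the original system.

```python
import contextvars

# Hypothetical header that marks a request as stress testing traffic.
STRESS_HEADER = "x-stress-test"

# Per-request flag; contextvars keeps it isolated between concurrent requests.
_is_stress = contextvars.ContextVar("is_stress", default=False)


def handle_inbound(headers: dict, handler):
    """Read the flag from the inbound request and run the handler under it."""
    token = _is_stress.set(headers.get(STRESS_HEADER) == "1")
    try:
        return handler()
    finally:
        _is_stress.reset(token)  # restore the previous value after the request


def outbound_headers(base=None) -> dict:
    """Copy the flag onto a downstream call so the whole link can see it."""
    headers = dict(base or {})
    if _is_stress.get():
        headers[STRESS_HEADER] = "1"
    return headers


def record_order_metric(stats: dict) -> None:
    """Count stress testing orders separately so business statistics stay clean."""
    key = "stress_orders" if _is_stress.get() else "orders"
    stats[key] = stats.get(key, 0) + 1
```

Keeping the flag in request-scoped context rather than in a global variable means that stress testing and business requests handled concurrently do not leak the flag into each other.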

Middleware Transformation

Data Preparation

Business Model Data

  • Existing business scenarios: Historical data is collected and processed to generate prediction data, which forms a prototype model. Combined with new business features, the prototype model is then used to construct a model for existing businesses.
  • New business scenarios: A new business model is directly constructed pro rata based on new businesses.
  • 1 to N: determines whether multiple calls occur when one upstream business request corresponds to several downstream business interfaces.
  • Business proportion: calculates the proportion of different types of businesses based on historical data.
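
The business proportion and 1-to-N ideas reduce to simple arithmetic, sketched below. The business names, request counts, and fan-out factors are invented for illustration; real values would come from historical data and from call-chain analysis.

```python
def business_proportions(historical_counts: dict) -> dict:
    """Share of each business type, computed from historical request counts."""
    total = sum(historical_counts.values())
    if total <= 0:
        raise ValueError("historical counts must sum to a positive number")
    return {name: count / total for name, count in historical_counts.items()}


def downstream_qps(proportions: dict, total_qps: float, fan_out=None) -> dict:
    """Expected downstream load per business type.

    `fan_out` maps a business type to the number of downstream calls that one
    upstream request triggers (the "1 to N" factor); it defaults to 1.
    """
    fan_out = fan_out or {}
    return {
        name: total_qps * share * fan_out.get(name, 1)
        for name, share in proportions.items()
    }


# Example with made-up numbers: 70% browse, 20% cart, 10% order.
shares = business_proportions({"browse": 700, "cart": 200, "order": 100})
# If one order request fans out to 3 downstream calls, order traffic triples downstream.
load = downstream_qps(shares, total_qps=1000, fan_out={"order": 3})
```

With these inputs, the order business accounts for one tenth of upstream requests but roughly 300 downstream calls per second, which is why the fan-out factor matters when sizing the stress test.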

Basic Data of Stress Testing

Traffic Security Policy

  • Method: shadow table data. The shadow table is a writable stress testing data table that has the same structure as the online table but is in an isolated location.
  • Result: Data is isolated to avoid data errors.
  • Method: connect the relevant security policies to the throttling and degradation function, slightly relax security policies for stress testing, or identify stress testing traffic by using a special identifier.
  • Result: Stress testing traffic is not identified as attack traffic. The stress test runs successfully, while the security of online businesses is guaranteed.
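
The shadow table method can be illustrated with a small, self-contained sketch that uses an in-memory SQLite database. The `shadow_` prefix, the `orders` table, and the helper names are assumptions chosen for the example; the source only states that the shadow table has the same structure as the online table and is kept in an isolated location.

```python
import sqlite3

# Hypothetical naming convention for shadow tables.
SHADOW_PREFIX = "shadow_"


def resolve_table(table: str, is_stress: bool) -> str:
    """Route stress testing writes to the shadow copy of a table."""
    return f"{SHADOW_PREFIX}{table}" if is_stress else table


def make_db() -> sqlite3.Connection:
    """Create an online table and a shadow table with the same structure."""
    conn = sqlite3.connect(":memory:")
    for table in ("orders", f"{SHADOW_PREFIX}orders"):
        conn.execute(f"CREATE TABLE {table} (order_id TEXT PRIMARY KEY)")
    return conn


def insert_order(conn: sqlite3.Connection, order_id: str, is_stress: bool) -> None:
    """Write an order to the online or shadow table depending on the flag."""
    table = resolve_table("orders", is_stress)
    conn.execute(f"INSERT INTO {table} (order_id) VALUES (?)", (order_id,))


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    """Return the number of rows in a table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

A stress testing write such as `insert_order(conn, "o1", is_stress=True)` lands only in `shadow_orders`, so the online `orders` table is left untouched.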

Stress Testing Implementation

Problem Identification and Analysis

Intelligent Stress Testing

  • More supported protocols
  • Capacity evaluation
  • Automatic problem detection
  • Full-link function testing and stress testing rehearsal
  • Stress testing normalization
  • Elastic promotion with scaling and stress testing in parallel

Future Vision

Original Source
