Testing PostgreSQL Performance in Social Relationship Scenarios with Facebook LinkBench

By Digoal

Background

LinkBench is an open-source database benchmark developed by Facebook to evaluate database performance. LinkBench creates a set of test data around a social graph and then performs data operations such as querying, adding, or removing relationships.

For more information about LinkBench, refer to the following articles:

https://www.facebook.com/notes/facebook-engineering/linkbench-a-database-benchmark-for-the-social-graph/10151391496443920
http://www.oschina.net/translate/linkbench-a-database-benchmark-for-the-social-graph

The LinkBench test model is typical of workloads built on social relationships between users. This article provides PostgreSQL performance test data under this model, along with the test method.

After seeing this test data, you can compare it with the test data of other databases to see the performance differences.

Test Model of LinkBench

This data can be represented in a social graph, where objects (graph nodes) such as people, posts, comments, and pages are connected by associations (directed edges of the graph) that model different relationships between the nodes.

Different types of associations can represent friendship between two users, a user liking another object, ownership of a post, or any other relationship.

LinkBench is a graph-serving benchmark, not a graph-processing benchmark — the difference being that the former simulates the transactional workload from an interactive social network service while the latter simulates an analytics workload.

This benchmark is not about finding graph communities or graph partitioning, but rather serving real-time queries and updates on a graph database.

For example, a general form of graph query would be to find all the edges of type A from node X, while update operations can insert, delete, or update graph nodes or edges.

An example graph update operation is “insert a friendship edge from user 4 to user 63459821.”

By classifying database queries into a small number of core operations for associations (edges) and objects (nodes), we could break down and analyze the mix of social graph operations for a production database.

The simple graph retrieval and update operations listed below are used to store and retrieve social graph data.

Note that the workload is heavy on edge operations and reads, particularly edge range scans.

Examples of edge range scans are “retrieve all comments for a post ordered from most to least recent” or “retrieve all friends for a user.”
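The operations described above can be sketched with a toy in-memory store. This is only an illustration of the data model, not LinkBench's implementation: the class and method names (`GraphStore`, `add_link`, `delete_link`, `get_link_list`) and the string link types are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    id1: int        # source node
    link_type: str  # association type, e.g. "friend" or "like"
    id2: int        # destination node
    time: int       # timestamp used to order range scans


class GraphStore:
    """Toy in-memory stand-in for a link table (illustration only)."""

    def __init__(self):
        # (id1, link_type) -> {id2: Edge}
        self._links = {}

    def add_link(self, id1, link_type, id2, time):
        """Insert a directed edge, or overwrite it if it already exists."""
        bucket = self._links.setdefault((id1, link_type), {})
        bucket[id2] = Edge(id1, link_type, id2, time)

    def delete_link(self, id1, link_type, id2):
        """Remove an edge; return True if it existed."""
        bucket = self._links.get((id1, link_type), {})
        return bucket.pop(id2, None) is not None

    def get_link_list(self, id1, link_type, limit=None):
        """Edge range scan: edges of one type from one node, newest first."""
        bucket = self._links.get((id1, link_type), {})
        edges = sorted(bucket.values(), key=lambda e: e.time, reverse=True)
        return edges if limit is None else edges[:limit]


store = GraphStore()
store.add_link(4, "friend", 63459821, time=100)  # the update quoted in the text
store.add_link(4, "friend", 17, time=200)
store.add_link(4, "like", 900, time=150)

friends = store.get_link_list(4, "friend")
print([e.id2 for e in friends])  # [17, 63459821]
```

In a relational database, the same range scan would typically be served by an index that leads with the source node and link type, so that the matching edges are found together.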

Optimization Tips:
Range-scan performance depends on how the data is physically stored. If the comments on an article are stored together (aggregated), only a small number of blocks need to be scanned and performance is good. Similarly, if the friends of a user are stored together, retrieving them does not cause performance problems.

Aggregating the data by clustering it on the scan key can also reduce the number of rows to be scanned.
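A small simulation shows why physical aggregation matters. It is a sketch under stated assumptions: 100 rows per block, 1,000 posts with 50 comments each, and comments arriving round-robin across posts. None of these numbers come from the test in this article.

```python
ROWS_PER_BLOCK = 100  # hypothetical number of rows per data block


def blocks_touched(physical_rows, post_id):
    """Count the distinct blocks that hold comments for `post_id`.

    `physical_rows` lists the post id of each comment in on-disk order;
    the row at position i is assumed to live in block i // ROWS_PER_BLOCK.
    """
    return len({
        i // ROWS_PER_BLOCK
        for i, row_post_id in enumerate(physical_rows)
        if row_post_id == post_id
    })


num_posts, comments_per_post = 1000, 50

# Comments stored in arrival order: one comment per post, round after round.
arrival_order = [
    post for _ in range(comments_per_post) for post in range(num_posts)
]

# The same comments physically ordered (clustered) by post id.
clustered_order = sorted(arrival_order)

print(blocks_touched(arrival_order, 7))    # 50
print(blocks_touched(clustered_order, 7))  # 1
```

In this toy layout, reading all comments of one post touches 50 blocks when rows are stored in arrival order and a single block once they are clustered. In PostgreSQL, one way to obtain such an ordering is the CLUSTER command, which rewrites a table in the order of a chosen index.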

Introduction to LinkBench for PostgreSQL

The actual benchmark is driven by the LinkBench driver, a Java program that generates the social graph and the operation mix.

Originally, this tool supported only MySQL. It has since been extended to support PostgreSQL. However, make sure to use PostgreSQL 9.5 or later, because the queries use UPSERT (INSERT ... ON CONFLICT), a feature introduced in PostgreSQL 9.5.
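To illustrate what an UPSERT does, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server. SQLite 3.24 or later accepts an ON CONFLICT ... DO UPDATE clause of the same shape as PostgreSQL's. The table `link_count` and its columns are hypothetical and are not taken from the LinkBench schema.

```python
import sqlite3

# SQLite (3.24 or later) stands in for PostgreSQL here; the table and
# column names are hypothetical, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE link_count (
        id        INTEGER NOT NULL,
        link_type INTEGER NOT NULL,
        cnt       INTEGER NOT NULL,
        PRIMARY KEY (id, link_type)
    )
    """
)

UPSERT = """
    INSERT INTO link_count (id, link_type, cnt)
    VALUES (?, ?, 1)
    ON CONFLICT (id, link_type)
    DO UPDATE SET cnt = link_count.cnt + 1
"""


def bump_count(id_, link_type):
    """Insert a counter row, or increment it if the key already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (id_, link_type))


bump_count(4, 1)  # no row yet: inserted with cnt = 1
bump_count(4, 1)  # key conflict: existing row updated to cnt = 2
bump_count(9, 1)  # different key: a new row is inserted

rows = conn.execute(
    "SELECT id, link_type, cnt FROM link_count ORDER BY id"
).fetchall()
print(rows)  # [(4, 1, 2), (9, 1, 1)]
```

The point of the single statement is that the insert-or-update decision is made atomically by the database, instead of by a separate SELECT followed by an INSERT or UPDATE in the client.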

https://github.com/mdcallag/linkbench

Install LinkBench

JDK

apache-maven

Set up the environment

Install LinkBench

Pack linkbench

Generate the environment variable configuration file

$ vi ~/.bash_profile

$ linkbench

The benchmark runs in two phases:

1. The load phase, where an initial graph is generated and loaded in bulk;

2. The request phase, where many request threads concurrently access the database with a mix of operations. During the request phase latency and throughput statistics for operations are collected and reported.
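The two phases can be pictured with a toy driver. This is a rough sketch, not the LinkBench driver (which is a Java program). The function names, the ring-shaped initial graph, and the 70/30 read/write mix are assumptions made for illustration.

```python
import random
import threading
import time


def load_phase(num_nodes):
    """Load phase: build an initial graph in bulk (here, a ring of edges)."""
    return {node: {(node + 1) % num_nodes} for node in range(num_nodes)}


def request_phase(graph, num_threads, requests_per_thread):
    """Request phase: threads issue a mix of operations concurrently.

    Returns the recorded (operation, latency in seconds) samples and the
    overall throughput in operations per second.
    """
    lock = threading.Lock()  # stands in for the database's concurrency control
    samples = []

    def worker(seed):
        rng = random.Random(seed)
        local = []
        for _ in range(requests_per_thread):
            node = rng.randrange(len(graph))
            # Hypothetical mix: 70% edge reads, 30% edge inserts.
            op = "get_link_list" if rng.random() < 0.7 else "add_link"
            start = time.perf_counter()
            with lock:
                if op == "get_link_list":
                    _ = list(graph[node])
                else:
                    graph[node].add(rng.randrange(len(graph)))
            local.append((op, time.perf_counter() - start))
        with lock:
            samples.extend(local)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_threads)
    ]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - started
    return samples, len(samples) / elapsed


graph = load_phase(1000)
samples, throughput = request_phase(graph, num_threads=4, requests_per_thread=250)
print(len(samples), "operations,", round(throughput), "ops/sec")
```

The latency samples collected per operation are what the percentile and throughput statistics in the final report are derived from.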

The exact behavior of the driver in both phases is controlled by a configuration file, and many aspects of the benchmark can be easily altered using this file.

The configuration file template is config/LinkConfigPgsql.properties.

PostgreSQL Deployment

This article does not include OS parameter optimization.

Environment variable configuration

Create a Test Database

1. Initialize the database

2. Create a database

3. Connect to linkdb and create the tables and indexes

Configure the Load Template

Note that values in the LinkBench configuration files must not have trailing spaces. Otherwise, parsing errors may occur.
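A quick way to catch this before running the benchmark is to scan the configuration file for trailing whitespace. The helper below is a hypothetical sketch, not part of LinkBench, and the keys and values in the sample text are placeholders.

```python
def find_trailing_whitespace(text):
    """Return (line_number, line) pairs for lines ending in spaces or tabs.

    Comment lines (starting with '#') are skipped.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue
        if line != line.rstrip(" \t"):
            problems.append((number, line))
    return problems


# Placeholder content; lines 1 and 4 end with a space and a tab respectively.
sample = "host = localhost \nport = 5432\n# a comment \nuser = linkdb\t\n"

for number, line in find_trailing_whitespace(sample):
    print(f"line {number}: {line!r}")
# line 1: 'host = localhost '
# line 4: 'user = linkdb\t'
```

To check a real file, read it with `open(path).read()` and pass the contents to the same function.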

Configure how much test data is to be imported

$ vi ~/app/linkbench/config/FBWorkload.properties

Configure the database connection, the report frequency, the number of threads, the number of operations run on each thread, the maximum test duration, and other stress-test settings

$ vi ~/app/linkbench/config/LinkConfigPgsql.properties

Load Benchmark Data

Configure Benchmark Template

The steps are the same as those described in the preceding “Configure the Load Template” section.

Stress Testing

Benchmark Result

Data Load Result

Stress Test Result

Query performance for individual operations

Each percentile is reported as a range [a, b] in milliseconds. p25 means that the 25th-percentile response time (RT) lies between a and b milliseconds, so 25% of requests completed within that bound. p50 means the same for 50% of requests.
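For reference, percentiles such as p25 and p50 can be computed from raw latency samples as shown below. This sketch uses the nearest-rank definition and made-up sample values. The benchmark output reports a range for each percentile rather than a single number, so this only illustrates the concept.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up response times in milliseconds, for illustration only.
latencies_ms = [0.4, 0.5, 0.5, 0.6, 0.7, 0.9, 1.1, 1.4, 2.0, 9.5]

for p in (25, 50, 75, 95):
    print(f"p{p} = {percentile(latencies_ms, p)} ms")
# p25 = 0.5 ms
# p50 = 0.7 ms
# p75 = 1.4 ms
# p95 = 9.5 ms
```

Note how a single slow request (9.5 ms) dominates the high percentiles while leaving p25 and p50 unchanged, which is why reports list several percentiles instead of only the mean.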

Overall test statistics (including QPS)

The test result for a 32-core machine

Each item represents a test case and the last row indicates the overall performance. For more information, see the preceding figures.

The result shows that PostgreSQL reaches 120,000 QPS under LinkBench.

A 32-core host is used in this test example.

References

https://github.com/mdcallag/linkbench
https://www.facebook.com/notes/facebook-engineering/linkbench-a-database-benchmark-for-the-social-graph/10151391496443920
