# Analysis on Group by and Order by Performance

• Using data to show how the number of scanned rows in a slow query log is obtained
• Finding a solution based on the principle of group by
• Checking implementation details of sorting
• Debugging source code with gdb

# Background

List the top 10 visited articles of this month and week respectively. The log table is as follows.

`CREATE TABLE `article_rank` (  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,  `aid` int(11) unsigned NOT NULL,  `pv` int(11) unsigned NOT NULL DEFAULT '1',  `day` int(11) NOT NULL COMMENT '日期 例如 20171016',  PRIMARY KEY (`id`),  KEY `idx_day_aid_pv` (`day`,`aid`,`pv`),  KEY `idx_aid_day_pv` (`aid`,`day`,`pv`)) ENGINE=InnoDB DEFAULT CHARSET=utf8`

# Preparations

To clearly verify the guesses, install a debugger for MySQL and enable slow query log collection to count scanned rows.

# Installation

• Compile the installation
• Create a MySQL user
• Initialize the database
• Initialize the MySQL configuration file
• Change the password

# Enable Slow Query Log

Edit the configuration file and add the following under the `[mysqld]` module:

`slow_query_log=1slow_query_log_file=xxxlong_query_time=0log_queries_not_using_indexes=1`

# Discovering the Problem

Assume that the following SQL statement is used to query the top 10 articles by views during the five days from `2018-12-20` to `2018-12-24`. Let's first use `explain` to see the analysis result.

`mysql> explain select aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-----------------------------------------------------------+| id | select_type | table        | partitions | type  | possible_keys                 | key            | key_len | ref  | rows   | filtered | Extra                                                     |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-----------------------------------------------------------+|  1 | SIMPLE      | article_rank | NULL       | range | idx_day_aid_pv,idx_aid_day_pv | idx_day_aid_pv | 4       | NULL | 404607 |   100.00 | Using where; Using index; Using temporary; Using filesort |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-----------------------------------------------------------+`
`# Time: 2019-03-17T03:02:27.984091Z# User@Host: root[root] @ localhost []  Id:     6# Query_time: 56.959484  Lock_time: 0.000195 Rows_sent: 10  Rows_examined: 1337315SET timestamp=1552791747;select aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`

# Why Were 1,337,315 Rows Scanned?

We will query two data items: the number of matching rows and the number of rows after the `group by`operation.

`mysql> select count(*) from article_rank where day>=20181220 and day<=20181224;+----------+| count(*) |+----------+|   785102 |+----------+mysql> select count(distinct aid) from article_rank where day>=20181220 and day<=20181224;+---------------------+| count(distinct aid) |+---------------------+|              552203 |+---------------------+`

# Example Index

For ease of understanding, I first simulate a small portion of the data corresponding to the `idx_day_aid_pv`index according to the indexing rule.

# Viewing Optimizer Trace Information

`# 开启 optimizer_traceset optimizer_trace='enabled=on';# 执行 sql select aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;# 查看 trace 信息select trace from `information_schema`.`optimizer_trace`\G;`
`{  "join_execution": {    "select#": 1,    "steps": [      {        "creating_tmp_table": {          "tmp_table_info": {            "table": "intermediate_tmp_table",            "row_length": 20,            "key_length": 4,            "unique_constraint": false,            "location": "memory (heap)",            "row_limit_estimate": 838860          }        }      },      {        "converting_tmp_table_to_ondisk": {          "cause": "memory_table_size_exceeded",          "tmp_table_info": {            "table": "intermediate_tmp_table",            "row_length": 20,            "key_length": 4,            "unique_constraint": false,            "location": "disk (InnoDB)",            "record_format": "fixed"          }        }      },      {        "filesort_information": [          {            "direction": "desc",            "table": "intermediate_tmp_table",            "field": "num"          }        ],        "filesort_priority_queue_optimization": {          "limit": 10,          "rows_estimate": 1057,          "row_size": 36,          "memory_available": 262144,          "chosen": true        },        "filesort_execution": [        ],        "filesort_summary": {          "rows": 11,          "examined_rows": 552203,          "number_of_tmp_files": 0,          "sort_buffer_size": 488,          "sort_mode": "<sort_key, additional_fields>"        }      }    ]  }}`

# Analyze Fields in the Temporary Table

After the debugging with gdb, we find that the fields in the temporary table are aid and num.

`Breakpoint 1, trace_tmp_table (trace=0x7eff94003088, table=0x7eff94937200) at /root/newdb/mysql-server/sql/sql_tmp_table.cc:2306warning: Source file is more recent than executable.2306      trace_tmp.add("row_length",table->s->reclength).(gdb) p table->s->reclength\$1 = 20(gdb) p table->s->fields\$2 = 2(gdb) p (*(table->field+0))->field_name\$3 = 0x7eff94010b0c "aid"(gdb) p (*(table->field+1))->field_name\$4 = 0x7eff94007518 "num"(gdb) p (*(table->field+0))->row_pack_length()\$5 = 4(gdb) p (*(table->field+1))->row_pack_length()\$6 = 15(gdb) p (*(table->field+0))->type()\$7 = MYSQL_TYPE_LONG(gdb) p (*(table->field+1))->type()\$8 = MYSQL_TYPE_NEWDECIMAL(gdb)`

# Process Overview

1. Try to use the temporary table in `memory` on the heap to store data processed by `group by` and find that the memory is insufficient.
2. Create a temporary table with two fields: `aid` and `num (sum(pv) as num)`.
3. Insert one row from the index `idx_day_aid_pv` into the temporary table. The insert operation follows this rule: If the `aid` field does not exist, insert the data row directly; if that field already exists, add the `pv`values to `num`.
4. Iterate all the rows in the index `idx_day_aid_pv` from `20181220` to `20181224` and perform Step 3.
5. Perform sorting by `num` in the temporary table in the precedence queue.
6. Extract the 10 data rows that are finally present in the heap (the heap in the precedence queue) and directly return them as a result set without looking for them again in the table.
1. Extract 10 data rows from the temporary table (unsorted) and use the num and aid values as the 10 elements of a Min heap (that is, the smallest num value is the value of the root).
2. Extract one row and compare its num value with the value of the heap root node. If the num value is larger than the value of the heap root node, replace it. Then perform heap sort on the new heap.
3. Repeat Step 2 until row 552,203 is compared.

# Solution 1: Use the idx_aid_day_pv index

`# Query_time: 4.406927  Lock_time: 0.000200 Rows_sent: 10  Rows_examined: 1337315SET timestamp=1552791804;select aid,sum(pv) as num from article_rank force index(idx_aid_day_pv) where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`

# Example Index

For ease of understanding, I also simulate a small portion of the data corresponding to the `idx_aid_day_pv`index according to the indexing rule.

`mysql> explain select aid,sum(pv) as num from article_rank force index(idx_day_aid_pv) where day>=20181220 and day<=20181224 group by aid order by null limit 10;+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-------------------------------------------+| id | select_type | table        | partitions | type  | possible_keys                 | key            | key_len | ref  | rows   | filtered | Extra                                     |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-------------------------------------------+|  1 | SIMPLE      | article_rank | NULL       | range | idx_day_aid_pv,idx_aid_day_pv | idx_day_aid_pv | 4       | NULL | 404607 |   100.00 | Using where; Using index; Using temporary |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+--------+----------+-------------------------------------------+`
`mysql> explain select aid,sum(pv) as num from article_rank force index(idx_aid_day_pv) where day>=20181220 and day<=20181224 group by aid order by null limit 10;+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+------+----------+--------------------------+| id | select_type | table        | partitions | type  | possible_keys                 | key            | key_len | ref  | rows | filtered | Extra                    |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+------+----------+--------------------------+|  1 | SIMPLE      | article_rank | NULL       | index | idx_day_aid_pv,idx_aid_day_pv | idx_aid_day_pv | 12      | NULL |   10 |    11.11 | Using where; Using index |+----+-------------+--------------+------------+-------+-------------------------------+----------------+---------+------+------+----------+--------------------------+`
`# 开启optimizer_traceset optimizer_trace='enabled=on';# 执行 sql select aid,sum(pv) as num from article_rank force index(idx_aid_day_pv) where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;# 查看 trace 信息select trace from `information_schema`.`optimizer_trace`\G;`
`{  "join_execution": {    "select#": 1,    "steps": [      {        "creating_tmp_table": {          "tmp_table_info": {            "table": "intermediate_tmp_table",            "row_length": 20,            "key_length": 0,            "unique_constraint": false,            "location": "memory (heap)",            "row_limit_estimate": 838860          }        }      },      {        "filesort_information": [          {            "direction": "desc",            "table": "intermediate_tmp_table",            "field": "num"          }        ],        "filesort_priority_queue_optimization": {          "limit": 10,          "rows_estimate": 552213,          "row_size": 24,          "memory_available": 262144,          "chosen": true        },        "filesort_execution": [        ],        "filesort_summary": {          "rows": 11,          "examined_rows": 552203,          "number_of_tmp_files": 0,          "sort_buffer_size": 352,          "sort_mode": "<sort_key, rowid>"        }      }    ]  }}`
1. Create a temporary table with two fields: `aid` and `num (sum(pv) as num)`.
2. Read a row in the `idx_aid_day_pv` and see if it matches the criterion. If the `day` field value is not in the range `(20181220 - 20181224)`, read the next row; if the `day` filed value is in the range, accumulate the pv values (this operation is not to be performed in the temporary table).
3. Read the next row in the `idx_aid_day_pv` index. If the `aid` value is consistent with that shown in Step 1 and meets the criterion, accumulate the pv values (this operation is not to be performed in the temporary table). If the `aid` value is not consistent, write the preceding result set into the temporary table.
4. Perform Step 2 and Step 3 cyclically until all the rows of the `idx_aid_day_pv` index are scanned.
5. Perform sorting by `num` in the temporary table in the precedence queue.
6. Obtain the records again in the table (the temporary table) based on the first 10 `rowids` and return the result set.
1. Extract 10 data rows from the temporary table (unsorted) and use the `num` and `rowid` values as the 10 elements of a Min heap (that is, the smallest num value is the value of the root).
2. Extract one row and compare its num value with the value of the heap root node. If the num value is larger than the value of the heap root node, replace it. Then perform heap sort on the new heap.
3. Repeat Step 2 until row 552,203 is compared.

# Solution Feasibility

I have found that if I add a row with the day value of `20181219`, the scanning index also reads that row, although it does not meet the criterion. Because in this test all the data is within the period of the five days from `20181220` to `20181224`, the scanned rows happen to be all the data rows in the table.

`drop procedure if exists idata;delimiter ;;create procedure idata()begin  declare i int;  declare aid int;  declare pv int;  declare post_day int;  set i=1;  while(i<=785102)do    set aid = round(rand()*500000);    set pv = round(rand()*100);    set post_day = 20181225 + i%5;    insert into article_rank (`aid`,`pv`,`day`) values(aid, pv, post_day);    set i=i+1;  end while;end;;delimiter ;call idata();# Query_time: 9.151270  Lock_time: 0.000508 Rows_sent: 10  Rows_examined: 2122417SET timestamp=1552889936;select aid,sum(pv) as num from article_rank force index(idx_aid_day_pv) where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`

# Solution 2: Increase the Maximum Space Limit of the Temporary Table

The default space of a temporary table is 16 MB.

`mysql> show global variables like '%table_size';+---------------------+----------+| Variable_name       | Value    |+---------------------+----------+| max_heap_table_size | 16777216 || tmp_table_size      | 16777216 |+---------------------+----------+`
`set tmp_table_size=33554432;set max_heap_table_size=33554432;# Query_time: 5.910553  Lock_time: 0.000210 Rows_sent: 10  Rows_examined: 1337315SET timestamp=1552803869;select aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`

# Solution 3: Use SQL_BIG_RESULT

Due to the large number of query results, we specify in the optimizer that the temporary table uses the disk storage directly.

`# Query_time: 6.144315  Lock_time: 0.000183 Rows_sent: 10  Rows_examined: 2122417SET timestamp=1552802804;select SQL_BIG_RESULT aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`

# Conclusion

Solution 1 does not provide stable performance. When the data rows in the entire table is the same as the number of rows in the query range and the temporary table size limit is not reached, this solution provides the best performance. The higher the ratio of the query data to the total data volume of the entire table, the lower the optimization.

`# SQL1select aid,sum(pv) as num from article_rank where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;# SQL2select aid,sum(pv) as num from article_rank force index(idx_aid_day_pv) where day>=20181220 and day<=20181224 group by aid order by num desc limit 10;`
1. SQL1 uses full-field sorting and does not require the retrieval of the data in the table again. Then why do we need to add 10 to obtain the total number of scanned rows?
2. Since the number of rows after the `group by` operation is `552,203` for both SQL1 and SQL2, why does SQL1 have the problem of insufficient memory? What are details behind this problem?
3. `creating_tmp_table.tmp_table_info.row_limit_estimate` in trace shows `838,860`; this calculation is based on the temporary table size limit (`16 MB`). Because one row occupies 20 bytes, the total number of rows that can be included is `838860 (16777216/20)`. However, we actually need to put `785,102` rows into the temporary table. Why?
4. Why does the number of rows to be scanned in the original table need to be multiplied by 2 after SQL1 uses `SQL_BIG_RESULT` for optimization? Why does writing into the temporary table in the memory alone lead to a 10× performance difference?
5. From the source code, I find that the number of scanned rows shown in trace is often not the actual number of rows. I do not understand why the actual number of scanned rows or the real capacity is not recorded in trace, for example, the number of scanned rows in SQL1: `filesort_priority_queue_optimization.rows_estimate`. I see the calculation rule in gdb, as shown in the following figure.
6. Can we use some tool to count the I/O during the execution of the SQL statement?

# Original Source

--

-- ## Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com