MySQL Practices: How to Handle Slow SQL Execution Due to Time Zone Setting

Image for post
Image for post

By Tian Jie

Long database query times always trigger application access problems. However, we rarely pay attention to slow SQL execution caused by time zone setting. Thanks to one customer’s meticulousness and persistence, we have found the cause of the problem.

1) Terms

Image for post
Image for post

2) Symptoms

From 22:03:00 to 22:04:00 on March 19, 2020, a large number of active connections of MySQL 8.0 instances were accumulated, a large number of low-cost queries were recorded in slow logs, and the CPU utilization was low but the sys CPU utilization fluctuated abnormally.

Image for post
Image for post
Image for post
Image for post

3) Troubleshooting

3.1) Operating System

The following factors may cause lag:

  • Fluctuations on the operating system of the physical machine (troubleshoot based on the IO_WAIT metric)
  • MySQL mechanism

3.2) MySQL

After we exclude exceptions on the operating system, we start to analyze the call stacks of the MySQL process.

To better analyze MySQL operations, Alibaba databases provide the Bian Que system to track, count, and display the internal method calls of a specified process.

Image for post
Image for post

As shown in the preceding figure, 40.5% of the CPU time is consumed to call the Time_zone_system::gmt_sec_to_TIME() method, and the code is as follows:

void Time_zone_system::gmt_sec_to_TIME(MYSQL_TIME *tmp, my_time_t t) const {  struct tm tmp_tm;  time_t tmp_t = (time_t)t;  localtime_r(&tmp_t, &tmp_tm);  localtime_to_TIME(tmp, &tmp_tm);  tmp->time_type = MYSQL_TIMESTAMP_DATETIME;  adjust_leap_second(tmp);}

According to the code, note that localtime_to_TIME() and adjust_leap_second() are simple format conversion and computation operations and do not involve system calls.

localtime_r() involves the __localtime_r() method in the GNU C Library (glibc), and the code is as follows:

/* Return the `struct tm' representation of *T in local time,   using *TP to store the result.  */struct tm *__localtime_r (t, tp)     const time_t *t;     struct tm *tp;{  return __tz_convert (t, 1, tp);}weak_alias (__localtime_r, localtime_r)

The code for implementing __tz_convert() is as follows:

/* Return the `struct tm' representation of *TIMER in the local timezone. Use local time if USE_LOCALTIME is nonzero, UTC otherwise.  */struct tm *__tz_convert (const time_t *timer, int use_localtime, struct tm *tp){long int leap_correction;int leap_extra_secs;if (timer == NULL)
{
__set_errno (EINVAL);
return NULL;
}
__libc_lock_lock (tzset_lock);
/* Update internal database according to current TZ setting.
POSIX.1 8.3.7.2 says that localtime_r is not required to set tzname.
This is a good idea since this allows at least a bit more parallelism. */
tzset_internal (tp == &_tmbuf && use_localtime, 1);
if (__use_tzfile)
__tzfile_compute (*timer, use_localtime, &leap_correction,
&leap_extra_secs, tp);
else
{
if (! __offtime (timer, 0, tp))
tp = NULL;
else
__tz_compute (*timer, tp, use_localtime);
leap_correction = 0L;
leap_extra_secs = 0;
}
if (tp)
{
if (! use_localtime)
{
tp->tm_isdst = 0;
tp->tm_zone = "GMT";
tp->tm_gmtoff = 0L;
}
if (__offtime (timer, tp->tm_gmtoff - leap_correction, tp))
tp->tm_sec += leap_extra_secs;
else
tp = NULL;
}
__libc_lock_unlock (tzset_lock);
return tp;
}

According to the code, lock and unlock operations are performed. The code for __libc_lock_lock() is as follows:

#if IS_IN (libc) || IS_IN (libpthread)# ifndef __libc_lock_lock#  define __libc_lock_lock(NAME) \  ({ lll_lock (NAME, LLL_PRIVATE); 0; })# endif#else# undef __libc_lock_lock# define __libc_lock_lock(NAME) \  __libc_maybe_call (__pthread_mutex_lock, (&(NAME)), 0)#endif

The code for lll_lock() is as follows:

static inline void
__attribute__ ((always_inline))
__lll_lock (int *futex, int private)
{
int val = atomic_compare_and_exchange_val_24_acq (futex, 1, 0);
if (__glibc_unlikely (val ! = 0))
{
if (__builtin_constant_p (private) && private == LLL_PRIVATE)
__lll_lock_wait_private (futex);
else
__lll_lock_wait (futex, private);
}
}
#define lll_lock(futex, private) __lll_lock (&(futex), private)

According to the code, atomic_compare_and_exchange_val_24_acq() tries to lock futex.

As a memory area shared by multiple threads, futex triggers system calls and enters the system mode upon the contention of multiple client threads (multiple sessions and queries). This increases the sys CPU utilization.

In addition, the protection lock mechanism of the critical zone limits the concurrency of the time zone conversion method __tz_convert(). As a result, multiple sessions and queries are waiting to acquire the lock to enter the critical zone, which may cause lags upon conflicts or fierce contention.

But what triggers Time_zone_system::gmt_sec_to_TIME() calls?

The code of the Field_timestampf::get_date_internal() method is as follows:

bool Field_timestampf::get_date_internal(MYSQL_TIME *ltime) {
THD *thd = table ? table->in_use : current_thd;
struct timeval tm;
my_timestamp_from_binary(&tm, ptr, dec);
if (tm.tv_sec == 0) return true;
thd->time_zone()->gmt_sec_to_TIME(ltime, tm);
return false;
}

The virtual function gmt_sec_to_TIME() of base class Time_zone is called here to convert UTC seconds since Epoch to time format. As the name of the Field_timestampf::get_date_internal() method implies, the query should involve access to the timestamp data type.

Based on the preceding speculation, let’s verify the slow query and its data type.

# Slow query
SELECT
id,
......
create_time, update_time,
......
FROM mytab
WHERE duid IN (?,?,?,?,? )
and (state in (2, 3)
or ptype ! =0)
# Queried table
CREATE TABLE `mytab` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`duid` char(32) NOT NULL,
......
`state` tinyint(2) unsigned NOT NULL DEFAULT '0',
`ptype` tinyint(4) NOT NULL DEFAULT '0',
`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
......,
PRIMARY KEY (`id`),
) ENGINE=InnoDB

According to the preceding information, both the create_time and update_time fields are timestamps.

4) Troubleshooting

According to the previous analysis, the futex lock contention on the operating system due to the Time_zone_system::gmt_sec_to_TIME() method causes low-cost queries to be slow. To avoid calling the method, change the value of the time_zone parameter from the system to the local time zone in the instance console. For example, change the value to +8:00.

Then the Time_zone_offset::gmt_sec_to_TIME() method is called to compute MySQL data. This avoids locking and unlocking in the operating system caused by access to glibc functions.

A performance comparison before and after this modification is shown in the charts below (by comparing the time needed to query timestamps the same number of times).

About 15 minutes when time_zone is set to the system:

Image for post
Image for post

About 5 minutes when time_zone is set to +8:00:

Image for post
Image for post

5) Best Practices

If a high-concurrency application involves high-frequency access to timestamps:

  • If the data type is the timestamp, we recommend setting the time_zone parameter to a value in the UTC or GMT offset format in the console, such as +8:00. This effectively reduces the execution overhead of highly concurrent queries as well as the response time (RT).
  • After MySQL 5.7, Datetime types contain the default value of timestamp types and support the on update current_timestamp property. Therefore, we recommend using Datetime types instead of Timestamp types.

Original Source:

Written by

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store