The problem with kernel_mutex in MySQL 5.1 and MySQL 5.5 is well known: Bug report. MySQL 5.6 contains some fixes that are supposed to address it, but MySQL 5.6 still has a long way to go before it is production-ready, and it is not yet clear whether the problem is really fixed.
Meanwhile the kernel_mutex problem keeps coming up: I had three customer cases related to this performance drop during the last month.
So what can be done there ? Let’s run some benchmarks.
But first, some theory. InnoDB takes kernel_mutex when it starts and stops transactions, and when InnoDB starts a transaction it usually loops through ALL active transactions while holding kernel_mutex. That is, to see kernel_mutex in action we need many concurrent but short transactions.
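The scaling issue described above can be sketched in Python. This is a simplified model, not InnoDB's actual code; the names `kernel_mutex` and `active_transactions` just mirror InnoDB's kernel_mutex and transaction list. The point is that the critical section grows with the number of concurrent transactions:

```python
import threading

kernel_mutex = threading.Lock()  # models InnoDB's single global kernel_mutex
active_transactions = []         # models the global list of active transactions

def start_transaction(trx_id):
    # Starting a transaction takes the global mutex and scans ALL
    # active transactions, so the time spent inside this critical
    # section grows linearly with concurrency.
    with kernel_mutex:
        for trx in active_transactions:  # O(N) scan under the global lock
            pass
        active_transactions.append(trx_id)

def commit_transaction(trx_id):
    # Stopping a transaction takes the same global mutex.
    with kernel_mutex:
        active_transactions.remove(trx_id)
```

With many short transactions, nearly every statement passes through this one lock, which is exactly the contention pattern the benchmark below provokes.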
For this we will use sysbench running only simple primary-key SELECT queries against 48 tables with 5,000,000 rows each.
Hardware is Cisco UCS C250 server. The workload is read-only and fully in memory.
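For reference, a sysbench 0.5 invocation of roughly this shape matches the setup described above. The table count and table size come from the text; the host, credentials, run time, and thread count here are assumptions you would adjust for your own environment:

```shell
# Hypothetical sysbench 0.5 run: point SELECTs by primary key
# against 48 tables of 5M rows each (requires a loaded server).
sysbench --test=tests/db/select.lua \
  --oltp-tables-count=48 --oltp-table-size=5000000 \
  --mysql-host=127.0.0.1 --mysql-user=root \
  --num-threads=32 --max-time=300 --max-requests=0 run
```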
Here are the results for different thread counts (against Percona Server 5.5.17):
Threads | Throughput, q/s |
---|---|
1 | 11178.34 |
2 | 27741.06 |
4 | 53364.52 |
8 | 92546.73 |
16 | 144619.58 |
32 | 164884.03 |
64 | 154235.73 |
128 | 147456.33 |
256 | 68369.02 |
512 | 40509.67 |
1024 | 22166.94 |
The peak throughput is 164884 q/s at 32 threads, and it declines to 68369 q/s at 256 threads, which is a 2.4x drop.
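The drop factor quoted above is just the ratio of the two throughputs from the table:

```python
# Throughput figures taken from the table above
peak_qps = 164884.03   # 32 threads
qps_256 = 68369.02     # 256 threads

drop = peak_qps / qps_256
print(round(drop, 1))  # 2.4
```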
The reason, as you may guess, is kernel_mutex. How can you see it? It is easy: in the SHOW ENGINE INNODB STATUS\G output
you will see a lot of lines like:
```
--Thread 140370743510784 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140370752542464 has waited at trx0trx.c line 1772 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140088222295808 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140370746922752 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140088223500032 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140088231528192 has waited at trx0trx.c line 795 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
...
```
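Rather than eyeballing those wait lines, it can help to aggregate them by source location and mutex name. A small Python sketch (the regex is my own, not a Percona tool) that counts waits per (file, line, mutex) triple:

```python
import re

# Sample SHOW ENGINE INNODB STATUS semaphore output, as shown above
STATUS_SNIPPET = """\
--Thread 140370743510784 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140370752542464 has waited at trx0trx.c line 1772 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
--Thread 140088222295808 has waited at trx0trx.c line 1184 for 0.0000 seconds the semaphore: Mutex at 0x2b0ccc8 '&kernel_mutex', lock var 1 waiters flag 0
"""

WAIT_RE = re.compile(r"waited at (\S+) line (\d+).*'&(\w+)'")

def count_mutex_waits(status_text):
    """Count semaphore waits per (file, line, mutex) triple."""
    counts = {}
    for m in WAIT_RE.finditer(status_text):
        key = (m.group(1), m.group(2), m.group(3))
        counts[key] = counts.get(key, 0) + 1
    return counts
```

If the top entries are all kernel_mutex waits in trx0trx.c, you are looking at this problem.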
This problem is actually quite serious. In real workloads I have seen it happen with fewer than 256 threads, and not all production systems can tolerate a 2x drop in throughput at peak times.
So what can be done there ?
As a first try, let's recall that kernel_mutex (like all InnoDB mutexes) has complex handling with spin loops, and there are two variables that affect the mutex loops: innodb_sync_spin_loops and innodb_spin_wait_delay. I actually think that tuning a system with these variables is closer to dancing with a drum than to a scientific method, but since nothing else helps, why not try.
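Conceptually, InnoDB mutex acquisition spins for up to innodb_sync_spin_loops rounds, busy-waiting a random multiple of innodb_spin_wait_delay between probes, before giving up and blocking in the OS (via the sync array). A toy Python model of that spin-then-wait pattern, not InnoDB's actual implementation:

```python
import random
import threading

SYNC_SPIN_LOOPS = 30   # analogue of innodb_sync_spin_loops (default 30)
SPIN_WAIT_DELAY = 6    # analogue of innodb_spin_wait_delay (default 6)

class SpinThenWaitMutex:
    """Toy model: spin briefly hoping the holder releases soon,
    then fall back to blocking in the OS."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        for _ in range(SYNC_SPIN_LOOPS):
            if self._lock.acquire(blocking=False):
                return                      # got it while spinning: no OS sleep
            # busy-wait a random multiple of the base delay between probes
            for _ in range(random.randrange(SPIN_WAIT_DELAY) + 1):
                pass
        self._lock.acquire()                # give up spinning and block

    def release(self):
        self._lock.release()

# Usage: the mutex still serializes correctly under contention
mutex = SpinThenWaitMutex()
counter = 0

def worker():
    global counter
    for _ in range(1000):
        mutex.acquire()
        counter += 1
        mutex.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000
```

Raising the spin loop count trades CPU cycles for a better chance of grabbing the mutex without going to sleep, which is relevant when the hold times are very short, as they are for kernel_mutex.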
Here we vary innodb_sync_spin_loops from 0 to 100 (the default is 30).
I was surprised to see that with innodb_sync_spin_loops=100 we can improve throughput to 145324 q/s, almost back to the peak throughput from the first experiment.
With innodb_sync_spin_loops=100, kernel_mutex is still the main point of contention, but InnoDB keeps the contending threads spinning rather than putting them to sleep, and that seems to help.
Further experiments showed that 100 is not enough for 512 threads, and it should be increased to 200.
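If you want to apply the same change, innodb_sync_spin_loops is a dynamic variable, so something like the following works; the value 200 is what worked in this benchmark, and you should pick yours based on your own testing:

```sql
-- Apply at runtime (innodb_sync_spin_loops is dynamic, no restart needed):
SET GLOBAL innodb_sync_spin_loops = 200;
-- To persist across restarts, add to my.cnf under [mysqld]:
--   innodb_sync_spin_loops = 200
```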
So here are the final results with innodb_sync_spin_loops=200 for 1-1024 threads.
Threads | Throughput, q/s (default) | Throughput, q/s (spin loops = 200) |
---|---|---|
1 | 11178.34 | 11288.42 |
2 | 27741.06 | 28387.62 |
4 | 53364.52 | 53575.52 |
8 | 92546.73 | 92184.65 |
16 | 144619.58 | 143688.91 |
32 | 164884.03 | 164392.94 |
64 | 154235.73 | 154022.57 |
128 | 147456.33 | 152280.84 |
256 | 68369.02 | 150089.31 |
512 | 40509.67 | 127680.65 |
1024 | 22166.94 | 61507.08 |
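The gain at high concurrency can be computed directly from the two columns of the table:

```python
# Throughput figures from the table above (default vs spin loops = 200)
qps_256_default, qps_256_tuned = 68369.02, 150089.31
qps_512_default, qps_512_tuned = 40509.67, 127680.65

print(round(qps_256_tuned / qps_256_default, 2))  # 2.2x at 256 threads
print(round(qps_512_tuned / qps_512_default, 2))  # 3.15x at 512 threads
```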
So by tuning this single variable we can more than double throughput at high concurrency, bringing it close to the level we see at 32-64 threads.
I cannot really explain how this works internally, but I wanted to show one possible way to deal with the problem when you are hit by kernel_mutex contention.
As further directions, I want to try limiting innodb_thread_concurrency, binding mysqld to fewer CPUs, and checking whether MySQL 5.6.3 really fixes this problem.
The post kernel_mutex problem. Or double throughput with single variable appeared first on MySQL Performance Blog.