Buy Me a Coffee

Tuesday, September 27, 2016

hung_task_timeout_secs and blocked for more than 120 seconds problem


Today we experienced an issue on Debian server, the file system hanged external mounts failed.

We saw "hung_task_timeout_secs" errors in /var/log/syslog:

INFO: task jbd2/dm-47-8:6937 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-47-8  D 000000000000000b     0  6937      2 0x00000080
 ffff8a1fd1363d20 0000000000000046 0000000000016700 0000000000016700
 ffff8a1fd34bd800 0000000000016700 0000000000016700 ffff8a1fd0493540
 ffff8a1fd0493af8 ffff8a1fd1363fd8 000000000000fb88 ffff8a1fd0493af8
Call Trace:
 [<ffffffff81096f8e>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa025a7cf>] jbd2_journal_commit_transaction+0x19f/0x14b0 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff8105e759>] ? find_busiest_queue+0x69/0x150
 [<ffffffff81080fcc>] ? lock_timer_base+0x3c/0x70
 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0260f38>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0260e80>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81096936>] kthread+0x96/0xa0
 [<ffffffff8100c0ca>] child_rip+0xa/0x20
 [<ffffffff810968a0>] ? kthread+0x0/0xa0
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Kernel panic - not syncing: hung_task: blocked tasks

These  messages related to hardware failure or a software bug in the operating system: “A kernel may also go into panic() if it is unable to locate a root file system. 
During the final stages of kernel userspace initialization, a panic is typically triggered if the spawning of init fails, as the system would then be unusable.”

The explanation for this is that by default Linux uses up to 40% of the available memory for file system caching. After this mark has been reached the file system flushes all outstanding data to disk causing all following IOs going synchronous. For flushing out this data to disk this there is a time limit of 120 seconds by default. In the case here the IO subsystem is not fast enough to flush the data withing 120 seconds. As IO subsystem responds slowly and more requests are served, System Memory gets filled up resulting in the above error.

Two possible solutions:
1. Change the I/O scheduler – not related to our problem since the hang wasn’t related to /root filesystem.
Here is a solution from Redhat

2. Configure the vm.dirty_ratio (20% default) and vm.dirty_backgroud_ratio (10% default) to smaller values (10 and 5) so dumping data to disk will be faster.

Here is how to do it:

Change vm.dirty_ratio and vm.dirty_backgroud_ratio
someuser@servercore [/home/someuser]$ sudo sysctl -w vm.dirty_ratio=10
someuser@servercore [/home/someuser]$ sudo sysctl -w vm.dirty_background_ratio=5
Commit Change
someuser@servercore [/home/someuser]# sudo sysctl -p

Make it permanent

When the server seemed more stable and no Kernel/Swap/Memory Panic for a week, I edited /etc/sysctl.conf file to make these permanent after reboot.
someuser@servercore [/home/someuser]$ sudo vi /etc/sysctl.conf
ADD 2 lines at the bottom
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
Save and exit.
someuser@servercore [/home/someuser]$ sudo reboot

Source 2: