Tuesday, September 27, 2016

hung_task_timeout_secs and blocked for more than 120 seconds problem





Hi

Today we experienced an issue on a Debian server: the file system hung and external mounts failed.

We saw "hung_task_timeout_secs" errors in /var/log/syslog:


INFO: task jbd2/dm-47-8:6937 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-47-8  D 000000000000000b     0  6937      2 0x00000080
 ffff8a1fd1363d20 0000000000000046 0000000000016700 0000000000016700
 ffff8a1fd34bd800 0000000000016700 0000000000016700 ffff8a1fd0493540
 ffff8a1fd0493af8 ffff8a1fd1363fd8 000000000000fb88 ffff8a1fd0493af8
Call Trace:
 [<ffffffff81096f8e>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa025a7cf>] jbd2_journal_commit_transaction+0x19f/0x14b0 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff8105e759>] ? find_busiest_queue+0x69/0x150
 [<ffffffff81080fcc>] ? lock_timer_base+0x3c/0x70
 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0260f38>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0260e80>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81096936>] kthread+0x96/0xa0
 [<ffffffff8100c0ca>] child_rip+0xa/0x20
 [<ffffffff810968a0>] ? kthread+0x0/0xa0
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Kernel panic - not syncing: hung_task: blocked tasks
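
To see which tasks were reported and how often, the messages can be pulled out of the log. This is a minimal sketch that assumes the message format shown above; the function name blocked_tasks is my own and not a standard tool.

```shell
#!/bin/sh
# Count hung-task reports per task name.
# Reads log text on stdin, so it works with any log file or with dmesg output.
blocked_tasks() {
    # Matches lines such as:
    #   INFO: task jbd2/dm-47-8:6937 blocked for more than 120 seconds.
    # The greedy match keeps colons that are part of the task name and
    # drops only the trailing ":PID".
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than [0-9][0-9]* seconds.*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Example (path as used in this post; adjust for your system):
#   blocked_tasks < /var/log/syslog
```

Each output line is a count followed by the task name, most frequent first.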


These messages are usually related to a hardware failure or a software bug in the operating system: “A kernel may also go into panic() if it is unable to locate a root file system.
During the final stages of kernel userspace initialization, a panic is typically triggered if the spawning of init fails, as the system would then be unusable.”

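The explanation is that Linux buffers writes in memory as dirty page cache, by default up to a fixed percentage of available memory (40% on older kernels; see vm.dirty_ratio below). Once that mark is reached, processes that write are forced to flush outstanding data to disk, so all following I/O becomes effectively synchronous. The 120 seconds in the message is not a flush deadline: it is the threshold of the kernel's hung-task detector (hung_task_timeout_secs), which logs a warning for any task stuck in uninterruptible sleep (state D) for longer than that. In our case the I/O subsystem was not fast enough to write the data out within 120 seconds, so the journal thread (jbd2) waiting on it was reported as blocked. As the I/O subsystem responds slowly and more write requests arrive, memory fills with dirty data, resulting in the error above.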
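
To get a feel for how much data can pile up before writers are throttled, here is a rough worked example. It is only a sketch: the 64 GiB host is hypothetical, the helper name dirty_limit_mib is mine, and the kernel applies the ratio to available (free plus reclaimable) memory rather than to total memory, so the real limit is somewhat lower.

```shell
#!/bin/sh
# Convert a dirty-ratio percentage into an approximate size in MiB.
# Usage: dirty_limit_mib RATIO_PERCENT MEMORY_KIB
dirty_limit_mib() {
    echo $(( $2 * $1 / 100 / 1024 ))
}

# Hypothetical host with 64 GiB (67108864 KiB) of memory:
dirty_limit_mib 20 67108864   # vm.dirty_ratio = 20 -> prints 13107
dirty_limit_mib 10 67108864   # vm.dirty_ratio = 10 -> prints 6553
```

On such a host, halving the ratio halves the backlog that slow storage has to absorb in one go, from roughly 13 GiB to roughly 6.5 GiB.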

Two possible solutions:
1. Change the I/O scheduler. This did not apply to our problem, since the hang wasn't related to the root (/) filesystem.
Here is a solution from Red Hat: https://access.redhat.com/solutions/408833

2. Configure vm.dirty_ratio (20% default) and vm.dirty_background_ratio (10% default) to smaller values (10 and 5) so that dirty data is flushed to disk earlier and in smaller batches.

Here is how to do it:


Change vm.dirty_ratio and vm.dirty_background_ratio
someuser@servercore [/home/someuser]$ sudo sysctl -w vm.dirty_ratio=10
someuser@servercore [/home/someuser]$ sudo sysctl -w vm.dirty_background_ratio=5
Commit the change (sysctl -w already takes effect immediately; sysctl -p re-reads /etc/sysctl.conf)
someuser@servercore [/home/someuser]$ sudo sysctl -p
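
To check what is actually in effect, the same values can be read back from /proc/sys, where each sysctl key maps to a file with the dots replaced by slashes. This is a small sketch; the helper name sysctl_path is my own.

```shell
#!/bin/sh
# Map a sysctl key to its file under /proc/sys (dots become slashes).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print the current value of each setting discussed in this post, if readable.
for key in vm.dirty_ratio vm.dirty_background_ratio kernel.hung_task_timeout_secs; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        printf '%s = %s\n' "$key" "$(cat "$path")"
    else
        printf '%s: not available on this system\n' "$key"
    fi
done
```

Running sysctl -n with the key name should report the same numbers.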

Make it permanent

Once the server had been stable for a week, with no kernel, swap, or memory panics, I edited /etc/sysctl.conf to make the settings persist across reboots.
someuser@servercore [/home/someuser]$ sudo vi /etc/sysctl.conf
Add these 2 lines at the bottom:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
Save and exit.
someuser@servercore [/home/someuser]$ sudo reboot
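
If you prefer not to edit the file by hand (for example when applying this to several servers), the edit can be scripted. The sketch below is one possible way and not what I ran on the server: the function name set_sysctl_conf is hypothetical, and it replaces an existing entry for the key instead of appending a duplicate.

```shell
#!/bin/sh
# Set "KEY = VALUE" in a sysctl-style config file, replacing any existing
# assignment of KEY and appending the new one at the bottom.
# Usage: set_sysctl_conf FILE KEY VALUE   (FILE must already exist)
set_sysctl_conf() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of KEY.
    awk -v k="$key" '
        { line = $0; sub(/^[ \t]+/, "", line); split(line, parts, /[ \t]*=/) }
        parts[1] != k { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (run as root):
#   set_sysctl_conf /etc/sysctl.conf vm.dirty_background_ratio 5
#   set_sysctl_conf /etc/sysctl.conf vm.dirty_ratio 10
```

Afterwards sudo sysctl -p loads the file without waiting for a reboot.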
Source: http://blog.ronnyegner-consulting.de/2011/10/13/info-task-blocked-for-more-than-120-seconds/comment-page-1/

Source 2: https://www.blackmoreops.com/2014/09/22/linux-kernel-panic-issue-fix-hung_task_timeout_secs-blocked-120-seconds-problem/


Some Linux tools




The list below is a collection of Linux tools I use for performance analysis.
The syntax of each command and tool can be retrieved via "man COMMAND".

Process statistics:
  /sys filesystem - documentation is in the kernel source under Documentation/
  debugfs filesystem (usually mounted at /sys/kernel/debug) - documentation is in the kernel source under Documentation/
  sar - Collect, report, or save system activity information.
  sadc - System activity data collector.
  sa1 - Collect and store binary data in the system activity daily data file.
  sa2 - Write a daily report in the /var/log/sysstat directory.
  sadf - Display data collected by sar in multiple formats.
  isag - Interactive System Activity Grapher
  mpstat - Report processors related statistics.
  vmstat - Report virtual memory statistics
  dstat - versatile tool for generating system resource statistics
  pidstat - Report statistics for Linux tasks.
  top - display Linux processes
  htop - interactive process viewer
  proc - process information pseudo-filesystem
  ps - report a snapshot of the current processes.
  pstree - display a tree of processes
  pgrep, pkill - look up or signal processes based on name and other attributes
  killall - kill processes by name
  killall5 - send a signal to all processes.
  kill - send a signal to a process
  nice - run a program with modified scheduling priority
  renice - alter priority of running processes
  snice, skill - send a signal or report process status
  atop - AT Computing's System & Process Monitor
  atopsar - AT Computing's System Activity Report (atop related)
  saidar - a curses-based tool for viewing system statistics
  nmon - systems administrator, tuner, benchmark tool
  nproc - show number of processors
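
As a small example of combining these tools: tasks stuck in uninterruptible sleep (state D, the state shown in hung-task reports) can be picked out of ps output. This is a sketch; the function name d_state is my own.

```shell
#!/bin/sh
# Print processes whose state starts with D (uninterruptible sleep).
# Expects "STATE PID COMMAND" lines on stdin.
d_state() {
    awk '$1 ~ /^D/ { print }'
}

# Example:
#   ps -e -o state= -o pid= -o comm= | d_state
```

A process that shows up here briefly is normal; one that stays in the list for minutes is usually waiting on slow or failed I/O.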

How to get os version:
/etc/debian_version
/etc/lsb-release
/etc/issue
/etc/issue.net
lsb_release - print distribution-specific information
uname - print system information (and kernel version)
unity - wrapper for starting the unity shell and handling fallback

memory:
free - Display amount of free and used memory in the system
smem - Report memory usage with shared memory divided proportionally.

shared libraries:
sprof - read and display shared object profiling data

system:
uptime - Tell how long the system has been running.

locking and deadlocks:
mutrace - traces lock contention (has no manual page)
https://wiki.linaro.org/Platform/DevPlatform/Tools/Mutrace
http://0pointer.de/blog/projects/mutrace.html

disk tools:
  df - report file system disk space usage
  du - estimate file space usage
  smartctl - Control and Monitor Utility for SMART Disks
  GSmartControl - Hard disk drive health inspection tool
  hddtemp - Utility to monitor hard drive temperature
  swapon, swapoff - enable/disable devices and files for paging and swapping
  fdisk - manipulate disk partition table
  cfdisk - display or manipulate a disk partition table
  sfdisk - partition table manipulator for Linux
  partx - tell the Linux kernel about the presence and numbering of on-disk partitions
  addpart - a simple wrapper around the "add partition" ioctl
  delpart - simple wrapper around the "del partition" ioctl
  partprobe - inform the OS of partition table changes
  GNU Parted - a partition manipulation program
  blkid - locate/print block device attributes
  findfs - find a filesystem by label or UUID
  wipefs - wipe a signature from a device

io tools:
  iostat - Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions.
  iotop - simple top-like I/O monitor
  ionice - set or get process I/O scheduling class and priority
  hdparm - get/set SATA/IDE device parameters
  sdparm - access SCSI modes pages; read VPD pages; send simple SCSI commands.
  lsscsi - list SCSI devices (or hosts) and their attributes
  blktool - Display or change block device settings
  blktrace - generate traces of the i/o traffic on block devices
  btrace - perform live tracing for block devices
  blkiomon - monitor block device I/O based on blktrace data
  blkrawverify - verifies an output file produced by blkparse
  verify_blkparse - verifies an output file produced by blkparse
  blkparse - produce formatted output of event streams of block devices
  btt - analyse block i/o traces produced by blktrace
  bno_plot - generate interactive 3D plot of IO blocks and sizes
  btreplay - recreate IO loads recorded by blktrace
  btrecord - recreate IO loads recorded by blktrace
  ioreplay - IO traces replayer
  ioprofiler - graphical viewer for io logs (same package as ioreplay, ioapps; does not have a manual page)
/proc/scsi/scsi - /proc filesystem area where you can find scsi info

network filesystems:
  nfsiostat - Emulate iostat for NFS mount points using /proc/self/mountstats
  cifsiostat - Report CIFS statistics.
  mountstats - Displays NFS client per-mount statistics
  nfsstat - list NFS statistics
To mount every filesystem listed in /etc/fstab that is not already mounted, type: mount -a
Force (lazy) unmount: umount -l /mount/point

cpu:
  taskset - retrieve or set a process's CPU affinity
  cpulimit - limits the CPU usage of a process
  schedtool - query and set CPU scheduling parameters
  schedtop - scheduler statistics visualization tool (doesn't have a package in ubuntu yet, see https://rt.wiki.kernel.org/index.php/Schedtop_utility)
  sensors - print sensors information
  cpuid - Dump CPUID information for each CPU

process ids:
  pidof - find the process ID of a running program.
  strace - trace system calls and signals
  ltrace - A library call tracer
  sotruss - trace shared library calls through PLT
  pstack - print a stack trace of running processes

X11:
  xtrace - trace communication between X11 client and server

used files:
  fuser - identify processes using files or sockets
  lsof - list open files

debuggers:
  valgrind - a suite of tools for debugging and profiling programs

profilers:
  gprof - display call graph profile data
  valgrind - a suite of tools for debugging and profiling programs
  google-pprof - manual page for google-pprof (part of gperftools)

core dumps:
  CGDB - curses based frontend to GDB
  eclipse - extensible tool platform and Java IDE
  xxgdb - X window system interface to the gdb debugger.
  qtcreator - Integrated Development Environment for Qt
  gdb - The GNU Debugger
  gdbtui - does not have a manual page of its own; comes from the same package as gdb (gdb)
  gcore - Generate a core file of a running program
  resolve_stack_dump - resolve numeric stack trace dump to symbols

elfs:
  pahole - Shows and manipulates data structure layout.

kernel:
  crash - Analyze Linux crash dump data or a live system
  slabtop - display kernel slab cache information in real time
  kerneltop - shows kernel function usage in an interactive style like 'top'
  kexec - directly boot into a new kernel
  sysctl - configure kernel parameters at runtime
  getconf - Query system configuration variables
  arch - print machine hardware name (same as uname -m)

modules:
  lsmod - Show the status of modules in the Linux Kernel
  modinfo - Show information about a Linux Kernel module
  rmmod - Simple program to remove a module from the Linux Kernel
  insmod - Simple program to insert a module into the Linux Kernel
  modprobe - Add and remove modules from the Linux Kernel
  depmod - Generate modules.dep and map files.

networking:
  hostname - show or set the system's host name
  domainname - show or set the system's NIS/YP domain name
  ypdomainname - show or set the system's NIS/YP domain name
  nisdomainname - show or set the system's NIS/YP domain name
  dnsdomainname - show the system's DNS domain name
  ss - another utility to investigate sockets
  ping, ping6 - send ICMP ECHO_REQUEST to network hosts
  traceroute6 - traces path to a network host
  traceroute - print the route packets trace to network host
  itrace - similar to traceroute, yet uses ICMP echo
  jnettop - View hosts/ports taking up the most network traffic
  iftop - display bandwidth usage on an interface by host
  iptraf - Interactive Colorful IP LAN Monitor
  ngrep - network grep
  tcpdump - dump traffic on a network
  tshark - Dump and analyze network traffic
  wireshark - Interactively dump and analyze network traffic
  dumpcap - Dump network traffic
  rawshark - Dump and analyze raw pcap data
  editcap - Edit and/or translate the format of capture files
  mergecap - Merges two or more capture files into one
  text2pcap - Generate a capture file from an ASCII hexdump of packets
  ethtool - query or control network driver and hardware settings
  netstat - Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships
  nc — arbitrary TCP and UDP connections and listens
  route - show / manipulate the IP routing table
  ip - show / manipulate routing, devices, policy routing and tunnels
  ifconfig - configure a network interface
  ifup - bring a network interface up
  ifdown - take a network interface down
  ifquery - parse interface configuration
  iwconfig - configure a wireless network interface
  iwlist - Get more detailed wireless information from a wireless interface
  wavemon - a wireless network monitor
  telnet — user interface to the TELNET protocol
  vnStat - a console-based network traffic monitor
  vnStati - png image output support for vnStat
  netperf - a network performance benchmark
  iperf - perform network throughput tests
  nmcli - command‐line tool for controlling NetworkManager

Network firewalls:
  iptables/ip6tables — administration tool for IPv4/IPv6 packet filtering and NAT
  iptables-restore — Restore IP Tables
  ip6tables-restore — Restore IPv6 Tables
  iptables-save — dump iptables rules to stdout
  ip6tables-save — dump iptables rules to stdout
  iptables-apply - a safer way to update iptables remotely
  ufw - program for managing a netfilter firewall
  ufw-framework - using the ufw framework
  shorewall - Administration tool for Shoreline Firewall (Shorewall)

Network bridges:
  bridge - show / manipulate bridge addresses and devices
  brctl - ethernet bridge administration

java debugging:
  jstack - Prints Java thread stack traces for a Java process, core file, or remote debug server. This command is experimental and unsupported.
  jdb - Finds and fixes bugs in Java platform programs.
  jlint - java source code lint (does not have a manual page; packaged in ubuntu as jlint, jlint-doc)

programming tracing:
  mtrace - interpret the malloc trace log
  memusage - profile memory usage of a program
  dmalloc - program used to set the environment for debugging using the dmalloc debugging library.
  efence - Electric Fence Malloc Debugger

power management:
  powertop - a power consumption and power management diagnosis tool.
  turbostat - Report processor frequency and idle statistics
  upower - UPower command line tool

General tools (not intended for just one specific utility):
  dtrace - Dtrace compatible user application static probe generation tool.
  stap-prep - prepare system for systemtap use
  staprun - systemtap runtime
  stap - systemtap script translator/driver
  perf - Performance analysis tools for Linux
  perf-stat - Run a command and gather performance counter statistics
  perf-top - System profiling tool.
  perf-record - Run a command and record its profile into perf.data
  perf-report - Read perf.data (created by perf record) and display the profile
  perf-list - List all symbolic event types
  sysprof, sysprof-cli - System-wide Linux Profiler
  sysdig - the definitive system and process troubleshooting tool
  oprofile - a system-wide profiler
  opreport - produce symbol or binary image summaries
  opannotate - produce source or assembly annotated with profile data
  oparchive - produce archive of oprofile data for offline analysis
  opgprof - produce gprof-format profile data

pipe performance:
  pipemeter - measure speed of data going through a pipe/redirection
  pv - monitor the progress of data through a pipe

watchdogs/fault helpers:
  catchsegv - Catch segmentation faults in programs

mysql:
  mytop - display MySQL server performance info like `top'
  sysbench - A modular, cross-platform and multi-threaded benchmark tool.

programming:
  cdecl — decode C type declarations
  HLint - haskell source code suggestions
  splint - A tool for statically checking C programs

DNS:
  dlint - Internet Domain Name System (DNS) error checking utility

elf tools:
  nm - list symbols from object files
  ldd - print shared library dependencies
  pldd - display dynamic shared objects linked into a process
  ld.so, ld-linux.so* - dynamic linker/loader
  objdump - display information from object files.
  objcopy - copy and translate object files
  ld - The GNU linker
  readelf - Displays information about ELF files.
  size - list section sizes and total size.
  chrpath - change the rpath or runpath in binaries
  patchelf - Modify ELF files

database admin:
  tora - graphical toolkit for database developers and administrators (does not have a manual page; comes from the package 'tora')
  mysql - the MySQL command-line tool
  psql - PostgreSQL interactive terminal
  sqlite3 - A command line interface for SQLite version 3

realtime:
  chrt - manipulate the real-time attributes of a process
  hwlatdetect - program to control the kernel hardware latency detection module
  latencytop - a tool for developers to visualize system latencies
  cyclictest - High resolution test program

load generators:
  hackbench - scheduler benchmark/stress test

multi-user:
  who - show who is logged on
  w - Show who is logged on and what they are doing.
  last, lastb - show listing of last logged in users

listing stuff:
  lsscsi - list SCSI devices (or hosts) and their attributes
  lspci - list all PCI devices
  lsusb - list USB devices
  lsdev - display information about installed hardware
  socklist - display list of open sockets
  procinfo - display system statistics gathered from /proc
  lshw - list hardware
  lscpu - display information about the CPU architecture
  lsblk - list block devices
  pccardctl - PCMCIA card control utility
  lspcmcia - display extended PCMCIA debugging information
  nproc - print the number of processing units available
  x86info — display x86 CPU diagnostics
  dmidecode - DMI table decoder

gui:
  icinga - network/systems status monitoring daemon
  nagios3 - network/systems status monitoring daemon
  ksysguard - KDE System Monitor
  gnome-system-monitor — view and control processes
  xfce4-taskmanager - a task (system process) manager for Xfce.
  procexp - linux process explorer (no package for ubuntu)
  mrtg - Multi Router Traffic Grapher
  Monit - utility for monitoring services on a Unix system
  munin - Munin manpage hub

filesystems:
  fsck - check and repair a Linux filesystem
  mkfs - build a Linux filesystem
  debugfs - ext2/ext3/ext4 file system debugger
  dumpe2fs - dump ext2/ext3/ext4 filesystem information
  dump - ext2/3/4 filesystem backup
  tune2fs - adjust tunable filesystem parameters on ext2/ext3/ext4 filesystems
  e2fsck - check a Linux ext2/ext3/ext4 file system
  mke2fs - create an ext2/ext3/ext4 filesystem
  e2label - Change the label on an ext2/ext3/ext4 filesystem
  ext2 - the second extended file system
  ext3 - the third extended file system
  ext4 - the fourth extended file system
  dd - convert and copy a file
  e4defrag - online defragmenter for ext4 filesystem

total system info:
  inxi - Command line system information script for console and IRC

hardware:
  hardinfo - shows hardware information in a GTK+ window
  acpi - Shows battery status and other ACPI information
  psensor - Temperature monitoring application
  sensors - print sensors information
  xsensors - display hardware sensor information as a graphical read-out.
  memtester - stress test to find memory subsystem faults.
  memtest86+ - thorough real-mode memory tester (bootable tool, need to download and boot from it)

nvidia graphics card tools:
nvidia-smi
nvidia-settings

ATI graphics card tools:
fglrxinfo

services:
  service - run a System V init script
  initctl - init daemon control tool
  telinit - change system runlevel
  runlevel - output previous and current runlevel
  shutdown - bring the system down
  reboot, halt, poweroff - reboot or stop the system
  systemd, init - systemd system and service manager
  update-rc.d - install and remove System-V style init script links
  invoke-rc.d - executes System-V style init script actions
  Boot-Up Manager - Graphical runlevel configuration tool

security:
  chage - change user password expiry information

all round:
  glances - A cross-platform curses-based system monitoring tool