kernel/watchdog.c: print traces for all cpus on lockup detection
authorAaron Tomlin <atomlin@redhat.com>
Mon, 23 Jun 2014 20:22:05 +0000 (13:22 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Mon, 23 Jun 2014 23:47:44 +0000 (16:47 -0700)
commited235875e2ca983197831337a986f0517074e1a0
treea562c7ac13452a0573b87977df7b26a6471ece38
parentf3aca3d09525f87731ba6b892c9b010570bc54b4
kernel/watchdog.c: print traces for all cpus on lockup detection

A 'softlockup' is defined as a bug that causes the kernel to loop in
kernel mode for more than a predefined period to time, without giving
other tasks a chance to run.

Currently, upon detection of this condition by the per-cpu watchdog
task, debug information (including a stack trace) is sent to the system
log.

On some occasions, we have observed that the "victim" rather than the
actual "culprit" (i.e.  the owner/holder of the contended resource) is
reported to the user.  Often this information has proven to be
insufficient to assist debugging efforts.

To avoid loss of useful debug information, for architectures which
support NMI, this patch makes it possible to improve soft lockup
reporting.  This is accomplished by issuing an NMI to each cpu to obtain
a stack trace.

If NMI is not supported we just revert back to the old method.  A sysctl
and boot-time parameter is available to toggle this feature.

[dzickus@redhat.com: add CONFIG_SMP in certain areas]
[akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
[mq@suse.cz: fix warning]
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation/kernel-parameters.txt
Documentation/sysctl/kernel.txt
include/linux/nmi.h
kernel/sysctl.c
kernel/watchdog.c