NAME

hwlatdetect - program to control the ftrace kernel hardware latency detector

SYNOPSIS

hwlatdetect [--duration=<time>] [--threshold=<usecs>] [--window=<time interval>] [--width=<time interval>] [--hardlimit=<microsecond value>] [--report=<path>] [--cpu-list=<cpu-list>] [--debug] [--quiet] [--watch]

DESCRIPTION

hwlatdetect is a program that controls the ftrace kernel hardware latency detector (hwlatdetector). hwlatdetector is used to detect large system latencies induced by the behavior of certain underlying hardware or firmware, independent of Linux itself. The code was originally developed to detect SMIs (System Management Interrupts) on x86 systems, but there is nothing x86-specific about it. It was originally written for use by the "RT" patch set, since the Real Time kernel is highly latency sensitive.

SMIs are usually not serviced by the Linux kernel, which typically does not even know that they are occurring. SMIs are instead set up and serviced by BIOS code, usually for "critical" events such as management of thermal sensors and fans. Sometimes, though, SMIs are used for other tasks, and those tasks can spend an inordinate amount of time in the handler (sometimes measured in milliseconds). This is a problem if you are trying to keep event service latencies in the microsecond range.

The ftrace hardware latency detector works by hogging all of the CPUs for configurable amounts of time (by calling stop_machine()), polling the CPU Time Stamp Counter (TSC) for some period, then looking for gaps in the TSC data. Any gap indicates a time when the polling was interrupted. Since the machine is stopped and interrupts are turned off, the only thing that could cause such an interruption is an SMI.
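The gap search described above can be illustrated in user space. The following is a minimal Python sketch, not the kernel's implementation: it takes a sequence of timestamps (in microseconds) that a polling loop might have recorded and reports every gap larger than a threshold. The function name find_gaps and the sample data are hypothetical.

```python
def find_gaps(timestamps_us, threshold_us):
    """Return (index, gap) pairs where two consecutive samples are
    further apart than threshold_us.

    Hypothetical helper for illustration only; the real detector does
    this inside the kernel against the TSC.
    """
    gaps = []
    for i in range(1, len(timestamps_us)):
        delta = timestamps_us[i] - timestamps_us[i - 1]
        if delta > threshold_us:
            gaps.append((i, delta))
    return gaps


# Made-up samples: the poll normally advances by 1 us per iteration,
# but one iteration was held off for 150 us.
samples = [0, 1, 2, 3, 153, 154, 155]
print(find_gaps(samples, threshold_us=10))  # prints [(4, 150)]
```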

The hwlatdetect script manages the mounting and unmounting of debugfs, as well as interacting with the ftrace hwlatdetector. If debugfs is already mounted, hwlatdetect will not unmount it after a run.
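One way to decide whether debugfs is already mounted is to scan a mount table such as /proc/mounts. The sketch below assumes that approach; this page does not state how the script itself performs the check, and the function name debugfs_mountpoint is hypothetical.

```python
def debugfs_mountpoint(mounts_text):
    """Return the mount point of the first debugfs entry found in text
    formatted like /proc/mounts, or None if there is none.

    Hypothetical helper for illustration only.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            return fields[1]
    return None


# Made-up mount table contents.
sample = (
    "proc /proc proc rw,nosuid 0 0\n"
    "debugfs /sys/kernel/debug debugfs rw,relatime 0 0\n"
)
print(debugfs_mountpoint(sample))  # prints /sys/kernel/debug
```

On a Linux system the same function could be fed the contents of /proc/mounts. A tool following the behavior described above would unmount debugfs at exit only if this check returned None before the run.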

OPTIONS

--duration=<time>{s,m,h,d}

Run the detector for the specified duration. The duration is a base 10 integer that is interpreted as seconds by default. An optional suffix may be specified to indicate minutes, hours or days.
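The duration syntax can be sketched as a small parser. This is an illustration rather than the script's own code: the function name duration_to_seconds is hypothetical, and the suffix letters s, m, h and d (seconds, minutes, hours, days) are an assumption based on the description above.

```python
import re

# Assumed suffix letters and their length in seconds.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(text):
    """Convert strings such as '120', '2m' or '1d' to seconds.

    A bare integer is taken as seconds. Hypothetical helper for
    illustration only.
    """
    match = re.fullmatch(r"(\d+)([smhd]?)", text)
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    value, suffix = match.groups()
    return int(value) * _UNIT_SECONDS[suffix or "s"]


print(duration_to_seconds("120"))  # prints 120
print(duration_to_seconds("2m"))   # prints 120
print(duration_to_seconds("1d"))   # prints 86400
```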

--threshold=<microsecond value>

Specify the TSC gap used to detect an SMI. Any gap value greater than <threshold> is considered to be the result of an SMI occurring.

--hardlimit=<microsecond value>

The test is considered to fail if a value above the hardlimit occurs. This affects the exit value of hwlatdetect.
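The relationship between the two limits can be sketched as follows. This is not the script's code: the function name summarize is hypothetical, and the exit values 0 and 1 are assumptions, since this page says only that the hard limit affects the exit value.

```python
def summarize(samples_us, threshold_us, hardlimit_us):
    """Return the samples that exceed the threshold and an exit status.

    Hypothetical helper for illustration only. The status values are
    assumed: 0 if no sample exceeds the hard limit, 1 otherwise.
    """
    over_threshold = [s for s in samples_us if s > threshold_us]
    failed = any(s > hardlimit_us for s in samples_us)
    return over_threshold, (1 if failed else 0)


# Made-up latency samples in microseconds.
print(summarize([5, 12, 40], threshold_us=10, hardlimit_us=30))  # prints ([12, 40], 1)
print(summarize([5, 12], threshold_us=10, hardlimit_us=30))      # prints ([12], 0)
```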

--window=<time value>{us,ms,s,m,d}

Specify the size of the sample window. The value is converted to microseconds when passed to the kernel.

--width=<time value>{us,ms,s,m,d}

The amount of time within the sample window where the detector is actually sampling. Must be less than the --window value.
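The conversion to microseconds and the constraint between --window and --width can be sketched as below. This is an illustration under stated assumptions, not the script's code: the helper names are hypothetical, the suffixes are those shown in the option headings, and a bare number is assumed to mean microseconds.

```python
import re

# Suffixes from the option headings, expressed in microseconds.
_UNIT_US = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60_000_000,
    "d": 86_400_000_000,
}


def to_microseconds(text):
    """Convert strings such as '500ms' or '1s' to microseconds.

    Assumes a bare integer is already in microseconds.
    """
    match = re.fullmatch(r"(\d+)(us|ms|s|m|d)?", text)
    if match is None:
        raise ValueError(f"invalid time value: {text!r}")
    value, suffix = match.groups()
    return int(value) * _UNIT_US[suffix or "us"]


def check_window_width(window, width):
    """Return (window_us, width_us), rejecting a width that is not
    strictly less than the window."""
    window_us = to_microseconds(window)
    width_us = to_microseconds(width)
    if width_us >= window_us:
        raise ValueError("--width must be less than --window")
    return window_us, width_us


print(check_window_width("1s", "500ms"))  # prints (1000000, 500000)
```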

--report=FILENAME

Specify the output filename of the detector report. The default behavior is to print to standard output.

--cpu-list=CPU-LIST

Specify the CPUs that the hwlat thread moves across.
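This page does not define the CPU-LIST syntax. The sketch below assumes the common Linux notation of comma-separated CPU numbers and ranges (for example 0-3,8) and shows how such a list could be expanded and turned into a hexadecimal CPU mask. Both function names are hypothetical, and whether hwlatdetect passes a mask of this form to the kernel is an assumption.

```python
def parse_cpu_list(text):
    """Expand a list such as '0-3,8' into sorted CPU numbers.

    Assumes comma-separated entries that are either a single CPU
    number or an inclusive 'first-last' range.
    """
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpu_mask(cpus):
    """Return a hexadecimal bitmask string with one bit set per CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


cpus = parse_cpu_list("0-3,8")
print(cpus)            # prints [0, 1, 2, 3, 8]
print(cpu_mask(cpus))  # prints 10f
```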

--debug

Turn on debug prints.

--quiet

Turn off all informational prints.

--watch

Print sample data to stdout as it arrives.

AUTHOR

hwlatdetect was written by Clark Williams <williams [AT] redhat.com>
hwlat_detector.ko was written by Jon Masters <jcm [AT] redhat.com>