NAME

nbdinfo − ENBD control and information file in /proc

SYNOPSIS

cat /proc/nbdinfo
echo command > /proc/nbdinfo

DESCRIPTION

The nbdinfo file is an interface in /proc to many of the ENBD module's mode controls, and to its accounting information and statistics.

ACCOUNTING INFORMATION AND STATISTICS

The output from /proc/nbdinfo is divided into sections, one per active ENBD device. The information pertaining to device ndb (or /dev/nd/b, if using the devfs scheme), for example, is headed by a single line saying

Device b:        Open

and information for that section is prefixed by a "[b]" at the beginning of each line of output.

Devices that are not active and have never been active will have a single abbreviated line of output corresponding to them, saying, for example

Device a:        Closed

Consecutive closed devices will be indicated with a range designator, for example:

Device c-p:      Closed
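The header layout described above lends itself to simple scripting. The following is a minimal sketch, not part of the ENBD distribution: it assumes each header occupies one line of the form "Device X:" followed by whitespace and "Open" or "Closed", and the function name enbd_open_devices is invented for illustration.

```shell
# Hypothetical helper: print the letters of devices whose header says "Open".
# Assumes one header per line, e.g. "Device b:        Open".
# Usage: enbd_open_devices [FILE]   (FILE defaults to /proc/nbdinfo)
enbd_open_devices() {
    awk '$1 == "Device" && $3 == "Open" {
             sub(/:$/, "", $2)    # strip the trailing colon from "b:"
             print $2
         }' "${1:-/proc/nbdinfo}"
}
```

Closed devices, including ranges such as "c-p", are skipped because their third field is not "Open".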

Each section commences with a state indicator, showing the flags set on the device and other important variables.

[b] State:       flags

The flags include

uninitialized - the device structures have not yet been set up by the driver (this should indicate a memory error);

verify - the device has the right magic (i.e. it is not obviously corrupt);

signed - the client daemons have registered a server's signature;

rw/ro - the device is in readwrite (or readonly) mode;

merge requests - the device is aggregating incoming kernel requests to some limit specified by the merge_requests command (see section COMMANDS);

buffer writes - the device will buffer writes internally instead of passing them on to the remote resource. This mode is used to provide diskless node root file systems, which should be largely read only, with a few local modifications;

enabled - the device is in principle accepting kernel requests and passing them on to the remote server, which is in good health. When contact is lost, the device will be disabled;

validated - the partition table on the device (if any) has been scanned;

remote invalid - the remote resource has disappeared although we are still connected to the remote server and the latter is responding. This usually indicates that the remote resource is a removable medium such as a cdrom or floppy, and it is being changed;

show_errs - the device will error out requests when there is a problem, instead of blocking them. This changes the behaviour when the device is disabled or the remote resource is unavailable;

direct - the device is using direct i/o. This is an experimental option;

plug - always shown (in 2.2 kernels and earlier this was a kernel mode);

sync - the device is in synchronous mode;

md5sum - the device is currently in the mode where it uses md5summing techniques to accelerate writes;

acct - accounting is being performed for the device, as specified by the acct command (see section COMMANDS);

last error - in case the device has errored, an indicator of the last error;

lives - a count of the number of times in its lifetime that the device has totally disconnected from the remote and been reconnected. A high count may indicate network or other problems;

bp - always zero (in future kernels, a count of memory pages available).

After the State line comes a line showing information on the kernel queues. For example:

[b] Queued:      +0R/0W curr (check 0R/0W) +1R/0W max

The statistics count blocks, separately for reads (R) and writes (W).

The device uses a userspace buffer to communicate with the daemons. Its size is shown next. It is followed by lines showing the size of the device in bytes and blocks:

[b] Buffersize:  262144  (sectors=512, blocks=256)
[b] Blocksize:   1024    (log=10)
[b] Size:        4096KB
[b] Blocks:      4096
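The figures in this sample are related by simple arithmetic, which the sketch below works through as a consistency check. The 512-byte sector size is an assumption inferred from the sample (a 262144-byte buffer reported as 512 sectors); the variable names are illustrative only.

```shell
# Figures taken from the sample output above.
buffersize=262144   # bytes
blocksize=1024      # bytes
size_kb=4096        # device size in KB

echo "sectors:       $((buffersize / 512))"              # 512 (512-byte sectors assumed)
echo "buffer blocks: $((buffersize / blocksize))"        # 256
echo "blocksize:     $((1 << 10))"                       # 1024, from log=10
echo "device blocks: $((size_kb * 1024 / blocksize))"    # 4096
```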

The next lines of output pertain to the individual client daemons, and the output is columnized.

When the device is in RAID mode the daemons will be organized into distinct groups, and the Groups line shows their group allegiance. The Sockets line that follows shows the state of the individual connections. A "+" indicates a good client connection, and a "*" shows the last client to have been active. If the connection is bad, a "-" and a "." will be shown instead, respectively.

[b] Groups:      2  (0)  (0)  (1)  (1)
[b] Sockets:     4  (+)  (+)  (+)  (*)
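A monitoring script might use the Sockets markers to count bad connections. The following is a rough sketch under the assumption that the Sockets entry is a single line of whitespace-separated fields, with the count in the third field and one parenthesized marker per daemon after it. The function name enbd_bad_sockets is hypothetical.

```shell
# Hypothetical helper: count bad client connections for one device.
# Usage: enbd_bad_sockets LETTER [FILE]   (FILE defaults to /proc/nbdinfo)
enbd_bad_sockets() {
    awk -v tag="[$1]" '
        $1 == tag && $2 == "Sockets:" {
            bad = 0
            # Fields 4 onward are per-daemon markers; "(-)" and "(.)" are bad.
            for (i = 4; i <= NF; i++)
                if ($i == "(-)" || $i == "(.)") bad++
            print bad
        }' "${2:-/proc/nbdinfo}"
}
```

For the sample line shown above, where every marker is "+" or "*", the function would print 0.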

The following lines are concerned with transfer statistics. The accounting is in blocks, and the total is at left, with subtotals for each daemon in the corresponding columns. The Requested line shows the number of kernel requests entering the device. At far right the total is broken down into read and write components, and the maximum seen in a single request is recorded. The Despatched line shows the number of requests satisfied. The number of write requests subjected to md5summing acceleration is also shown: "eq" means that the md5sum technique determined that the source and target of the write were equal in content, and the write was skipped; "ne" means the write was not skipped, as the contents were not equal; and "dn" means that the remote server denied the md5sum request.

[b] Requested:   4  (0)  (0)  (4)  (0)  4R/0W  max 4
[b] Despatched:  4  (0)  (0)  (4)  (0)  4R/0W  md5 0W (0 eq, 0 ne, 0 dn)
[b] Errored:     0  (0)  (0)  (0)  (0)  0+0
[b] Pending:     0  (0)  (0)  (0)  (0)  0R/0W+0R/0W

The Errored and Pending lines show requests errored out by the device and requests queued internally, respectively. The "+" totals at the right distinguish between requests on the kernel queue and requests on the driver's internal queues. The kernel queue statistics are after the "+".
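Since each counter line carries a total followed by per-daemon subtotals, a script can cross-check one against the other. The sketch below assumes each counter is a single line of whitespace-separated fields, with the total in the third field and the subtotals as fully parenthesized numbers. That the subtotals add up to the total is inferred from the sample, not documented. The function name enbd_counter_check is hypothetical.

```shell
# Hypothetical helper: print "TOTAL SUM" for one counter line of one device,
# where SUM adds up the parenthesized per-daemon subtotals.
# Usage: enbd_counter_check LETTER LABEL [FILE]
enbd_counter_check() {
    awk -v tag="[$1]" -v label="$2:" '
        $1 == tag && $2 == label {
            sum = 0
            for (i = 4; i <= NF; i++)
                if ($i ~ /^\([0-9]+\)$/) {   # e.g. "(4)", but not "(0" from "(0 eq,"
                    n = $i
                    gsub(/[()]/, "", n)
                    sum += n
                }
            print $3, sum
        }' "${3:-/proc/nbdinfo}"
}
```

Called as `enbd_counter_check b Requested` against the sample above, this would print "4 4".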

There follow lines showing the current device speeds, in bytes per second:

[b] B/s now:     0  (0R+0W)
[b] B/s ave:     0  (0R+0W)
[b] B/s max:     0  (0R+0W)

The next line breaks the requests total down per size of request. The size (in blocks) is given after the "%" and the percentage is before the "%".

[b] Spectrum:    100%4
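The "percentage%size" notation can be split mechanically. The sketch below assumes the Spectrum entry is a single line and that, when several request sizes have been seen, the entries are separated by whitespace; the sample shows only one entry, so the multi-entry layout is a guess. The function name enbd_spectrum is hypothetical.

```shell
# Hypothetical helper: print one "SIZE blocks: PERCENT%" line per Spectrum entry.
# Usage: enbd_spectrum LETTER [FILE]   (FILE defaults to /proc/nbdinfo)
enbd_spectrum() {
    awk -v tag="[$1]" '
        $1 == tag && $2 == "Spectrum:" {
            for (i = 3; i <= NF; i++) {
                split($i, part, "%")     # part[1] = percentage, part[2] = size
                printf "%s blocks: %s%%\n", part[2], part[1]
            }
        }' "${2:-/proc/nbdinfo}"
}
```

For the sample entry "100%4" this would print "4 blocks: 100%", meaning that all requests so far were 4 blocks long.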

There then follows internal state information about the number of threads of execution currently running through the device:

[b] Kthreads:    0  (0 waiting/0 running/1 max)

The same kind of information is shown for the user space client threads, but in more detail. A "+" indicates that the thread is currently blocked in kernel, presumably waiting on an event. A "-" indicates that the thread is out of kernel, which may indicate a client daemon death. The succeeding line shows the process IDs of the corresponding user space daemons.

[b] Cthreads:    4  (+)  (+)  (+)  (+)
[b] Cpids:       4  (1189)  (1190)  (1191)  (1192)
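Because the Cthreads and Cpids columns correspond, a script can pair them up to find the daemons whose threads are out of kernel. The following is a sketch only: it assumes both entries are single lines of whitespace-separated fields, with Cthreads appearing before Cpids, and the function name enbd_out_of_kernel_pids is invented here.

```shell
# Hypothetical helper: print the PIDs of client daemons whose thread marker is "-".
# Usage: enbd_out_of_kernel_pids LETTER [FILE]   (FILE defaults to /proc/nbdinfo)
enbd_out_of_kernel_pids() {
    awk -v tag="[$1]" '
        # Remember the marker in each column of the Cthreads line.
        $1 == tag && $2 == "Cthreads:" { for (i = 4; i <= NF; i++) mark[i] = $i }
        # On the Cpids line, print the PID in every column marked "(-)".
        $1 == tag && $2 == "Cpids:" {
            for (i = 4; i <= NF; i++)
                if (mark[i] == "(-)") {
                    pid = $i
                    gsub(/[()]/, "", pid)
                    print pid
                }
        }' "${2:-/proc/nbdinfo}"
}
```

A caller could then test each reported PID with `kill -0 "$pid"` to see whether the process still exists, since a "-" marker by itself only suggests a daemon death.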

COMMANDS

The /proc/nbdinfo interface accepts certain instructions written to it. For example:

echo enable[b]=0 > /proc/nbdinfo

will disable device ndb.

The general format of a command is one of

command[letter] = value
command = value

In the latter case, the command applies to all (initialized, open) devices. Normally, the "letter" designates the target device. Numbers ("0", "1", ...) may be used instead of letters. Spaces around the equals sign are discarded, as are leading and trailing spaces.
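The two command forms can be wrapped in a small shell function. This is a minimal sketch rather than a tool shipped with ENBD: the function name enbd_cmd and its argument order are invented, and the optional FILE argument exists only so the function can be tried against an ordinary file.

```shell
# Hypothetical helper: write one command to the control file.
# Usage: enbd_cmd COMMAND VALUE [LETTER] [FILE]
#   With LETTER, sends "COMMAND[LETTER]=VALUE" (one device).
#   Without it, sends "COMMAND=VALUE" (all initialized, open devices).
enbd_cmd() {
    cmd=$1
    value=$2
    letter=${3:-}
    file=${4:-/proc/nbdinfo}
    if [ -n "$letter" ]; then
        echo "${cmd}[${letter}]=${value}" > "$file"
    else
        echo "${cmd}=${value}" > "$file"
    fi
}
```

For example, `enbd_cmd enable 0 b` writes the same string as the echo command shown earlier, and `enbd_cmd acct 1` would turn accounting on for every open device.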

In addition, the instructions "0" and "1" are emergency escapes which tell all devices to shut down. The "1" form also zeros the module reference counter, so the module may be removed from the kernel (expect a minor oops if there are other kernel components still referencing it, for example if the device is mounted in the file system).

The list of commands recognized is as follows (the list may change in future; check the write_proc code in the driver if in doubt):

merge_requests - the maximum extra number of blocks to be aggregated per request, over the natural blocksize.

debug - if compiled in, turn on (1) or turn off (0) debugging on the device.

sync - (or "sync_intvl") the interval between forced device syncs. 0 indicates never.

show_errs - turn on (1) or turn off (0) the behaviour that errors out failed requests instead of retrying them later. This makes the difference between erroring and blocking behaviour on i/o to a failed device.

plug - no longer used.

md5sum - put the device in (1) md5summing mode, or take it out (0). In any case the thresholds shown in /proc/sys/dev/endb/ variables still apply and may take the device out of md5summing mode after the threshold number of failures, or put it into md5summing mode after the threshold number of ordinary writes.

rahead - changes the number of blocks of read ahead performed on the device.

acct - turns on (1) or turns off (0) accounting on the device.

enable - turns on (1) or turns off (0) the device.

direct - puts the device in (1) or out of (0) direct i/o mode. Experimental.

zero - zeros (1) the accounting counters on the device.

setfaulty - marks the group given as an argument as faulty, when in a RAID1 configuration (i.e. the Groups line shows more than one group). The group will not be written to, but when it comes back online, all missed writes will be caught up with.

hotremove - marks the RAID1 group given as argument as absent, as though a disk were being changed. When the group comes back online, a complete resync will be performed.

hotadd - marks the RAID1 group given as argument as present, allowing resyncs to take place.

ERRORS

On an illegal command or argument value (out of bounds, malformed, etc.), the write to /proc/nbdinfo returns -EINVAL.
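From a shell, a rejected write normally surfaces as a non-zero exit status from echo (typically with an "Invalid argument" message on standard error), so a script can test for it. The wrapper below is a sketch under that assumption; the name enbd_try is hypothetical, and the optional FILE argument is there only for experimentation.

```shell
# Hypothetical wrapper: send a raw command string and report whether it was accepted.
# Usage: enbd_try 'enable[b]=0' [FILE]   (FILE defaults to /proc/nbdinfo)
enbd_try() {
    if echo "$1" > "${2:-/proc/nbdinfo}"; then
        echo "accepted: $1"
    else
        # The write failed, e.g. with EINVAL, or the file could not be opened.
        echo "rejected: $1" >&2
        return 1
    fi
}
```

Note that the exit status alone does not distinguish -EINVAL from other failures, such as a missing /proc/nbdinfo or insufficient permissions.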

AUTHOR

Peter Breuer wrote the nbdinfo interface.

SEE ALSO

enbd-client(8), enbd.conf(5).