NAME
lamexec - Run non-MPI programs on LAM nodes.
SYNOPSIS
lamexec [-fhvD] [-c # | -np #] [-nw | -w] [-pty] [-s node] [-x VAR1[=VALUE1][,VAR2[=VALUE2],...]] [where] program [-- args]
-c #
Synonym for -np (see below).
-D
Use the executable program location as the current working directory for created processes. The current working directory of the created processes will be set before the user’s program is invoked.
-f
Do not configure standard I/O file descriptors - use defaults.
-h
Print useful information on this command.
-np #
Run this many copies of the program on the given nodes. This option indicates that the specified file is an executable program and not an application schema. If no nodes are specified, all LAM nodes are considered for scheduling; LAM will schedule the programs in a round-robin fashion, "wrapping around" (and scheduling multiple copies on a single node) if necessary.
-nw
Do not wait for all processes to complete before exiting lamexec. This option is mutually exclusive with -w.
-pty
Enable pseudo-tty support. Among other things, this enables line-buffered output (which is probably what you want). This feature is not enabled by default only because it is new and has not yet been extensively tested.
-s node
Load the program from this node. This option is not valid on the command line if an application schema is specified.
-v
Be verbose; report on important steps as they are done.
-w
Wait for all applications to exit before lamexec exits.
-x
Export the specified environment variables to the remote nodes before executing the program. Existing environment variables can be specified (see the Examples section, below), or new variable names specified with corresponding values. The parser for the -x option is not very sophisticated; it does not even understand quoted values. Users are advised to set variables in the environment, and then use -x to export (not define) them.
where
A set of node and/or CPU identifiers indicating where to start the program.
-- args
Pass these runtime arguments to every new process. This must always be the last argument to lamexec. This option is not valid on the command line if an application schema is specified.
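The round-robin placement described under -np can be pictured with a small shell sketch. This is only an illustrative model, not LAM's scheduler: the function name round_robin and the node names n0, n1, n2 are hypothetical, and the real scheduling policy is internal to LAM.

```shell
# Illustrative model only; this is NOT LAM's actual scheduler.
# round_robin NP NODE... prints "copy node" for each of NP copies,
# wrapping around to the first node when the node list is exhausted.
round_robin() {
    np=$1
    shift
    nnodes=$#
    [ "$nnodes" -gt 0 ] || return 1
    i=0
    while [ "$i" -lt "$np" ]; do
        # Positional parameters are 1-based, so add 1 to (i mod nnodes).
        idx=$(( i % nnodes + 1 ))
        eval "node=\${$idx}"
        printf '%s %s\n' "$i" "$node"
        i=$(( i + 1 ))
    done
}

# Five copies over three hypothetical nodes: n0 and n1 each receive two copies.
round_robin 5 n0 n1 n2
```

With more copies than nodes, some nodes receive more than one copy, which is the "wrapping around" behavior mentioned in the -np description.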
DESCRIPTION
lamexec is essentially a clone of mpirun(1), but is intended for non-MPI programs.
One invocation of lamexec starts a non-MPI application running under LAM. To start the same program on all LAM nodes, the application can be specified on the lamexec command line. To start multiple applications on the LAM nodes, an application schema is required in a separate file. See appschema(5) for a description of the application schema syntax, but it essentially contains multiple lamexec command lines, less the command name itself. The ability to specify different options for different instantiations of a program is another reason to use an application schema.
Location Nomenclature
The location nomenclature that is used for the where clause mentioned in the SYNOPSIS section, above, is identical to mpirun(1)’s nomenclature. See the mpirun(1) man page for a lengthy discussion of the location nomenclature.
Note that the by-CPU syntax, while valid for lamexec, is not quite as meaningful because process rank ordering in MPI_COMM_WORLD is irrelevant. As such, the by-node nomenclature is typically the preferred syntax for lamexec.
Application Schema or Executable Program?
To distinguish the two different forms, lamexec looks
on the command line for nodes or the -c option. If
neither is specified, then the file named on the command
line is assumed to be an application schema. If either one
or both are specified, then the file is assumed to be an
executable program. If nodes and -c both are
specified, then copies of the program are started on the
specified nodes according to an internal LAM scheduling
policy. Specifying just one node effectively forces LAM to
run all copies of the program in one place. If -c is
given, but not nodes, then all LAM nodes are used. If nodes
is given, but not -c, then one copy of the program is
run on each node.
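The decision rules above can be summarized as a small shell sketch. It is a hypothetical model of the documented behavior, not lamexec's actual argument parser; the function name classify and its yes/no arguments are invented for illustration.

```shell
# Hypothetical model of the rules described above; NOT lamexec's parser.
# classify HAVE_NODES HAVE_C   (each argument is "yes" or "no")
classify() {
    have_nodes=$1
    have_c=$2
    if [ "$have_nodes" = no ] && [ "$have_c" = no ]; then
        echo "application schema"
    elif [ "$have_nodes" = yes ] && [ "$have_c" = yes ]; then
        echo "executable: -c copies scheduled across the given nodes"
    elif [ "$have_c" = yes ]; then
        echo "executable: -c copies scheduled across all LAM nodes"
    else
        echo "executable: one copy on each given node"
    fi
}

classify no no     # e.g. "lamexec myapp"
classify yes no    # e.g. "lamexec N prog1"
classify no yes    # e.g. "lamexec -c 8 prog1"
```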
Program Transfer
By default, LAM searches for executable programs on the
target node where a particular instantiation will run. If
the file system is not shared, the target nodes are
homogeneous, and the program is frequently recompiled, it
can be convenient to have LAM transfer the program from a
source node (usually the local node) to each target node.
The -s option specifies this behavior and identifies
the single source node.
Locating Files
LAM looks for an executable program by searching the
directories in the user’s PATH environment variable as
defined on the source node(s). This behavior is consistent
with logging into the source node and executing the program
from the shell. On remote nodes, the "." path is
the home directory.
LAM looks for an application schema in three directories: the local directory, the value of the LAMAPPLDIR environment variable, and laminstalldir/boot, where "laminstalldir" is the directory where LAM/MPI was installed.
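The schema lookup can be approximated in shell as follows. This is a sketch under stated assumptions: the search order is assumed to match the order in which the directories are listed above, and LAM_INSTALL_DIR is a hypothetical variable standing in for "laminstalldir" (LAM itself does not necessarily read such a variable).

```shell
# Sketch only. Assumptions: directories are searched in the order listed
# above, and LAM_INSTALL_DIR is a hypothetical stand-in for "laminstalldir".
find_schema() {
    name=$1
    # Build the candidate directory list, skipping unset variables.
    set -- .
    if [ -n "${LAMAPPLDIR:-}" ]; then
        set -- "$@" "$LAMAPPLDIR"
    fi
    if [ -n "${LAM_INSTALL_DIR:-}" ]; then
        set -- "$@" "$LAM_INSTALL_DIR/boot"
    fi
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Report where a schema named "myapp" would be found, if anywhere.
find_schema myapp || echo "myapp: schema not found" >&2
```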
Standard I/O
LAM directs UNIX standard input to /dev/null on all remote
nodes. On the local node that invoked lamexec,
standard input is inherited from lamexec. The default
is what used to be the -w option to prevent conflicting
access to the terminal.
LAM directs UNIX standard output and error to the LAM daemon on all remote nodes. LAM ships all captured output/error to the node that invoked lamexec and prints it on the standard output/error of lamexec. Local processes inherit the standard output/error of lamexec and transfer to it directly.
Thus it is possible to redirect standard I/O for LAM applications by using the typical shell redirection procedure on lamexec.
% lamexec N my_app < my_input > my_output
The -f option avoids all the setup required to support standard I/O described above. Remote processes are completely directed to /dev/null and local processes inherit file descriptors from lamboot(1).
Pseudo-tty support
The -pty option enables pseudo-tty support for process output. This allows, among other things, for line-buffered output from remote nodes (which is probably what you want).
This option is not currently the default for lamexec because it has not been thoroughly tested on a variety of different Unixes. Users are encouraged to use -pty and report any problems back to the LAM Team.
Current Working Directory
The current working directory for new processes created on
the local node is inherited from lamexec. The current
working directory for new processes created on remote nodes
is the remote user’s home directory. This default
behavior is overridden by the -D option.
The -D option changes the current working directory of new processes to the directory where the executable resides before the user’s program is invoked.
An alternative to the -D option is the -wd option. -wd allows the user to specify an arbitrary current working directory (vs. the location of the executable). Note that the -wd option can be used in application schema files (see appschema(5)) as well.
Process Environment
Processes in the application inherit their environment from
the LAM daemon upon the node on which they are running. The
environment of a LAM daemon is fixed upon booting of the LAM
with lamboot(1) and is inherited from the user’s
shell. On the origin node this will be the shell from which
lamboot(1) was invoked and on remote nodes this will be the
shell started by rsh(1). When running dynamically linked
applications which require the LD_LIBRARY_PATH environment
variable to be set, care must be taken to ensure that it is
correctly set when booting the LAM.
Exported Environment Variables
The -x option to lamexec can be used to export
specific environment variables to the new processes. While
the syntax of the -x option allows the definition of
new variables, note that the parser for this option is
currently not very sophisticated - it does not even
understand quoted values. Users are advised to set variables
in the environment and use -x to export them; not to
define them.
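Following that advice, one possible workflow is to set and export variables in the shell first and then pass only their names to -x. The sketch below builds the comma-separated name list shown in the SYNOPSIS; build_export_list and MY_SETTING are hypothetical names, and the final command is only printed, not executed.

```shell
# Sketch: build a comma-separated list of variable names for -x,
# keeping only names that are already set in the current shell.
# build_export_list and MY_SETTING are hypothetical names.
# Note: names are passed through eval, so use trusted identifiers only.
build_export_list() {
    list=
    for name in "$@"; do
        if eval "[ -n \"\${$name+set}\" ]"; then
            list=${list:+$list,}$name
        else
            echo "skipping unset variable: $name" >&2
        fi
    done
    printf '%s\n' "$list"
}

# Define values in the environment, where quoting is handled by the shell.
DISPLAY=${DISPLAY:-:0}
MY_SETTING="value with spaces"
export DISPLAY MY_SETTING

# Print the command that would export (not define) the variables.
echo "lamexec -x $(build_export_list DISPLAY MY_SETTING) N my_app"
```

Because the values never appear on the lamexec command line, the limited -x parser never has to interpret quotes or spaces.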
EXAMPLES
lamexec N prog1
Load and execute prog1 on all nodes. Search for the executable file on each node.
lamexec -c 8 prog1
Run 8 copies of prog1 wherever LAM wants to run them.
lamexec n8-10 -v -nw -s n3 prog1 -- -q
Load and execute prog1 on nodes 8, 9, and 10. Search for prog1 on node 3 and transfer it to the three target nodes. Report as each process is created. Give "-q" as a command line to each new process. Do not wait for the processes to complete before exiting lamexec.
lamexec -v myapp
Parse the application schema, myapp, and start all processes specified in it. Report as each process is created.
lamexec N N -pty -wd /workstuff/output -x DISPLAY run_app.csh
Run the application "run_app.csh" (presumably a C shell script) twice on each node in the system (ideal for 2-way SMPs). Also enable pseudo-tty support, change directory to /workstuff/output, and export the DISPLAY variable to the new processes (perhaps the shell script will invoke an X application such as xv to display output).
lamexec -np 5 -D `pwd`/my_application
A common usage of lamexec in environments where a filesystem is shared between all nodes in the multicomputer. Using the shell-escaped "pwd" command specifies the full name of the executable to run. This avoids the need to put the directory in the path; the remote nodes will have an absolute filename to execute (and will change to its directory upon invocation).
DIAGNOSTICS
lamexec: Exec format error
A non-ASCII character was detected in the application schema. This is usually a command line usage error where lamexec is expecting an application schema and an executable file was given.
lamexec: syntax error in application schema, line XXX
The application schema cannot be parsed because of a usage or syntax error on the given line in the file.
filename: No such file or directory
This error can occur in two cases. Either the named file cannot be located or it has been found but the user does not have sufficient permissions to execute the program or read the application schema.
RETURN VALUE
lamexec returns 0 if all processes started by lamexec exit normally. A non-zero value is returned if an internal error occurred in lamexec, or one or more processes exited abnormally. If an internal error occurred in lamexec, the corresponding error code is returned. In the event that one or more processes exit with non-zero exit code, the return value of the process that lamexec first notices died abnormally will be returned. Note that, in general, this will be the first process that died but is not guaranteed to be so.
However, note that if the -nw switch is used, the return value from lamexec does not indicate the exit status of the processes started by it.
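A calling script can act on this return value in the usual shell way. The sketch below wraps a command and reports a non-zero status; run_checked is a hypothetical helper, and "sh -c 'exit 3'" is a stand-in so the example runs without LAM installed. In real use the wrapped command would be a lamexec invocation without -nw.

```shell
# Sketch: run a command and report a non-zero exit status.
# run_checked is a hypothetical helper name.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with status $status: $*" >&2
    fi
    return "$status"
}

# Stand-in for something like "run_checked lamexec N prog1".
run_checked sh -c 'exit 3' || echo "exit status was $?"
```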