Getting Started
Beowulf Accounts
System Use
Allowed Use of Nodes
Environment, Filesystems, and Home Directories
Compiling and Running a Simple MPI Program
Submitting an MPI Program via PBS
Submitting a Sequential Job
Beowulf Accounts
All math users have an account on Beowulf, but you must request a home directory before you can do any computing on the cluster. To request one, send email to trouble@math.cudenver.edu.
If you do not have a math account, write to trouble@math.cudenver.edu to request one.
The account and password policies of CCM apply.
System Use
Connect to the cluster with ssh beowulf.cudenver.edu. Using ssh protects
your session from network sniffing and sets your X Window authorization
correctly. You have the same password on math and Beowulf.
You will be logged in to the master node. Use ssh or rsh to go to other
nodes.
If you run the ssh_pass command in your home
directory on math, you can log in to the cluster without being prompted
for a password.
Use of the cluster relies on cooperation of users so that we do not
interfere with each other. Please follow the guidelines for use
of nodes and running programs below.
The cluster is experimental and its configuration is still in progress, so
it may go down or become inaccessible with little or no notice. Availability
cannot be guaranteed for meeting deadlines of any kind.
Allowed Use of Nodes
1. master: compilation, editing. No computations.
2. node1 and node2: manually started jobs, including debugging.
3. node3 -- node35: jobs submitted to PBS only. No manually started jobs.
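The node rules above can be captured in a small bash helper, for example to guard a launch script against starting work on the wrong node. This is only a sketch: the function name node_role and the role labels are made up for illustration, and it assumes that the host name reported by uname -n starts with one of the node names used in this guide.

```shell
#!/bin/bash
# Hypothetical helper: map a node name to its allowed use, following the list above.
node_role() {
    case "$1" in
        master)                             echo "edit-compile" ;;  # no computations
        node1|node2)                        echo "manual" ;;        # manually started jobs, debugging
        node[3-9]|node[12][0-9]|node3[0-5]) echo "pbs-only" ;;      # node3 through node35
        *)                                  echo "unknown" ;;
    esac
}

# Example: report what the current host may be used for.
# Assumption: stripping the domain from `uname -n` yields master, node1, ..., node35.
host=$(uname -n)
host=${host%%.*}
echo "$host: $(node_role "$host")"
```

A launch script could, for instance, refuse to continue unless node_role reports "manual" for the current host.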
Environment, Filesystems, and Home Directories
1. These instructions are written for the 'bash' shell.
2. User home directories for the Beowulf cluster are located in /data/home, which is shared by all the nodes (/home is a soft link to /data/home). The directory /data/home is now on a redundant disk array, but nothing is backed up regularly, so users should still keep copies of important files elsewhere. The whole /data directory is shared by all nodes.
3. The filesystems on Beowulf are separate from those on the math server, so your Beowulf home directory is not your math home directory. The math home directories are not mounted on the Beowulf master node; to move files between a Beowulf directory and your math home directory, use sftp, scp, rsync, or a similar tool. For example:
rsync -av -e ssh </data/home/input> math:<destination directory>
Compiling and Running a Simple MPI Program
The best way to learn MPI is to refer to Using MPI, 2nd Edition: Portable
Parallel Programming with the Message Passing Interface, by William Gropp,
Ewing Lusk, and Anthony Skjellum.
In a message-passing parallel program, each process has its own memory and
runs its own code. The standard libraries of a sequential language like C have
no facility for passing messages between the many CPUs that work together
during the execution of a parallel program, so an additional interface is
needed to send and receive messages between computers. This interface is the
Message Passing Interface (MPI). An MPI implementation such as mpich2 provides
it as a library of functions that is linked into the executable when the
program is built. The following example gives step-by-step instructions for
compiling and executing a simple C program that uses the MPI library.
Example 1 (this example uses the "mpich2" MPI implementation, installed in /data/mpich2)
- Set up your path. This assumes bash as your shell. The single quotes keep $PATH from being expanded until login.
> echo "" >> ~/.bashrc
> echo 'PATH=/data/mpich2/bin:/usr/pbs/bin:$PATH' >> ~/.bashrc
> echo export PATH >> ~/.bashrc
Then log out and log in again.
- Copy the example file to your Beowulf home directory.
> cp /data/mpich/mpich-1.2.6/examples/basic/cpi.c .
This example comes from mpich-1.2.6 (another MPI implementation), but we will compile and run it with mpich2.
- Compile the example source code and link it with the MPI libraries.
> mpicc -o cpi cpi.c
- Run the executable manually on the cluster. The command below starts 4 processes, 2 on node1 and 2 on node2. Please do not run any extensive computations on the master node.
> mpiexec -n 2 -host node1 ./cpi : -n 2 -host node2 ./cpi
See the file /data/mpich2/README for more details on mpich2.
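The path setup in the first step of Example 1 appends to the startup file every time it is run, so repeating it leaves duplicate lines behind. A minimal sketch of a rerunnable variant follows. The helper name add_line_once is hypothetical, and the demonstration writes to a scratch file rather than to your real startup file.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a startup file only if it is not there yet.
add_line_once() {
    # $1 = file, $2 = exact line to add
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '\n%s\n' "$2" >> "$1"
}

# Single quotes keep $PATH unexpanded here, so it is evaluated at each login.
path_line='export PATH=/data/mpich2/bin:/usr/pbs/bin:$PATH'

# Demonstration on a scratch file; on the cluster you would pass ~/.bashrc instead.
rc_file=$(mktemp)
add_line_once "$rc_file" "$path_line"
add_line_once "$rc_file" "$path_line"   # second call is a no-op

# Count how many times the line is present.
count=$(grep -cxF -- "$path_line" "$rc_file")
echo "$count"
rm -f "$rc_file"
```

After logging in again, `command -v mpicc` should print a path under /data/mpich2/bin if the setup took effect.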
Submitting an MPI Program via PBS
PBS holds a job until the required number of nodes is available and
determines which of the available nodes will handle it.
It is important to use PBS because it prevents concurrent jobs from interfering
with each other's performance results and from bogging down any one node of
the cluster. The previous example ran the executable cpi manually on node1 and
node2; to run on the other nodes of the cluster, you must use PBS. The following
example demonstrates how to submit "cpi" to the cluster via PBS.
Example 2
- Create the following "batch script" with your favourite editor in your home directory on Beowulf (name it "cpi.pbs"):
#!/bin/sh
#PBS -l nodes=32:ppn=2
NEW_NAME=~/`basename $PBS_NODEFILE`
sort -n --key=1.5 $PBS_NODEFILE > $NEW_NAME
mpdrun -np 64 -hf $NEW_NAME -1 ~/cpi
rm $NEW_NAME
- The first line of the script specifies the interpreter; in most cases you won't need to change it. Use 'bash' syntax in the rest of the script.
- The second line tells PBS that your task needs 32 nodes to execute and that 2 processes will be started on each node. This is the line you'll have to change according to how many nodes your application actually needs. For instance, if you need 10 nodes, use "#PBS -l nodes=10:ppn=2" as your second line.
- The next two lines sort the list of nodes on which your program will run (PBS supplies this list in the file named by the "PBS_NODEFILE" environment variable). Sorting groups together the occurrences of the same node in the node list (each node is listed twice, since 2 processes will run per node), so that neighbouring processes are placed on the same node wherever possible. You don't need to change these lines if you don't care about such details.
- The fifth line actually starts the MPI job. "-np 64" is the total number of processes in your job. It is reasonable to set this number to 2*N or 2*N-1, where N is the number of nodes that you requested on line 2. "-hf $NEW_NAME" tells mpich2 to use the presorted node list (see lines 3 and 4), and "-1" tells mpich2 not to run any processes from this job on the master node. "~/cpi" is the path to the executable; note that it must be the full path. For example, if "cpi" resided in the "examples" subdirectory of your home directory, you would have to use "~/examples/cpi" instead (even if your "cpi.pbs" resided in "examples" too).
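To see what the sort in cpi.pbs does, it can be tried on a small stand-in for the node file. The sketch below is an illustration only: the sample node names are made up, and the temporary file replaces the file that PBS names in $PBS_NODEFILE. It also shows one way to derive the process count from the node file instead of typing it by hand, assuming the file holds one line per process as described above.

```shell
#!/bin/bash
# Stand-in for the file that PBS names in $PBS_NODEFILE (sample contents are made up).
nodefile=$(mktemp)
printf '%s\n' node12 node3 node12 node3 > "$nodefile"

# Same sort as in cpi.pbs: numeric order on the digits that start at character 5,
# that is, right after the four letters of "node".
grouped=$(sort -n --key=1.5 "$nodefile")
echo "$grouped"

# One line per process, so the line count is a natural value for -np.
np=$(( $(wc -l < "$nodefile") ))
echo "processes: $np"

rm -f "$nodefile"
```

With the sample input, the two node3 entries come out first and the two node12 entries last, and the count is 4.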
Submit your job with the command
> qsub cpi.pbs
When the job is submitted, PBS assigns it a job number
(xxxx.master.cluster), which is printed to standard output.
Check the files "cpi.pbs.eXXX" and "cpi.pbs.oXXX" for errors and program output, respectively.
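The names of the two files can be worked out from the job identifier that qsub prints. The following sketch assumes that the XXX in the file names is the numeric part of the identifier (the part before the first dot); the helper name pbs_output_files and the identifier 1234.master.cluster are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: derive the PBS output file names from the script name
# and the job identifier printed by qsub.
pbs_output_files() {
    # $1 = batch script name, $2 = job identifier such as 1234.master.cluster
    job_number=${2%%.*}            # keep only the part before the first dot
    echo "$1.o$job_number"         # program output
    echo "$1.e$job_number"         # errors
}

# On the cluster you would capture the identifier with: job_id=$(qsub cpi.pbs)
job_id="1234.master.cluster"       # made-up identifier for illustration
pbs_output_files cpi.pbs "$job_id"
```

While the job is queued or running, the standard PBS command qstat lists it together with its state.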
Additional Notes:
(former motd)
All TCP/IP connections between nodes will use the high-speed SCI interconnect
if the user environment includes LD_PRELOAD with the value /opt/DIS/lib/libkscisock.so.
In bash, this can be set using:
export LD_PRELOAD=/opt/DIS/lib/libkscisock.so
Under tcsh, use:
setenv LD_PRELOAD /opt/DIS/lib/libkscisock.so
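If the bash setting goes into a startup file, it may be worth guarding it so that LD_PRELOAD is only set where the library is actually readable; preloading a missing library makes the dynamic loader print an error for every command. A minimal sketch, with a hypothetical helper name enable_sci_sockets:

```shell
#!/bin/bash
# Hypothetical helper: preload the SCI socket library only where it is installed.
enable_sci_sockets() {
    # $1 = path to the library (defaults to the path given above)
    sci_lib=${1:-/opt/DIS/lib/libkscisock.so}
    if [ -r "$sci_lib" ]; then
        export LD_PRELOAD="$sci_lib"
    else
        echo "SCI library not found at $sci_lib; LD_PRELOAD left unchanged" >&2
        return 1
    fi
}

enable_sci_sockets || true   # ignore the failure on hosts without the library
```

Note that this sketch overwrites any existing LD_PRELOAD value, just as the plain export above does.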
If /usr/local/bin is in your PATH, you can try some benchmarks using:
noscitestscript nodex
then try:
scitestscript nodex
where x is the number of the node whose communication with the master node you want to test.
x can range from 1 to 35.
Also see /opt/DIS/doc/README_KSOCKET.txt
Russ Boice
trouble@math.cudenver.edu