Cluster HOW-TO
- Cluster information
- Account setup
- Compiling and running jobs
- Tips for Optimizing Application Performance on the Cluster
The CTBP cluster (ctbp1.ucsd.edu) consists of 120 Dell PowerEdge 2650 nodes,
each with two 2.8 GHz Xeon processors and 1 GB or 2 GB of RAM.
The cluster uses gigabit interconnects and runs the NPACI Rocks clustering software.
The Sun Grid Engine (SGE) queuing system has been installed and configured on the cluster. All non-interactive jobs
must be submitted through SGE. See the SGE How-To for more information. Running non-interactive jobs outside of the queuing system is a violation of the CTBP acceptable use policy.
If you see this message when running your jobs:
    "/usr/X11R6/bin/xauth: error in locking authority file /home/username/.Xauthority"
add these lines to your .ssh/config file:
    Host compute*
        ForwardX11 no
    Host c?-*
        ForwardX11 no
In addition to the GNU compilers, high-performance Intel
C/C++/F90/F95 compilers have been installed on the cluster
front-end (ctbp1.ucsd.edu). This installation
also includes the Intel Linux Debugger (LDB).
To use the compilers, modify your Makefiles to use icc
or ifort
as the C/C++ and F77/F90 compilers, respectively.
You don't have to set up PATH or any other environment variables;
icc and ifort should work immediately.
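A quick way to confirm that the compilers really are visible from your login shell is sketched below. The function name check_compilers is a hypothetical helper introduced here for illustration, not a tool installed on the cluster; it relies only on the standard shell built-in command -v.

```shell
#!/bin/sh
# check_compilers is a hypothetical helper, not a cluster-provided tool.
# For each name given, print where the shell finds it on PATH,
# or say that it was not found.
check_compilers() {
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool: $path"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

# The two Intel compiler drivers named in the text above.
check_compilers icc ifort
```

If your Makefile uses the conventional CC and FC variables (an assumption about your project), running make CC=icc FC=ifort overrides them from the command line without editing the file.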
Documentation for the Intel compilers, debugger, and libraries can be
found in /soft/linux/share/intel/compiler80/docs. Please read the
license information in this directory before using the compilers.
Note: If your code uses standard Unix services
(etime, call exit(), etc.), don't forget to
link the code with the -Vaxlib option (which is not the
default!).
Several versions of the MPI libraries are currently installed on the
cluster: MPICH compiled with Intel's icc/ifc (/opt/mpich/intel),
MPICH compiled with GNU's gcc/g77 (/opt/mpich/gnu), and MPICH-MPD
compiled with gcc/g77 (/opt/mpich-mpd). You have to link your application with
the MPICH library built by the same compiler you use for your code, and you have to use
the matching mpirun.
For example, if you want to use the Intel
compilers (which is recommended), use mpif90 (or mpif77 or mpicc)
from /opt/mpich/intel/bin to compile your code. To
launch it on the cluster, use
/opt/mpich/intel/bin/mpirun in your SGE script.
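The Intel MPICH workflow above might look like the following sketch, which writes a minimal SGE job script to a file. Several details are assumptions, not taken from this page: the program name my_app, the source file my_app.f90, the parallel environment name "mpich", and the host list location $TMPDIR/machines. Check the parallel environments actually configured on the cluster (for example with qconf -spl) and the SGE How-To before using it.

```shell
#!/bin/sh
# Compile step, run on the front-end (my_app.f90 is a placeholder name):
#   /opt/mpich/intel/bin/mpif90 -o my_app my_app.f90

# Write a minimal SGE job script.
# ASSUMPTIONS: the parallel environment is called "mpich" and provides a
# host list in $TMPDIR/machines; both may differ on this cluster.
cat > mpi_job.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N my_app
#$ -cwd
#$ -j y
#$ -pe mpich 4
# Use the mpirun that matches the Intel-built MPICH library.
/opt/mpich/intel/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./my_app
EOF

# Submit from the front-end once the placeholders are adjusted:
#   qsub mpi_job.sh
```

NSLOTS is set by SGE to the number of slots granted to the job, so the process count follows the -pe request and does not have to be repeated by hand.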
General tips
Running parallel applications
- Don't run your parallel jobs on more than 8 CPUs. Most
applications do not scale well on Ethernet-interconnected Beowulf
clusters, and there is a big penalty for inter-process
communication. The sweet spot for most applications appears to be
around 4-6 CPUs. Increasing the number of CPUs beyond this doesn't
decrease the wall clock time and can actually increase it (child
processes spend too much time communicating with each other and the
master process).
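One way to find the sweet spot for your own application is to time the same job at a few CPU counts and compare speedup and parallel efficiency. The sketch below is a hypothetical helper (parallel_efficiency is not a cluster tool); it assumes you supply the wall clock time in seconds for a 1-CPU run, the time for an N-CPU run, and N.

```shell
#!/bin/sh
# parallel_efficiency T1 TN N   (hypothetical helper for illustration)
#   T1: wall clock seconds on 1 CPU
#   TN: wall clock seconds on N CPUs
#   N : number of CPUs used for the TN run
# Prints speedup (T1/TN) and efficiency (speedup/N).
parallel_efficiency() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
        speedup = t1 / tn
        printf "speedup=%.2f efficiency=%.2f\n", speedup, speedup / n
    }'
}
```

Call it as parallel_efficiency T1 TN N with your own measured times. An efficiency that drops sharply as N grows suggests the extra CPUs are mostly spent on communication, which is the situation the tip above describes.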
Please direct any questions or comments related to this web page to
ctbp-help @ ctbp.ucsd.edu
Last modified: September 19 2008 10:50:00 am.