This paper was presented by Brian Barrett at the High Performance
Computing Systems and Applications & OSCAR Symposium 2003, in
Sherbrooke, Quebec, Canada.
The growth of cluster computing as a viable option for high
performance computing has led to the development of a comprehensive
software stack for these machines, including cluster scheduling,
parallel environments, and scientific libraries. OpenPBS or PBS/Pro
is often used for scheduling, with LAM/MPI or MPICH used for parallel
communication. This paper details the integration of the PBS
scheduling and resource management infrastructure with the LAM/MPI
parallel run-time environment. The integration provides a cluster
with several features that, although commonly available on traditional
supercomputers, have been conspicuously missing in cluster computing
environments: fast job startup, proper resource cleanup, and detailed
PDF tends to render better than PostScript (for us), but both should
print equally well.