NAMD Wiki: NamdOnGridEngine


If your NAMD binary was built using an MPI build of Charm++ then you should ignore the following and run NAMD like any other MPI program on your cluster. The following only applies to NAMD binaries that are not built on MPI and require the charmrun program to launch parallel runs. If your charmrun is a script that calls mpirun then you may ignore it.

Grid Engine

Grid Engine (multiplatform and free from http://gridengine.sunsource.net/) or Sun N1 Grid Engine 6 (commercial from http://wwws.sun.com/software/gridware/) is a direct descendant of the ancient DQS software in use at TCBG.

I found a brief tutorial on setting up SGE with Clustermatic/Bproc that might come in handy: http://noel.feld.cvut.cz/magi/sge+bproc.html

It is a little dated, and the tutorial doesn't show how to set it up for NAMD, but it gives the basics of making everything work.

Simple NAMD job on SGE 6.0

Use a text editor to create the file namd.job containing:

#$ -cwd
#$ -j y
#$ -S /bin/bash

nodefile=$TMPDIR/namd2.nodelist
echo group main > $nodefile
awk '{ for (i=0;i<$2;++i) {print "host",$1} }' $PE_HOSTFILE >> $nodefile

dir=$HOME/NAMD_2.6b1_Linux-i686
$dir/charmrun ++remote-shell ssh ++nodelist $nodefile +p$NSLOTS $dir/namd2 ~/apoa1/apoa1.namd

Since NAMD does not use MPICH, we need a small shell script and awk program to translate the SGE hostfile to charmrun format. The second column of the hostfile is the number of processors available, which is always one for these clusters, but this script will handle more.
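
To see what the awk translation produces, here is a minimal sketch that runs the same awk program on a made-up two-host hostfile. The host names and the make_nodelist helper are only for illustration; in a real job SGE supplies the file through $PE_HOSTFILE, and any columns after the second are simply ignored by the awk program.

```shell
#!/bin/sh
# Sketch: convert an SGE-style PE hostfile into a charmrun nodelist.
# make_nodelist is a hypothetical helper name, not part of SGE or NAMD.

make_nodelist() {
    # $1 = PE hostfile; column 1 is the host name, column 2 the slot count
    echo "group main"
    awk '{ for (i = 0; i < $2; ++i) print "host", $1 }' "$1"
}

# Made-up hostfile: 3 slots on compute01, 2 slots on compute02
sample=$(mktemp)
cat > "$sample" <<'EOF'
compute01 3
compute02 2
EOF

nodelist=$(make_nodelist "$sample")
echo "$nodelist"
rm -f "$sample"
```

Each host is repeated once per slot, so the number of "host" lines should equal the value SGE puts in $NSLOTS.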

Submit the job (for 3 processors) with the command

qsub -pe mpich 3 namd.job

Note that we are pretending to use the mpich parallel environment, but we do not use any of the special files it sets up.

-JimPhillips

Tight integration

The SGE documentation specifies that, in order for parallel jobs to integrate tightly with the queue, one should use "qrsh -V -inherit" instead of rsh/ssh to spawn child processes. Because some starter applications (like MPICH's mpirun) can't interface directly with qrsh, a wrapper script $SGE_ROOT/mpi/rsh is provided that should behave mostly identically to rsh. However, neither qrsh nor the wrapper script works with NAMD's charmrun. Here is something that does:

  • Create a wrapper script called "rsh_charm+" that contains the following 2 lines:
sed '/&$/s/ *&$//;/Exit 0/q' > tmp${JOB_ID}_$$
qrsh -V -inherit `echo $* | sed 's/-l \+\([^-][^ ]* \+\)\?//'` < tmp${JOB_ID}_$$ &
  • In your job script (you can use JimPhillips' job script above or any of the other job scripts below), make sure the charmrun/namd command-line contains ++remote-shell /path/to/rsh_charm+

Voila, instant tight integration. Enjoy!

-Kenno Vanommeslaeghe

PS. Some technical background for the die-hards: charmrun pipes a series of commands into rsh/ssh that ultimately spawn a namd process in the background. It waits for rsh/ssh to exit, then uses its own TCP-based communication infrastructure to send the backgrounded namd processes the data they need to start working. The problem is that qrsh is designed to monitor and control the processes it spawns, and part of its job is to leave no processes behind when exiting. If we simply replace rsh/ssh with qrsh, qrsh will spawn a namd process in the background, then immediately kill it upon exiting. Here's where the rsh_charm+ wrapper comes in. The sed script on the first line removes the ampersand (&) from the command that spawns namd, so that it doesn't get backgrounded. However, if we just pipe the resulting commands into qrsh, qrsh will not exit, leaving charmrun waiting for qrsh to exit, qrsh waiting for namd to exit, and namd waiting for a signal from charmrun. In other words, the whole cruft will freeze upon startup. To prevent this from happening, charmrun's command sequence is redirected to a file, then a background qrsh process is launched that uses this file as its input (that's what the second line does). Now everybody is happy: the wrapper script exits after backgrounding qrsh, so that charmrun continues its duties, and qrsh stays running until namd exits, so no process gets killed prematurely.
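
For anyone who wants to watch the first sed line at work, here is a small sketch that feeds it a stand-in command stream. The four input lines are invented for illustration and are not what charmrun really sends; the only properties that matter are one line ending in an ampersand and one line containing "Exit 0".

```shell
#!/bin/sh
# Hypothetical stand-in for the commands charmrun pipes into the remote
# shell; the real stream is different and is not reproduced here.
filtered=$(printf '%s\n' \
    'cd /some/workdir' \
    './namd2 +some-args &' \
    'echo Exit 0' \
    'this line is never read' |
    sed '/&$/s/ *&$//;/Exit 0/q')

# The trailing "&" is gone, and sed quit right after the "Exit 0" line,
# so the fourth input line was dropped.
echo "$filtered"
```

The "q" command is what lets the wrapper stop reading its standard input and move on to launching qrsh.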

Oh yeah, the `echo $* | sed 's/-l \+\([^-][^ ]* \+\)\?//'` on the second line is just to strip the "-l <user>" rsh/ssh directive from the command-line, because qrsh doesn't recognize it.
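
The effect of that second sed expression can be checked in isolation. The sketch below wraps it in a helper called strip_login (a name made up here) and applies it to two invented argument lists; the exact arguments charmrun passes to the remote shell are an assumption. Note that \+ and \? in this form are GNU sed extensions, so other sed implementations may behave differently.

```shell
#!/bin/sh
# strip_login is a hypothetical helper wrapping the sed expression from
# the rsh_charm+ wrapper. Requires GNU sed for \+ and \?.
strip_login() {
    echo "$*" | sed 's/-l \+\([^-][^ ]* \+\)\?//'
}

# "-l <user>" is removed together with the user name
with_user=$(strip_login compute01 -l alice /bin/sh -f)

# a bare "-l" followed by another option: only "-l" is removed
without_user=$(strip_login compute01 -l -x /bin/sh)

echo "$with_user"       # compute01 /bin/sh -f
echo "$without_user"    # compute01 -x /bin/sh
```

The [^-] at the start of the optional group is what keeps the expression from swallowing a following option when "-l" has no user name after it.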

NAMD via SGE on a Linux cluster

I have ('finally' I should say) a working configuration for a Beowulf class I cluster (actually, a COW) with the OpenMosix kernel (2.4.22-3, complete with MFS & DFSA), Sun Grid Engine (SGE) 5.3 on top of it, and a working solution for running NAMD via SGE's mpich environment (using the tight integration). Although I really like OpenMosix, I seriously doubt that many people would be prepared to go through a complete OpenMosix installation (which may turn out to be far from trivial), so for what follows I'll forget everything about oM and limit myself to a description of how to set up SGE to work with charmrun & NAMD. In summary :

  • Install and test SGE 5.3.
  • Install and test SGE's mpich parallel environment using the tight integration option (based on the SGE-provided rsh, see the mpi/ directory in SGE_ROOT).
  • Modify your submission script to look like this :
#!/bin/csh -f
# 

#
# The name of the job (NAMD_test) ...
#
#$ -N NAMD_test

#
# The parallel environment (mpi_rack1) and number of processors (9) ...
# 
#$ -pe mpi_rack1 9 

#
# The version of MPICH to use, transport protocol & a trick to delete cleanly
# running MPICH jobs ...
#
#$ -v MPIR_HOME=/usr/local/mpich-ssh
#$ -v P4_RSHCOMMAND=rsh
#$ -v MPICH_PROCESS_GROUP=no
#$ -v CONV_RSH=rsh 

#
# Execute from the current working directory ...
#
#$ -cwd

#
# Standard error and output should go into the current working directory ...
#
#$ -e ./
#$ -o ./

#
# Prepare nodelist file for charmrun ...
#
echo "Got $NSLOTS slots."
echo "group main" > $TMPDIR/charmlist
awk '{print "host " $1}' $TMPDIR/machines >> $TMPDIR/charmlist
cat $TMPDIR/charmlist

#
# Ready ...
#
/usr/local/NAMD_2.5/charmrun /usr/local/NAMD_2.5/namd2 ++nodelist $TMPDIR/charmlist +p $NSLOTS heat.namd > LOG

Run it : qsub myscript. You will (of course) have to change :

  • The name of your parallel execution environment
  • The location of NAMD's executables
  • The name of the NAMD script
  • Note 1 : the standard output redirection (to LOG) is unnecessary
  • Note 2 : some of the environmental variables (like MPIR_HOME and P4_RSHCOMMAND) are irrelevant for this application.

With this setting you don't need to set up a ~/.nodelist file (as described in NAMD's notes). SGE will (rightly) decide where to run your job. I'll stop here to keep it short, but feel free to contact me in case none of this works on your cluster : glykos at mbg duth gr (Nicholas Glykos).

November 2004 : NAMD via charmrun on SGE 5.3/SGE 6.x on Linux or Apple OS X

These notes were added by ChrisDagdigian. These instructions are geared towards users of Grid Engine clusters where the primary transport mechanism is Gigabit Ethernet rather than MPI over a high speed, low-latency interconnect. The method shown below follows the NAMD author recommendations concerning the use of "charmrun" to take advantage of built-in TCP communication methods.

The following approach is slightly different from the above suggestions by Nicholas Glykos. It differs in 3 main ways:

  • The user's job script does not have to make the NAMD nodelist. This is done automatically by a parallel environment "job starter" script. This reduces the possibility for user-related job errors.
  • The NAMD nodelist will honor and reflect the number of job slots allocated by the Grid Engine scheduler
  • All references to MPI or MPICH are removed from the example as they are irrelevant and potentially confusing

Setting up the 'namd' parallel environment

Run the command shown below to create a new parallel environment called 'namd'. You must have Grid Engine management authority to create and modify PE configurations:

qconf -ap namd

On a Grid Engine 6.x cluster, the new PE should have the following configuration settings:

pe_name           namd
slots             8
user_lists        NONE
xuser_lists       NONE
start_proc_args   /common/sge/namd/start-namd.sh $pe_hostfile
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE
urgency_slots     min

On a Grid Engine 5.3x cluster, the new PE should have the following configuration settings:

pe_name           namd
queue_list        all
slots             8
user_lists        NONE
xuser_lists       NONE
start_proc_args   /common/sge/namd/start-namd.sh $pe_hostfile
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE


Note: The PE configuration values that you should adjust for your local cluster include:

  • slots -- This is the total number of cluster job slots you want available within your cluster for NAMD jobs
  • queue_list -- (Grid Engine 5.3 only) The SGE queues where this PE is to be active/available
  • allocation_rule -- Read the SGE documentation concerning parallel environments. You may want to use "$round_robin" for instance if you want to spread the job out across as many machines as possible
  • start_proc_args -- You will likely have to change this path. It is convenient to have the "start-namd.sh" script somewhere in $SGE_ROOT
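
If you prefer not to type the settings into the editor that "qconf -ap" opens, one possible alternative is to put them in a file and load it with "qconf -Ap". The sketch below uses the Grid Engine 6.x settings listed above; the temporary file name is arbitrary, and newer Grid Engine releases may expect additional fields, so treat this as a starting point rather than a recipe.

```shell
#!/bin/sh
# Sketch: write the 6.x 'namd' PE definition to a file and, if qconf is
# available, register it. The start_proc_args path is the one used on
# this page and will need adjusting for your cluster.
pe_file=$(mktemp)
cat > "$pe_file" <<'EOF'
pe_name           namd
slots             8
user_lists        NONE
xuser_lists       NONE
start_proc_args   /common/sge/namd/start-namd.sh $pe_hostfile
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE
urgency_slots     min
EOF

if command -v qconf >/dev/null 2>&1; then
    # requires Grid Engine manager privileges
    qconf -Ap "$pe_file"
else
    echo "qconf not found; PE definition left in $pe_file"
fi
```

The quoted heredoc delimiter keeps $pe_hostfile and $fill_up literal, which is what Grid Engine expects to find in the PE definition.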

Special Grid Engine 6.x note

There is a big difference in how parallel environments are made available on specific Grid Engine queues between versions 5.3 and 6.0x. In Grid Engine 5.3x the parameter "queue_list" within the PE itself lists the queues where the NAMD environment should be active/allowed.

In Grid Engine 6.x this is flipped around -- the list of supported/available parallel environments is kept within the queue configuration, not the parallel environment configuration. This means that Grid Engine 6.x users have to do one extra step after creating the 'namd' parallel environment. They must then associate the 'namd' PE with one or more queue instances or cluster queues.

Grid Engine 6 usually has a default "all.q" cluster queue. If one wanted to make the namd PE available cluster-wide one could modify the configuration settings for "all.q" by doing the following:

Run the command

qconf -mq all.q

Edit the value of "pe_list" to add "namd"

Verify that 'namd' is part of the "pe_list" for the chosen queue by running the command:

qconf -sq all.q
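
The verification step can also be scripted. The sketch below defines a helper, pe_in_queue (a made-up name), that reads "qconf -sq" output on standard input and succeeds if the named PE appears in the pe_list line. The sample text stands in for real qconf output and assumes the PE names on that line are separated by spaces or commas.

```shell
#!/bin/sh
# pe_in_queue is a hypothetical helper: exit status 0 if the PE named in
# $1 appears on the pe_list line of `qconf -sq <queue>` output (stdin).
pe_in_queue() {
    awk -v pe="$1" '
        $1 == "pe_list" {
            for (i = 2; i <= NF; i++) {
                n = split($i, parts, ",")
                for (j = 1; j <= n; j++)
                    if (parts[j] == pe) found = 1
            }
        }
        END { exit (found ? 0 : 1) }'
}

# Made-up excerpt of queue configuration output, for illustration only
sample='qname                 all.q
pe_list               make namd
slots                 2'

if printf '%s\n' "$sample" | pe_in_queue namd; then
    result=yes
else
    result=no
fi
echo "namd in pe_list: $result"
```

On a live cluster the intended use would be "qconf -sq all.q | pe_in_queue namd".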


Job Starter script for the namd parallel environment

Save the following script to a file called "start-namd.sh". The embedded comments should explain what is happening but the short story is that Grid Engine creates hostfiles for parallel jobs that are not in the proper format for the NAMD 'charmrun' program to understand:

#!/bin/sh

## This is an init script for NAMD jobs run under Grid Engine on
## clusters with Ethernet interconnects, for users who want to launch parallel
## NAMD jobs that use the built-in Charm++ communication subsystem
## rather than MPI. Based on the SGE startmpi.sh example code.

## This script is necessary because SGE will create a custom hostfile
## for the NAMD job that is slightly incorrect with respect to the format
## of the hostfile that the 'charmrun' program expects.

## For example - a 5-CPU job spanning 2 nodes with 3 tasks on one
## system and 2 tasks on a second system would be written out
## by Grid Engine in the format:
##
##  compute01 3
##  compute02 2
##
## This format of "<hostname> <# slots>" is not recognized by 'charmrun' which would
## expect a file with this format to perform the same task:
##
## group main
##  host compute01
##  host compute01
##  host compute01
##  host compute02
##  host compute02
##
## The purpose of this init script is to take the custom hostfile generated
## by Grid Engine ($PE_HOSTFILE), transform it to the proper format and write
## it back out to the file $TMPDIR/namd-machines. (Don't worry about collisions
## between multiple NAMD jobs ... $TMPDIR is unique for each job.)
##
## Then we can run NAMD jobs that read $TMPDIR/namd-machines to learn their
## parallel host assignments...


PeHostfile2MachineFile()
{
   echo "group main"
   cat $1 | while read line; do
      # echo $line
      host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
      nslots=`echo $line|cut -f2 -d" "`
      i=1
      while [ $i -le $nslots ]; do
         echo " host $host"
         i=`expr $i + 1`
      done
   done
}

me=`basename $0`

# test number of args
if [ $# -ne 1 ]; then
   echo "$me: got wrong number of arguments" >&2
   exit 1
fi

# get arguments
pe_hostfile=$1

# ensure pe_hostfile is readable
if [ ! -r $pe_hostfile ]; then
   echo "$me: can't read $pe_hostfile" >&2
   exit 1
fi

# create machine-file
# remove column with number of slots per queue

machines="$TMPDIR/namd-machines"

PeHostfile2MachineFile $pe_hostfile >> $machines

# signal success
exit 0


Grid Engine NAMD wrapper script

The following is an example script showing how to run NAMD within Grid Engine. You will have to modify this script to suit your local cluster environment. Some things that you should look at include:

  • The "#$ -pe" line that requests the 'namd' parallel environment. The script is hard-coded to ask for 5 CPUs. Adjust as needed. SGE will accept CPU ranges as well.
  • The "#$ -N" line where we give the job a name within Grid Engine. Delete or adjust as necessary
  • The actual "charmrun" command at the bottom of the script. This example code is hard-coded with paths to charmrun and namd that will likely not match your local environment.
  • The use of SSH within charmrun is explicit. You will need to delete or modify the "++remote-shell ssh" section if you are using RSH for remote shells.

Usage

qsub namd-sge.csh ./my-NAMD-input-file.inp

Wrapper: namd-sge.csh

#!/bin/csh -f

## Grid Engine NAMD 2.5 parallel environment integration script
##
##    Embedded SGE config settings below (begin with '#$')

##----------------------------------------------------------------------
# The name of our actual namd analysis script from the commandline
set NAMD_SCRIPT = $1
##----------------------------------------------------------------------

##----------------------------------------------------------------------
##    Embedded Grid Engine 'qsub' arguments to simplify things

# The name of the job (NAMD_test) ...
#$ -N NAMD_test

# The parallel environment (namd) and number of processors (5) ...
#$ -pe namd 5 

# Execute from the current working directory 
#$ -cwd
##----------------------------------------------------------------------


##----------------------------------------------------------------------
##        Important environment variables
##  SGE puts lots of info into a job environment, the stuff 
##  we care about will be:
##
##   $NSLOTS               -- how many CPUs we got
##   $NHOSTS               -- how many machines 
##   $PE_HOSTFILE          -- original Grid Engine hostfile
##   $TMPDIR/namd-machines -- hostfile customized for charmrun
##----------------------------------------------------------------------


##----------------------------------------------------------------------
##                  Real Work Begins Below
##----------------------------------------------------------------------

echo " ## DEBUG ## This NAMD program got $NSLOTS slots across $NHOSTS machines..."
echo "My customized namd-machines file has contents of:"
cat $TMPDIR/namd-machines
echo ""

##----------------------------------------------------------------------
##          Now we actually run the job
##----------------------------------------------------------------------


/usr/local/namd-2.5/charmrun /usr/local/namd-2.5/namd2 ++remote-shell ssh \
++nodelist $TMPDIR/namd-machines +p $NSLOTS $NAMD_SCRIPT > results.txt
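
Since the wrapper relies on the nodelist matching the slot count, a quick consistency check before launching charmrun may save some debugging. The sketch below is written in plain sh rather than csh, uses a made-up nodelist in place of $TMPDIR/namd-machines, and a fixed number in place of $NSLOTS; count_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# count_hosts: count the "host" entries in a charmrun nodelist file
count_hosts() {
    grep -c '^ *host ' "$1"
}

# Made-up nodelist in the format written by start-namd.sh
demo=$(mktemp)
printf '%s\n' 'group main' \
    ' host compute01' ' host compute01' ' host compute01' \
    ' host compute02' ' host compute02' > "$demo"

expected=5                      # stands in for $NSLOTS inside a real job
found=$(count_hosts "$demo")

if [ "$found" -eq "$expected" ]; then
    status=ok
else
    status=mismatch
fi
echo "nodelist has $found host entries, expected $expected: $status"
rm -f "$demo"
```

A mismatch would suggest that the job starter script did not run or that the hostfile was in a format it did not expect.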