Hi Zeki,
ntasks-per-node is how many MPI processes to start on a single node. ntasks is the total number of processes running across all nodes. In Susmita's example, 4 nodes * 24 MPI processes per node = 96 cores, which gives you 96 tasks. Originally, you were asking for 4 nodes to execute a serial process (ntasks=1), so the scheduler yelled at you, since that would be a big waste of resources.
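To make the arithmetic concrete, here is a minimal sketch of how the total task count follows from the node count and the per-node task count. The function name total_tasks is a hypothetical helper for illustration only, not a Slurm command.

```shell
#!/bin/sh
# total_tasks NODES TASKS_PER_NODE
# Prints nodes * tasks-per-node, i.e. the value you would pass as --ntasks.
# (Hypothetical helper, not part of Slurm.)
total_tasks() {
    echo $(( $1 * $2 ))
}

total_tasks 4 24   # Susmita's example: 4 nodes, 24 MPI processes per node
total_tasks 4 20   # 4 nodes with 20 cores each
```

The first call corresponds to Susmita's 96-task job, the second to a 4-node request on 20-core nodes.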
If you are using mpirun, you do need to specify the number of tasks beforehand. So for you, it might be something like this:
#SBATCH --clusters=AA
#SBATCH --account=AA
#SBATCH --partition=AA
#SBATCH --job-name=AA
#SBATCH --nodes=4
#SBATCH --ntasks=80
#SBATCH --time=120:00:00
source /truba/sw/centos6.4/comp/intel/bin/ intel6i4
source /truba/sw/centos6.4/lib/impi/<>
module load centos6.4/app/namd/2.9-multicore
module load centos6.4/lib/impi/4.1.1
#$NAMD_DIR/namd2 +p$OMP_NUM_THREADS namd_input.conf > namd_multinode_output.log for single node
mpirun -np 80 $NAMD_DIR/namd2 namd_input.conf > namd_multinode_output.log
Unfortunately, it sometimes isn't that clear. Some Slurm machines require using srun instead of mpirun, and that depends on how the particular supercomputer is configured. Susmita provided one example configuration. What I found helpful when first dealing with Slurm was the manual page. NERSC has what I think is an especially good overview of common arguments and what they actually mean, although most supercomputing centers provide their own examples of how to run jobs.
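Since the launcher differs between machines, one way to keep a script portable is to build the launch line from the task count that Slurm exports as SLURM_NTASKS, rather than hard-coding the number twice. The sketch below only prints the command it would run. LAUNCHER and launch_cmd are hypothetical names introduced for illustration; which launcher your site actually needs is something to check in its documentation.

```shell
#!/bin/sh
# launch_cmd PROGRAM [ARGS...]
# Prints an MPI launch line using the task count from SLURM_NTASKS.
# LAUNCHER is a hypothetical variable chosen here; Slurm does not set it.
launch_cmd() {
    launcher=${LAUNCHER:-mpirun}
    ntasks=${SLURM_NTASKS:-1}   # falls back to 1 outside a Slurm job
    case "$launcher" in
        srun)   echo "srun -n $ntasks $*" ;;
        mpirun) echo "mpirun -np $ntasks $*" ;;
        *)      echo "unknown launcher: $launcher" >&2; return 1 ;;
    esac
}

# Print the command instead of running it, so this is safe to try anywhere.
SLURM_NTASKS=80 LAUNCHER=mpirun launch_cmd namd2 namd_input.conf
```

Inside a real batch job, SLURM_NTASKS is already set by the scheduler when --ntasks is given, so only the launcher choice would need to change per machine.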
What does ntasks stand for?
Dear Zeki,
If you are running a single job on multiple nodes, then I think you should use "export OMP_NUM_THREADS=1". I have given an example in the following SBATCH script:
#SBATCH -A Name
#SBATCH --ntasks-per-node=24
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --time=1-02:00:00
module load slurm
### To launch mpi job through srun user have to export the below library file##
export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/15.08.13/lib64/
export LD_LIBRARY_PATH=/home/cdac/jpeg/lib:$LD_LIBRARY_PATH
#export PATH=/home/cdac/lammps_10oct2014/bin:$PATH
time srun -n 96 namd2 eq_NVT_run1.conf > eq_NVT_run1.log
I am trying to come up with a Slurm script file for my simulation, but I have failed miserably. The point is that on the university supercomputer, a single node has 20 cores. For my single job, let's say aaaa.conf, I want to use 80 cores. To allocate that many cores I need to use 4 nodes (4*20=80). However, Slurm gives an error if I try to run one task on multiple nodes. How can I overcome this situation?
#SBATCH --clusters=AA
#SBATCH --account=AA
#SBATCH --partition=AA
#SBATCH --job-name=AA
#SBATCH --nodes=4
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=80
#SBATCH --time=120:00:00
source /truba/sw/centos6.4/comp/intel/bin/ intel6i4
source /truba/sw/centos6.4/lib/impi/<>
module load centos6.4/app/namd/2.9-multicore
module load centos6.4/lib/impi/4.1.1
#$NAMD_DIR/namd2 +p$OMP_NUM_THREADS namd_input.conf > namd_multinode_output.log for single node
#mpirun $NAMD_DIR/namd2 +p$OMP_NUM_THREADS namd_input.conf > namd_multinode_output.log for multiple node
THE ABOVE SCRIPT GIVES AN ERROR saying that --ntasks=1 is not valid. However, if I set --ntasks=4 and --cpus-per-task=20, it works, but it does not improve the run speed. (Note: each user can use at most 80 cores on the supercomputer.)
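The two combinations above can be compared with a rough pre-submission check. This is only a sketch under two assumptions: that the scheduler wants at least one task per requested node, and that a single task cannot use more cores than one node has (20 here). check_request is a hypothetical helper, not a Slurm command, and the real scheduler may apply other rules as well.

```shell
#!/bin/sh
# check_request NODES NTASKS CPUS_PER_TASK CORES_PER_NODE
# Hypothetical sanity check for a resource request; not part of Slurm.
check_request() {
    nodes=$1; ntasks=$2; cpus=$3; cores=$4
    # Assumption: every requested node should get at least one task.
    if [ "$ntasks" -lt "$nodes" ]; then
        echo "fewer tasks ($ntasks) than nodes ($nodes)"
        return 1
    fi
    # A single task lives on one node, so it cannot exceed that node's cores.
    if [ "$cpus" -gt "$cores" ]; then
        echo "cpus-per-task ($cpus) exceeds cores per node ($cores)"
        return 1
    fi
    echo "ok: $(( ntasks * cpus )) cores requested"
}

check_request 4 1 80 20 || true   # the failing request from the script above
check_request 4 4 20 20           # the variant that was accepted
```

Passing this check only says the request is shaped consistently; it says nothing about whether the program itself can make use of several nodes.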
