Re: Query regarding performance

From: Daipayan Sarkar (sdaipayan_at_gmail.com)
Date: Tue Oct 04 2022 - 06:46:10 CDT

Hi Nicholus,

Just to confirm: did you choose NAMD 2.14 because you are using collective variables? If not, as René suggests, NAMD3 should offer better performance.

-Daipayan

From: "owner-namd-l_at_ks.uiuc.edu" <owner-namd-l_at_ks.uiuc.edu> on behalf of René Hafner TUK <hamburge_at_physik.uni-kl.de>
Reply-To: "namd-l_at_ks.uiuc.edu" <namd-l_at_ks.uiuc.edu>, René Hafner TUK <hamburge_at_physik.uni-kl.de>
Date: Tuesday, October 4, 2022 at 5:50 AM
To: "namd-l_at_ks.uiuc.edu" <namd-l_at_ks.uiuc.edu>
Cc: Nicholus Bhattacharjee <nicholusbhattacharjee_at_gmail.com>
Subject: Re: namd-l: Query regarding performance

Multiple timestep scheme:

timestep 2;            # timestep of 2 fs
nonbondedFreq 1;       # nonbonded interactions evaluated every step
fullElectFrequency 2;  # full electrostatics evaluated only every 2nd step
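To make the "2-1-2" shorthand concrete, here is a minimal Python sketch that converts the three settings into evaluation intervals in femtoseconds. It assumes that nonbondedFreq and fullElectFrequency both count timesteps between evaluations; the function name `mts_intervals_fs` is made up for this illustration and is not part of NAMD.

```python
def mts_intervals_fs(timestep_fs, nonbonded_freq, full_elect_frequency):
    """Return (nonbonded interval, full-electrostatics interval) in fs.

    Assumption: both frequencies are expressed in timesteps, so the
    interval in fs is simply timestep * frequency.
    """
    return (timestep_fs * nonbonded_freq, timestep_fs * full_elect_frequency)


# "2-1-2": timestep 2 fs, nonbondedFreq 1, fullElectFrequency 2
scheme_212 = mts_intervals_fs(2.0, 1, 2)

# For comparison, the 1 fs / 1 / 4 combination asked about in the quoted message
scheme_114 = mts_intervals_fs(1.0, 1, 4)

print("2-1-2:", scheme_212)
print("1-1-4:", scheme_114)
```

Under that assumption both schemes evaluate full electrostatics at the same 4 fs interval; they differ in how often the nonbonded and bonded terms are computed per simulated femtosecond.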

The remaining factor of about 2 may be explained by the V100 being more powerful than the P40, but I have no P40 at hand to check.

If you want or need to run for a really long time, consider HMR and/or NAMD3 if applicable; see below.

Regards

René
On 10/4/2022 11:07 AM, Nicholus Bhattacharjee wrote:
Hello Rene,

Thanks for the reply. It seems to be a huge difference.

I am using the following for the 80k-atom system:

NAMD 2.14 multicore CUDA
CPU: 32x Xeon 6242
GPU: Nvidia Tesla P40 (x1)
Timestep: 1 fs

What do you mean by nonbeval and fullelecteval?
Is it:

nonbondedFreq 1
fullElectFrequency 4

I am getting 12.5 ns/day

By increasing the timestep to 2 fs I can get around 25 ns/day, but that is still less than what you mentioned.
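The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration under one assumption: the wall-clock cost per step stays the same when the timestep changes, which is what the jump from 12.5 to 25 ns/day suggests. The helper names are hypothetical and not taken from NAMD.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1e6


def ns_per_day(timestep_fs, seconds_per_step):
    """Simulated nanoseconds per wall-clock day."""
    steps_per_day = SECONDS_PER_DAY / seconds_per_step
    return steps_per_day * timestep_fs / FS_PER_NS


def seconds_per_step(timestep_fs, ns_day):
    """Invert ns_per_day: wall-clock seconds spent on one step."""
    return SECONDS_PER_DAY * timestep_fs / (FS_PER_NS * ns_day)


# Reported in this thread: 12.5 ns/day at 1 fs on the P40
cost_p40 = seconds_per_step(1.0, 12.5)

# Assumption: same cost per step at 2 fs, so throughput doubles
p40_at_2fs = ns_per_day(2.0, cost_p40)

# Reported in this thread: ~56 ns/day at 2 fs on the V100
cost_v100 = seconds_per_step(2.0, 56.0)

print(f"P40 at 2 fs: {p40_at_2fs:.1f} ns/day")
print(f"per-step cost ratio P40/V100: {cost_p40 / cost_v100:.2f}")
```

With both runs at a 2 fs timestep, the remaining gap is a per-step cost ratio of 56/25, a bit more than 2. Whether the hardware alone accounts for it is not something this calculation can show.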

Please let me know.

Thanks

On Tue, Oct 4, 2022, 10:45 AM René Hafner TUK <hamburge_at_physik.uni-kl.de> wrote:

Hi Nicholus,

This depends on:

  * exact NAMD Version (2.x or 3.alphaY)
  * GPU type
  * CPU type
  * timestepping scheme
  * atom number

Example from a cluster I have access to:

  * using NAMD 2.14 multicore CUDA (self-compiled)
  * GPU: 1xV100
  * CPU: 24xCores on XEON_SP_6126
  * timestep,nonbeval,fullelecteval: 2-1-2
  * with ~80k atoms

Results in: ~56 ns/day (averaged over multiple runs)

There may be room for you to tune this, though I cannot compare without knowing your GPU and NAMD version.

Depending on the system, the simulation features you need, and the quantities you're interested in, you may try out HMR (hydrogen mass repartitioning, http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2019-2020/0735.html) and/or NAMD3 (https://www.ks.uiuc.edu/Research/namd/alpha/3.0alpha/).

Cheers

René

On 10/3/2022 10:13 AM, Nicholus Bhattacharjee wrote:
Hello everyone,

I am running a system of around 80k atoms with NAMD CUDA. I am using 32 CPUs with 1 GPU. I am getting a performance of around 12.5 ns/day. I would like to know whether this is reasonable or whether I am getting poor performance.

Thank you in advance.

Regards

--
--
Dipl.-Phys. René Hafner
TU Kaiserslautern
Germany

This archive was generated by hypermail 2.1.6 : Tue Dec 13 2022 - 14:32:44 CST