From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Sep 16 2013 - 02:05:34 CDT
Did you notice the poor scaling across nodes? I guess you are only using gigabit
Ethernet, right? Also, what you call the biggest advantage, the 8:1 ratio, in
fact has the lower speedup. The improvement in time there comes from the
additional processor power, not from the GPU. The best test case for measuring
the benefit of GPUs against CPU-only runs is therefore the 1:1 ratio, and 5.7 is
quite nice; the rest looks reasonable as well. Did you use the +devices or
+ignoresharing flag? What setting did you use for fullElectFrequency?
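
To make the scaling remark concrete, here is a minimal Python sketch that computes parallel efficiency relative to the 4-processor run, using the 85,000-atom timings from the quoted message. The function name `parallel_efficiency` and the choice of the 4-processor run as baseline are assumptions made for illustration only.

```python
# Timings in days/ns, copied from the 85,000-atom table in the quoted message.
cpu_only = {4: 1.19, 8: 0.62, 16: 0.33, 32: 0.29}
cpu_gpu = {4: 0.21, 8: 0.18, 16: 0.21, 32: 0.23}


def parallel_efficiency(timings, base=4):
    """Return {procs: efficiency} relative to the run on `base` processors.

    Efficiency is (observed speedup) / (ideal speedup); 1.0 means perfect
    scaling. The baseline of 4 processors is an assumption for illustration.
    """
    t_base = timings[base]
    return {p: (t_base / t) / (p / base) for p, t in sorted(timings.items())}


cpu_eff = parallel_efficiency(cpu_only)
gpu_eff = parallel_efficiency(cpu_gpu)

for procs in cpu_eff:
    print(f"{procs:>2} procs: CPU only {cpu_eff[procs]:.2f}, "
          f"CPU + GPU {gpu_eff[procs]:.2f}")
```

An efficiency that falls well below 1.0 as processors are added means the extra processors are no longer paying off in proportion.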
Norman Geist.
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Neeraj Agrawal
Sent: Sunday, 15 September 2013 01:59
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: NAMD PERFORMANCE ON NVIDIA K20 GPU
Hello,
I recently performed a few NAMD benchmark runs on a workstation (dual 8-core
Xeon E5-2687W, 3.1 GHz, with one Nvidia Tesla K20c GPU). The results are
below:
------------------------------------------------------------
System Size: 85,000 atoms

number of      CPU only      CPU + K20c    Speed-up
processors     (days/ns)     (days/ns)     from GPU
     4           1.19          0.21          5.7
     8           0.62          0.18          3.4
    16           0.33          0.21          1.6
    32           0.29          0.23          1.3
------------------------------------------------------------
System Size: 6,300 atoms

number of      CPU only      CPU + K20c    Speed-up
processors     (days/ns)     (days/ns)     from GPU
     4           0.086         0.087         1.0
     8           0.05          0.02          2.5
    16           0.029         0.02          1.5
    32           0.032         0.017         1.9
------------------------------------------------------------
In all of these simulations, outputEnergy is written every 100th frame and the
cutoff is set to 12 Å. The CPU-only results were obtained with NAMD
Linux-x86_64 (version 2.9), and the CPU+GPU results were obtained with
Linux-x86_64-multicore-CUDA (version 2.9).
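
The "Speed-up from GPU" column is the CPU-only time divided by the CPU + K20c time. A minimal Python sketch that recomputes it from the timings reported above follows; the helper name `gpu_speedup` and the dictionary layout are assumptions made for illustration.

```python
# Timings in days/ns as reported in the tables: {procs: (cpu_only, cpu_gpu)}.
benchmarks = {
    "85,000 atoms": {4: (1.19, 0.21), 8: (0.62, 0.18),
                     16: (0.33, 0.21), 32: (0.29, 0.23)},
    "6,300 atoms": {4: (0.086, 0.087), 8: (0.05, 0.02),
                    16: (0.029, 0.02), 32: (0.032, 0.017)},
}


def gpu_speedup(cpu_days_per_ns, gpu_days_per_ns):
    """Speed-up from the GPU: CPU-only time divided by CPU+GPU time."""
    return cpu_days_per_ns / gpu_days_per_ns


# Recompute the speed-up for every row of both tables.
speedups = {
    system: {p: gpu_speedup(cpu, gpu) for p, (cpu, gpu) in rows.items()}
    for system, rows in benchmarks.items()
}

for system, rows in speedups.items():
    print(system)
    for procs, s in rows.items():
        print(f"  {procs:>2} procs: {s:.2f}x")
```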
Since I will be simulating solvated proteins with around 50K-70K atoms (in
total) in the future, would it be reasonable to conclude the following from
the benchmark results above?
1. The biggest advantage of GPU is seen when one GPU is used per 8 cores.
2. It might be advantageous to add one more GPU to this workstation so that
I can run two NAMD simulations (each on 8 procs + 1 GPU) simultaneously.
3. For a system with <80k atoms, hyper-threading can deteriorate the
performance.
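
For point 2, a rough back-of-the-envelope comparison can be sketched in Python using the 85,000-atom timings above. It assumes, without any measurement to back it up, that two simultaneous runs on 8 procs + 1 GPU each would both reach the 0.18 days/ns measured for a single such run, i.e. that they do not compete for memory bandwidth or PCIe. The variable names are hypothetical.

```python
# Measured timings (days/ns) for the 85,000-atom system with one K20c.
DAYS_PER_NS_8_PROCS_GPU = 0.18
DAYS_PER_NS_16_PROCS_GPU = 0.21


def ns_per_day(days_per_ns):
    """Convert a timing in days/ns into throughput in ns/day."""
    return 1.0 / days_per_ns


# One simulation using all 16 cores and the single GPU.
one_run = ns_per_day(DAYS_PER_NS_16_PROCS_GPU)

# ASSUMPTION: with a second GPU, two independent runs (8 procs + 1 GPU each)
# both achieve the single-run timing; real runs may interfere with each other.
two_runs = 2 * ns_per_day(DAYS_PER_NS_8_PROCS_GPU)

print(f"one run, 16 procs + 1 GPU:    {one_run:.2f} ns/day")
print(f"two runs, 8 procs + 1 GPU each: {two_runs:.2f} ns/day in total")
```

The aggregate figure is only an upper bound under that assumption; an actual two-GPU benchmark would be needed to confirm it.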
Thank you,
Neeraj