FATAL ERROR: CudaTileListKernel::buildTileLists,

From: Tue Boesen (alyflex_at_gmail.com)
Date: Tue Mar 01 2022 - 01:14:10 CST

I'm trying to run energy minimization using NAMD 3.0alpha9 for
Linux-x86_64-multicore-CUDA. It works well for smaller systems, but for
large systems I consistently get this error:

FATAL ERROR: CudaTileListKernel::buildTileLists, maximum shared memory allocation exceeded. Too many atoms in a patch

I'm running the minimization on a GeForce RTX 3090 with 24 GB of memory, so
I believe I should have enough memory, though NAMD doesn't tell me exactly
how much it is using.

The system I'm minimizing has about 1.5M atoms and consists of a protein
in a box of water with a few Na+ and Cl- ions.

I have included the log file from the failing run below.

Does anyone have any good suggestions for how to run this minimization?

Cheers
Tue

Charm++> No provisioning arguments specified. Running with a single PE.
         Use +auto-provision to fully subscribe resources or +p1 to silence this message.
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 1 threads (PEs)
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.10.2-0-g7bf00fa
Warning> Randomization of virtual memory (ASLR) is turned on in the kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try running with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 8 cores x 2 PUs = 16-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Info: Built with CUDA version 11000
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 binding to CUDA device 0 on tue-ubuntu: 'NVIDIA GeForce RTX 3090' Mem: 24265MB Rev: 8.6 PCI: 0:9:0
Info: NAMD 3.0alpha9 for Linux-x86_64-multicore-CUDA
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Chem. Phys. 153:044130 (2020) doi:10.1063/5.0014475
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 61002 for multicore-linux-x86_64-iccstatic
Info: Built Sun Feb 28 21:57:49 CST 2021 by jmaia on manila.ks.uiuc.edu
Info: 1 NAMD 3.0alpha9 Linux-x86_64-multicore-CUDA 1 tue-ubuntu tue
Info: Running on 1 processors, 1 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.283695 s
Info: 0 MB of memory in use based on /proc/self/stat
Info: Using bitfields in atom data structures.
Info: sizeof( CompAtom ) = 32
Info: sizeof( CompAtomExt ) = 8
CkLoopLib is used in SMP with simple dynamic scheduling (converse-level notification)
Info: Configuration file is /media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/namd/AF_1W_1WU_1WUZ_1_A.conf
Info: Changed directory to /media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/namd
TCL: Suspending until startup complete.
Warning: The following variables were set in the
Warning: configuration file but will be ignored:
Warning: paraTypeXplor (parameters)
Warning: paraTypeCharmm (parameters)
Info: Using TIP3P water model.
Warning: The Langevin gamma parameters differ over the particles,
Warning: requiring extra work per step to constrain rigid bonds.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 1
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 20
Info: PERIODIC CELL BASIS 1 281.56 0 0
Info: PERIODIC CELL BASIS 2 0 145.95 0
Info: PERIODIC CELL BASIS 3 0 0 286.421
Info: PERIODIC CELL CENTER -16.6987 1.34986 1.40648
Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 4000 steps
Info: FIRST LDB TIMESTEP 100
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MIN ATOMS PER PATCH 40
Info: INITIAL TEMPERATURE 310
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 0.833333
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME min1.dcd
Info: DCD FREQUENCY 200
Info: DCD FIRST STEP 200
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: NO EXTENDED SYSTEM TRAJECTORY OUTPUT
Info: NO VELOCITY DCD OUTPUT
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME min1
Info: RESTART FILENAME min1.restart
Info: RESTART FREQUENCY 200
Info: BINARY RESTART FILES WILL BE USED
Info: CUTOFF 10
Info: PAIRLIST DISTANCE 16
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLIST OUTPUT STEPS 100
Info: PAIRLISTS ENABLED
Info: MARGIN 0.555
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 19.055
Info: ENERGY OUTPUT STEPS 200
Info: ENERGY EVALUATION STEPS 200
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: MOMENTUM OUTPUT STEPS 200
Info: TIMING OUTPUT STEPS 200
Info: PRESSURE OUTPUT STEPS 200
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 310
Info: LANGEVIN USING BBK INTEGRATOR
Info: LANGEVIN DAMPING COEFFICIENT IS 5 INVERSE PS
Info: LANGEVIN DYNAMICS NOT APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 200 FS
Info: DECAY TIME IS 100 FS
Info: PISTON TEMPERATURE IS 310 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS 0 0 0
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.312341
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 288 150 288
Info: PME MAXIMUM GRID SPACING 1
Info: Attempting to read FFTW data from FFTW_NAMD_3.0alpha9_Linux-x86_64-multicore-CUDA.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_3.0alpha9_Linux-x86_64-multicore-CUDA.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 2
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info: ERROR TOLERANCE : 1e-08
Info: MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: RANDOM NUMBER SEED 1646117723
Info: USE HYDROGEN BONDS? NO
Info: Using AMBER format force field!
Info: AMBER PARM FILE /media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/leap/AF_1W_1WU_1WUZ_1_A_neutral.prmtop
Info: COORDINATE PDB /media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/leap/AF_1W_1WU_1WUZ_1_A_neutral.pdb
Info: Exclusions will be read from PARM file!
Info: SCNB (VDW SCALING) 2
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Reading parm file (/media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/leap/AF_1W_1WU_1WUZ_1_A_neutral.prmtop) ..
PARM file in AMBER 7 format
Warning: Skipping ATOMIC_NUMBER in parm file while seeking MASS.
Warning: Skipping SCEE_SCALE_FACTOR in parm file while seeking SOLTY.
Warning: Skipping SCNB_SCALE_FACTOR in parm file while seeking SOLTY.
Warning: Found 485687 H-H bonds.
Info: SUMMARY OF PARAMETERS:
Info: 67 BONDS
Info: 153 ANGLES
Info: 198 DIHEDRAL
Info: 0 IMPROPER
Info: 0 CROSSTERM
Info: 0 VDW
Info: 153 VDW_PAIRS
Info: 0 NBTHOLE_PAIRS
Info: Reading pdb file /media/tue/Data/Data/test_mini/RCSB/../relax_pdb//calc/AF_1W_1WU_1WUZ_1_A/leap/AF_1W_1WU_1WUZ_1_A_neutral.pdb
Info: TIME FOR READING PDB FILE: 0.900383
Info:
Info: LONG-RANGE LJ: APPLYING ANALYTICAL CORRECTIONS TO ENERGY AND PRESSURE
Info: LONG-RANGE LJ: AVERAGE A AND B COEFFICIENTS 574955 AND 581.291
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 1472897 ATOMS
Info: 1471677 BONDS
Info: 26517 ANGLES
Info: 65693 DIHEDRALS
Info: 0 IMPROPERS
Info: 0 CROSSTERMS
Info: 1536544 EXCLUSIONS
Info: 1464268 RIGID BONDS
Info: 2954423 DEGREES OF FREEDOM
Info: 494316 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 494316 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 8.89287e+06 amu
Info: TOTAL CHARGE = -3.33139e-05 e
Info: MASS DENSITY = 1.25465 g/cm^3
Info: ATOM DENSITY = 0.125139 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 44.7864 s, 0 MB of memory in use
Info: Startup phase 0 took 0.000248679 s, 0 MB of memory in use
Info: ADDED 0 IMPLICIT EXCLUSIONS
Info: Startup phase 1 took 0.201876 s, 0 MB of memory in use
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 705 POINTS
Info: ABSOLUTE IMPRECISION IN FAST TABLE FORCE: 2.64698e-22 AT 9.94673
Info: RELATIVE IMPRECISION IN FAST TABLE FORCE: 5.64247e-16 AT 9.94673
Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000290479 AT 0.251946
Info: ABSOLUTE IMPRECISION IN SCOR TABLE FORCE: 2.11758e-22 AT 9.94673
Info: RELATIVE IMPRECISION IN SCOR TABLE FORCE: 5.86184e-16 AT 9.94673
Info: INCONSISTENCY IN SCOR TABLE ENERGY VS FORCE: 0.000178193 AT 9.97184
Info: ABSOLUTE IMPRECISION IN VDWA TABLE FORCE: 1.00974e-28 AT 9.99687
Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
Info: ABSOLUTE IMPRECISION IN VDWB TABLE FORCE: 6.2204e-22 AT 9.99687
Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
Info: Startup phase 2 took 0.00026995 s, 0 MB of memory in use
Info: Startup phase 3 took 1.1152e-05 s, 0 MB of memory in use
Info: Startup phase 4 took 0.00199203 s, 0 MB of memory in use
Info: Startup phase 5 took 1.6442e-05 s, 0 MB of memory in use
Info: PATCH GRID IS 14 (PERIODIC) BY 7 (PERIODIC) BY 15 (PERIODIC)
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY 0.000245969 -0.00383036 -0.00162446
Info: LARGEST PATCH (736) HAS 112867 ATOMS
Info: TORUS A SIZE 1 USING 0
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.193281 s, 0 MB of memory in use
Info: Use 3D box decompostion in PME FFT.
Info: PME using 1 x 1 x 1 pencil grid for FFT and reciprocal sum.
Info: Startup phase 7 took 0.000113754 s, 0 MB of memory in use
Info: Updated CUDA force table with 4096 elements.
Info: Updated CUDA LJ table with 17 x 17 elements.
Info: Startup phase 8 took 0.0210923 s, 0 MB of memory in use
Info: Startup phase 9 took 3.051e-05 s, 0 MB of memory in use
Info: Startup phase 10 took 1.0641e-05 s, 0 MB of memory in use
Info: Startup phase 11 took 0.000820673 s, 0 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 12 took 0.000622233 s, 0 MB of memory in use
Info: CREATING 30878 COMPUTE OBJECTS
Info: Found 348 unique exclusion lists needing 1216 bytes
Info: Startup phase 13 took 0.320426 s, 0 MB of memory in use
Info: Startup phase 14 took 4.9448e-05 s, 0 MB of memory in use
Info: Startup phase 15 took 0.00141796 s, 0 MB of memory in use
Info: Finished startup at 45.5287 s, 0 MB of memory in use

TCL: Minimizing for 100 steps
FATAL ERROR: CudaTileListKernel::buildTileLists, maximum shared memory allocation exceeded. Too many atoms in a patch
[Partition 0][Node 0] End of program
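
One quick sanity check on the log above, offered as a rough sketch rather than a diagnosis: compare the largest patch with the average number of atoms per patch. The three input lines are copied from the log; the regular expressions assume exactly those line formats, and `patch_stats` is a hypothetical helper name, not part of NAMD.

```python
import re

# Three lines copied verbatim from the NAMD startup output above.
LOG = """\
Info: 1472897 ATOMS
Info: PATCH GRID IS 14 (PERIODIC) BY 7 (PERIODIC) BY 15 (PERIODIC)
Info: LARGEST PATCH (736) HAS 112867 ATOMS
"""


def patch_stats(log_text):
    """Return (total_atoms, n_patches, largest_patch_atoms) from NAMD startup output.

    Assumes the line formats shown in this particular log.
    """
    # Total atom count: a line of the form "Info: <N> ATOMS" with nothing after it.
    atoms = int(re.search(r"Info:\s+(\d+)\s+ATOMS\s*$", log_text, re.M).group(1))

    # Patch grid dimensions, e.g. "PATCH GRID IS 14 (PERIODIC) BY 7 (PERIODIC) BY 15".
    grid = re.search(r"PATCH GRID IS (\d+) \(\w+\) BY (\d+) \(\w+\) BY (\d+)", log_text)
    nx, ny, nz = (int(g) for g in grid.groups())

    # Atom count of the most populated patch.
    largest = int(re.search(r"LARGEST PATCH \(\d+\) HAS (\d+) ATOMS", log_text).group(1))

    return atoms, nx * ny * nz, largest


atoms, n_patches, largest = patch_stats(LOG)
mean_per_patch = atoms / n_patches

print(f"patches:            {n_patches}")
print(f"mean atoms/patch:   {mean_per_patch:.0f}")
print(f"largest patch:      {largest}")
print(f"largest / mean:     {largest / mean_per_patch:.0f}x")
```

For this log the 14 x 7 x 15 grid gives 1470 patches and an average of roughly 1,000 atoms per patch, while the largest patch holds 112,867 atoms. The atoms are therefore far from evenly spread over the patch grid, which is consistent with the "Too many atoms in a patch" message. Why one patch is so heavily populated is not visible from the log alone.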
