Re: BUG: ReplicaUniformPatchGrid namd-2.14

From: Geist, Norman (norman.geist_at_uni-greifswald.de)
Date: Tue May 31 2022 - 07:36:47 CDT

Hey again,

After trying various things for the whole day, I was unable to work
out what the problem with the checkpoints is. While there are no
energy spikes during the simulation, the DCD and restart files clearly
contain very close contacts (water-water and water-protein). Upon
restarting, the energies are basically inf. I have now switched from
CheckPointStore/Load to ReplicaAtomSendRecv to exchange the states
between the different Hamiltonians, which does the same thing and
solves the issue.

I still want to stress that something is broken with the checkpoints,
though. Most likely it is a race condition in collecting the patches
when storing the checkpoints. It may be that either patches get mixed
between replicas or, more likely, patches are outdated when being
merged.

Best
Norman

On Tuesday, 31-05-2022 at 08:29, Geist, Norman wrote:

Hey there, I've reported this before for a beta of NAMD-2.13 and the
problem is still present in 2.14:

It seems something is generally going wrong with
"ReplicaUniformPatchGrid". I’ve observed this with my own variant of
H-REMD where I used CheckpointStore and CheckpointLoad to swap the
coordinates between the Hamiltonians. This runs ok for the first
“job” but the restart files that are written in-between (as in
original REMD) using "output" are broken as they contain overlapping
water molecules. Not only that, even the coordinates in the DCD files
contain very close contacts of below 1 Å for water-water and
protein-water contacts.

I’ve worked around it using just “output” and “reinitatoms”
for coordinate swapping, but the in-memory solution with global
checkpoints would of course be the cleaner solution, as excessive use
of the output command often overwhelms parallel filesystems such as
Lustre or BeeGFS.

 

For clarity, it was only a dihedral scaling H-REMD, so a scaling of
VDW interactions is not the reason for the overlapping waters, which
rather seem to be a problem with collecting the patches that are
probably mixed between replicas.

Any thoughts?

 

Norman Geist
