Re: MMPBSA Question

From: Rudy Richardson (rjrich_at_umich.edu)
Date: Thu Sep 22 2022 - 16:36:11 CDT

Hi Kelly,

I agree with Josh. One approach for dealing with autocorrelation in
time series data such as MD trajectories is to use block averaging:

Flyvbjerg, H., and Petersen, H.G. (1989). Error estimates on averages
of correlated data. J. Chem. Phys. 91, 461-466. doi: 10.1063/1.457480.

There are functions for this in R:

Huais, P.Y. (2022). R-function: block_average.
https://github.com/phuais/block_average

and I think in Python and Matlab as well. A plot of SEM vs. block size
typically fits an exponential function that rises to a plateau. The
block size at which the plateau is reached is the optimal block size,
and the plateau value is the SEM estimate. In a recent MD simulation
that I did, I found that I needed a rather large block size of 10 to
20 ns, depending on the parameter being assessed.
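
As a rough illustration of the idea, here is a minimal Python sketch of
block averaging with NumPy. It is not the R function cited above and not
the exact blocking transformation from Flyvbjerg and Petersen; it simply
splits the series into non-overlapping blocks and computes the SEM from
the block means. The names block_sem and ar1_series are hypothetical,
and the AR(1) series is synthetic data standing in for a correlated MD
observable.

```python
import numpy as np


def block_sem(x, block_size):
    """SEM estimated from the means of non-overlapping blocks (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    # Drop trailing samples that do not fill a complete block.
    blocks = x[: n_blocks * block_size].reshape(n_blocks, block_size)
    block_means = blocks.mean(axis=1)
    # Treat the block means as (approximately) independent samples.
    return block_means.std(ddof=1) / np.sqrt(n_blocks)


def ar1_series(n, phi, seed=0):
    """Synthetic correlated series x[t] = phi * x[t-1] + noise (assumption: demo data only)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = noise[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x


if __name__ == "__main__":
    series = ar1_series(n=100_000, phi=0.95)
    naive = series.std(ddof=1) / np.sqrt(len(series))
    print(f"naive SEM (assumes independent frames): {naive:.4f}")
    # Scan block sizes and look for the point where the SEM stops growing.
    for size in (1, 10, 100, 1000, 5000):
        print(f"block size {size:5d}: SEM = {block_sem(series, size):.4f}")
```

On correlated data the printed SEM should grow with block size and then
level off. With very large blocks only a few block means remain, so the
estimate itself becomes noisy; that is the practical upper limit on the
block size.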

Best regards,

Rudy

Rudy J. Richardson, ScD, DABT
Molecular Simulations Laboratory
Room M6065 SPH-II 2029
University of Michigan
1415 Washington Heights
Ann Arbor, MI 48109-2029 USA

Email: rjrich_at_umich.edu

On Thu, Sep 22, 2022 at 5:00 PM Mcguire, Kelly <klmcguire_at_ucsd.edu> wrote:
>
> Thanks for the response. I am using a 10 ps decorrelation time which was recommended in a couple of publications. I could bump that up.
>
> ________________________________
> From: Josh Vermaas <vermaasj_at_msu.edu>
> Sent: Thursday, September 22, 2022 1:15:36 PM
> To: namd-l_at_ks.uiuc.edu <namd-l_at_ks.uiuc.edu>; Mcguire, Kelly <klmcguire_at_UCSD.EDU>
> Subject: Re: namd-l: MMPBSA Question
>
> Hi Kelly,
>
> Would you expect your standard deviation to go down with more frames/data points? The standard deviation measures, in a general sense, how far a typical measurement lies from the mean. If you took 1000 samples from a Gaussian distribution, the standard deviation would be basically the same as if you took 1,000,000 samples. The standard error of the mean, however, does decrease as the number of samples increases.
>
> The better question, I think, is how many truly independent samples you have in your 5000 DCD frames. CaFE is assuming something like a ~100 ps decorrelation time between frames. Is that enough? No idea, but technically you want independent samples when calculating a standard error.
>
> -Josh
>
> On 9/22/22 1:56 PM, Mcguire, Kelly wrote:
>
> Question about standard deviations with MMPBSA. Should I use millions of snapshots for better standard deviations? I ran a 10 nanosecond simulation of my protein-peptide complex (2 fs timesteps, saved to DCD every 1000 steps). That would be 5,000 frames total in the DCD. I used CaFE and NAMD to get my MMPBSA result (single trajectory method). For CaFE I used stride 5 for the trajectory so that I would use a frame every 10 ps, thus a total of 1,000 frames for the MMPBSA calculation. My result is in the attached file. I get -29.3 kcal/mol with a standard deviation of 11.5. That's a large standard deviation. But my standard error seems decent (11.5/sqrt(1,000) = 0.36), which appears to be what others report for these calculations instead of the standard deviation. So is this a good result for a single trajectory using only 1,000 frames for the calculation, or do I really still need to use millions of frames? Thanks!
>
>
>
>
> Dr. Kelly McGuire
> Herzik Lab - Postdoc
> Chemistry/Biochemistry Department
> Natural Science Building, 4104A, 4106A, 4017
>
>
> --
> Josh Vermaas
>
> vermaasj_at_msu.edu
> Assistant Professor, Plant Research Laboratory and Biochemistry and Molecular Biology
> Michigan State University
> vermaaslab.github.io

This archive was generated by hypermail 2.1.6 : Tue Dec 13 2022 - 14:32:44 CST