From: Thomas C. Bishop (bishop_at_tulane.edu)
Date: Fri Sep 03 2010 - 13:18:17 CDT

Bogdan,
Thanks for the info!
My analysis takes microseconds and I want to do 10,000 files,
so I'm convinced the read/write operations, not the analysis, are taking my time.

Here are some further issues/questions:
1) Using mkfifo to create the named pipe does not seem to work as I expected
(maybe it's my distro, openSUSE 11.2 64-bit, kernel 2.6.31.12-0.2-desktop, or maybe just my usage in Tcl/VMD).
I would expect the following to work, but it does not.

In VMD I'd like to do the following:

mkfifo "fifo.pdb"
set sel [atomselect top all ]
$sel writepdb "fifo.pdb"

myanalysis "fifo.pdb"

quit

If I "tail -f fifo.pdb" in another shell, I see the "hello" appear, but I can't get VMD to read this from the file.

Am I missing something about how to use a named pipe?
It's not a file, so the above commands have to somehow be piped to each other, correct?
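
A possible explanation, sketched in Python rather than Tcl so it can be run outside VMD: opening a FIFO for writing blocks until some other process or thread has it open for reading, so a single script that first writes and only afterwards reads can stall at the write. Within VMD, mkfifo is also a shell command rather than a Tcl one, so it would presumably have to be called through exec. In the sketch below the path, the sample ATOM record and the count_atoms function are all hypothetical (count_atoms stands in for myanalysis); the point is only the ordering, with the reader started before the writer.

```python
import os
import tempfile
import threading

def count_atoms(path, result):
    # Hypothetical stand-in for "myanalysis": read the pipe once, to EOF.
    with open(path) as fh:            # blocks until a writer opens the FIFO
        result.append(sum(1 for line in fh if line.startswith("ATOM")))

workdir = tempfile.mkdtemp()
fifo = os.path.join(workdir, "fifo.pdb")
os.mkfifo(fifo)                       # same effect as the shell command mkfifo

result = []
reader = threading.Thread(target=count_atoms, args=(fifo, result))
reader.start()                        # start the reader first

# Made-up PDB content standing in for "$sel writepdb".
with open(fifo, "w") as fh:           # returns once the reader has opened it
    fh.write("ATOM      1  N   ALA A   1       0.000   0.000   0.000\n")
    fh.write("END\n")

reader.join()                         # reader sees EOF when the writer closes
os.remove(fifo)
os.rmdir(workdir)
print(result[0])
```

If the reader is never started, the open(fifo, "w") call simply waits, which may be what happens to the writepdb call in the script above.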

2) If I remember correctly, once you mount a ramdisk you cannot release the RAM by simply unmounting the ramdisk.
In whatever kernel I was using when I first tried this, you had to reboot the machine, and I thought that was too big a price to pay. Maybe things have changed;
I'll check into this.
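
For what it is worth, tmpfs differs from the older fixed-size ramdisk in that the memory behind a file is handed back once the file is deleted, so no reboot should be needed. A rough Python sketch of the write/analyze/remove loop in a memory-backed directory follows. That /dev/shm is a tmpfs mount is an assumption about the distribution (the code falls back to the default temp directory otherwise), and the frame contents and the analyze function are hypothetical.

```python
import os
import tempfile

def analyze(path):
    # Hypothetical analysis: count ATOM records in one PDB file.
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith("ATOM"))

# Assumption: /dev/shm is a tmpfs mount, as on many Linux distributions.
shm = "/dev/shm"
base = shm if os.path.isdir(shm) and os.access(shm, os.W_OK) else None

counts = []
with tempfile.TemporaryDirectory(dir=base) as scratch:
    for frame in range(3):            # stands in for looping over DCD frames
        pdb = os.path.join(scratch, "frame%d.pdb" % frame)
        with open(pdb, "w") as fh:
            # Made-up content: frame N gets N+1 ATOM records.
            for i in range(frame + 1):
                fh.write("ATOM  %5d  CA  ALA A   1\n" % (i + 1))
            fh.write("END\n")
        counts.append(analyze(pdb))
        os.remove(pdb)                # on tmpfs the pages should be freed here
print(counts)
```

Nothing in the loop knows it is running on tmpfs; only the directory changes, so the same script can be pointed back at a regular disk for comparison.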

Tom

On Thursday 02 September 2010 11:51:23 pm Bogdan Costescu wrote:
> On Thu, Sep 2, 2010 at 9:28 PM, Thomas C. Bishop <bishop_at_tulane.edu> wrote:
> > For purposes of analysis I often break a DCD into pdbs that are written to disk, analyze them with another program and remove the pdbs.
> > Can I avoid writing them to disk by using named pipes or sockets?
>
> If the analysis program needs to read each PDB file only once, then
> what you describe is doable. With named pipes it is normally easier
> because you can replace file names with named pipes directly; with
> sockets most likely some small code changes are also necessary.
>
> > I've considered a ram disk as another option.
>
> This is a good option on Linux, especially in modern distributions
> where tmpfs is available.
>
> > To my surprise, I noticed that sometimes when I write a pdb, analyze it and remove it w/in a script
> > there seems to be no actual disc activity. Everything happens in buffers before anything is committed to disc.
>
> This depends on various OS settings like the type of file system these
> files are located on, settings for this file system, etc. But it also
> depends on your way of 'measuring' it: the HDD activity light is quite
> unreliable, there's no way to know in which conditions it lights; best
> is to use tools that look at what is actually written to disk - f.e.
> iostat.
>
> > Given this observation should I just let the OS handle things rather than force the issue w/ pipes/sockets?
>
> If you are doing lots of such operations, so the potential for speed
> increase exists (both of these are subjective of course ;-)), then you
> could at least try once for your peace of mind :-)
>
> If the workflow includes writing a small PDB file and then waiting
> seconds or more for the analysis of each PDB, then don't bother - the
> I/O time is most likely just a negligible fraction of the total time
> and you won't see any significant improvement.
>
> > What, if any, chances are there for data corruption if I pass data via pdbs in this manner?
>
> Using tmpfs won't change the workflow in any way - the temporary files
> will be created in memory instead of being written to disk, but this
> is totally transparent for either VMD or the analysis program; of
> course if the computer crashes just at that time, you won't have a
> copy of the last PDB file written... but you need to restart your
> analysis anyway so it probably doesn't matter.
>
> With named pipes and sockets, the data cannot be corrupted in the
> sense of being modified in random ways - but what can happen is
> getting incomplete data in various corner cases (f.e. socket is closed
> forcefully without waiting for confirmation from the other end), but
> very likely you won't have to deal with them. To prevent the
> pipe/socket buffer from becoming full and stalling the process writing
> to it, you should start the reader (the analysis process, if my guess
> is right) before starting to write.
>
> Bogdan
>
>

*******************************
   Thomas C. Bishop
    Tel: 504-862-3370
    Fax: 504-862-8392
http://dna.ccs.tulane.edu
********************************