VMD 1.9.1 Development
VMD Development Status
- VMD 1.9.1 Final Release (February 1, 2012)
- remote: added specific doc example of setting the viewmaster key bindings.
- README: Updated list of changes to molfile plugins in VMD 1.9.1
- Added FFTK docs to the plugin doc index page
- fftk: Added FFTK plugin documentation
- README: Added notes about recently added and updated VMD plugins
- Null out atom selection parser nodes after destruction to prevent any possibility of cyclic or multiple deletion from occurring.
- psfplugin: Added comment regarding bond record generation for EXT formatted PSF files
- VMDMobile: Updated info on version 2. Updated numbers for the next version. Added instructions on how to publish an update to the Android Market.
- Added environment variable based override to bypass the old PDB-centric atom name mangling convention, thereby allowing atom names to be restricted in length only by the molfile plugin API atom fields, and not by the old PDB-friendly name mangling routine. If the environment variable VMDNOMANGLEATOMNAMES is set, the name mangling code is completely bypassed, and up to 15-character atom names can be used freely.
- reverted EXT BOND tweak, until we can verify more carefully
- psfplugin: Ensure that we still test for the need for the NAMD PSF format variant even when there are large atom counts.
- psfplugin: Added more logic to the PSF plugin to determine cases where the CHARMM "EXTended" PSF format cannot accommodate long atom types, and we add a "NAMD" keyword to the PSF file flags line at the top of the file. Upon reading, if we detect the "NAMD" flag there, we know that it is possible to parse the file correctly using a simple space-delimited scanf() format string, and we use that strategy rather than holding to the inflexible column-based fields that are a necessity for compatibility with CHARMM, CNS, X-PLOR, and other formats. NAMD and the psfgen plugin already assume this sort of space-delimited formatting, but that's because they aren't expected to parse the PSF variants associated with the other programs. For the VMD PSF plugin, having the "NAMD" tag in the flags line makes it absolutely clear that we're dealing with a NAMD-specific file so we can take the same approach.
- Added Bourne shell CVS alias syntax to docs
- updated heatmapper plugin help URL
- updated rmsdvt plugin help URL
- networkview: added an image of the plugin window and beefed up the docs based on the tutorial.
- plugins: Added links to externally hosted vmdICE and SurfVol VMD plugins.
- remote: Eliminated duplicate html heading since pages in the plugin docs get them autogenerated anyway.
- Updated plugin index page with docs for recently added plugins
- rmsdvt: Bug fix from Anssi to eliminate multiplot warnings from empty columns
- Bug fix from Anssi to eliminate multiplot warnings from empty columns
- added the rmsdvt and heatmapper plugins to the extensions menu
- remote: corrected the destination directory for the remote plugin files.
- Added heatmapper plugin contributed by Anssi Nurminen, anssi.nurminen at tut.fi
- Added rmsdvt plugin contributed by Anssi Nurminen, anssi.nurminen at tut.fi
- remote: describe the new functionality for the command line interface as well
- psfplugin: Updated PSF EXTended format output checks to take the total atom count into consideration, and to eliminate the test on atom type length, since that field width is still 4 for PSF files, despite being 8 for CRD files.
- remote: tweaked docs to say that you can choose the user in control.
- VMDMobile: added a new 'mobile set activeClient' command that can be used to determine who, of the currently connected clients, has mobile control of the running VMD session.
- remote: In the listbox of currently connected users, the active one is now highlighted. Additionally, if the user clicks on a name, it now sends a request off to VMD to activate that user as the active guest, and turns off others.
- propka: Updated to the latest version of Michal's propka plugin
- qmtool: Updated qmtool routines to use synchronous file I/O operations so that files are not still held open when called from the fftk plugin.
- psfplugin: Remove extra argument at end of printf.
- VMDMobile: added in a check to warn people if they don't have wi-fi turned on. This caused an additional permission to be required (to check the state of the network).
- VMDMobile: When dragging a single finger to rotate or scrub, a gray circle is now left on the screen at your starting point, so that you can see where you are at relative to the zero position
- Imported the latest version of the FFTK plugin
- remote: doc updates to add pics, fix some wording to make it match the current state
- fixed a bug that was occurring because the GUI wasn't getting notified if a socket couldn't be opened. Now it is being notified.
- viewmaster: Cranked viewmaster package version due to recent changes.
- viewmaster: added procedures to get the next and previous IDs, relative to the current slide that is being shown. Used that to make a couple of procedures that actually jump to the next and previous slides, and added a convenience method that sets key Aux-0 to go to the previous slide, Aux-1 to go the next, and Aux-2 to save the current view to a new slide.
- VMDMobile: added debug method
- VMDMobile: unimportant change to improve currently-commented out debug statements
- fixed a code path bug where rotation wasn't occurring if we weren't getting an active packet from the mobile device. Everything seems to be silky smooth now.
- removed an unnecessary colspan on the tcl window and have the hostname line there, which is currently commented out until we can find a more robust way to display.
- remote: added a note to the docs about firewall rules potentially being an issue in the plugin
- qmtool: added catch to file deletes
- Cranked version
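As an illustration of the psfplugin entry in this release about the "NAMD" keyword, the following minimal sketch shows how a reader might inspect the flags line and, when the keyword is present, fall back to plain whitespace splitting. The function names and the sample atom record are hypothetical and are not taken from the plugin source; the real plugin is written in C and uses a scanf() format string.

```python
def psf_flags(first_line):
    """Return the set of keywords that follow 'PSF' on the flags line."""
    tokens = first_line.split()
    if not tokens or tokens[0] != "PSF":
        raise ValueError("not a PSF flags line")
    return set(tokens[1:])


def uses_space_delimited_fields(flags):
    """True when the file declares itself NAMD-specific via the NAMD keyword."""
    return "NAMD" in flags


def split_atom_record(line):
    """Split a space-delimited atom record into its fields.

    Only valid when the NAMD flag is present; column-based PSF variants
    would need a fixed-width parser instead (not shown here).
    """
    return line.split()


# Made-up record with an atom type too long for a fixed-width column.
flags = psf_flags("PSF EXT NAMD")
if uses_space_delimited_fields(flags):
    fields = split_atom_record("  1 SEG1 1 ALA CA VERYLONGTYPE1 0.07 12.011 0")
    print(fields[5])
```

Whitespace splitting does not care how wide any single field is, which is why the flag makes long atom types safe to read back.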
- VMD 1.9.1 beta 2 (January 25, 2012)
- convert mobile device interface button 0 to regular, and add a 4th
- remote: Android app version # updates
- remote: tweaks to text describing preferred use of the market and cleanup of text for non-Market users.
- remote: text used for app entry at market.android.com/publish
- remote: latest version of the APK, and updates to the docs to point to the Android Market
- remote: updates to latest code
- remote: 4 buttons, all named Aux-X now exist.
- remote: added 512 x 512 pixel image required for the android market
- remote: fixed a bug where [namespace current] was returning the empty string
- fftk: Updated fftk GUI initialization and QMTool reset calls.
- remote: Screenshots, headed for the Android Market
- remote: revised to be up to date with the current state of the app and VMD support.
- remote: changed default machine ip to make it more obvious to people that they need to set it.
- remote: default nick is now "Anonymous User"
- remote: touchpad background image is now optional, set via the Settings menu
- added code to pass mobile device button presses through as Aux-X events. Pressing the buttons on the mobile device now triggers Aux event calls.
- remote: added a Kindle Fire specific icon in a specific directory that the Fire needs
- remote: updated README with more dev tips
- fftk: Added the new Force Field Toolkit plugin to the VMD Extensions menu.
- fftk: Added new Force Field Toolkit plugin by Chris Mayne and JC Gumbart.
- gamessplugin: let user know that Firefly's XMCQDPT2 results are not supported
- paratool: Updated paratool minor version due to recent bug fixes that were originally for use by the new fftk plugin.
- gamessplugin: Cranked major and minor versions of gamessplugin due to recent updates and since it is mostly stable now.
- qmtool: cranked qmtool package version due to recent updates.
- psfplugin: Updated PSF plugin to write PSF files using the CHARMM EXTended format when necessary.
- qmtool: fixed improper function call
- remote: fixed a bug with the background image on the Kindle Fire
- remote: revised instructions to tell how to turn on sideloading on the Kindle Fire.
- remote: tweaked instructions on interacting with the touch screen
- remote: updated release version
- remote: background image for touch area
- remote: moved logging to a preference option. Can now turn it on and off.
- remote: adjusted translation/rotation scaling back to 0.5, 0.005.
- remote: added background image to touch area.
- gamessplugin: fixed reading of MCSCF orbital coefficient for Firefly
- gamessplugin: Removed debug message.
- gamessplugin: Fixed reading of CONTRL OPTIONS and BASIS SET for Firefly/pc-gamess output files
- remote: added in a null check to prevent an exception for new users that was appearing for an instant and going away.
- rnaview: Enable compilation of rnaview on Windows
- topotools: update copyright date and TODO list.
- topotools: use resid for the molecule id on writing, since this is what we read in. Corrected output format for hybrid atom style.
- multiseq: revised MAFFT code to do the same thing as clustal code when passing in unknown residues
- multiseq: revised unknown residue conversion code to use X for unknown residues if amino acids; N if nucleic.
- psfgen: Updated psfgen version number to 1.6 with recent changes. Fixed out-of-date psfgen makefile that hadn't been updated since version 1.4.
- psfgen: Made the psfgen segment generation output slightly more compact. Added a new console output routine that lets the caller decide whether to prepend the "psfgen) " and whether to include a newline or not. This cuts the amount of console output tremendously, which is very helpful for logs from solvate and similar automated structure building tools.
- psfgen: Cause psfgen to prepend "psfgen) " to all of its console output. This should reduce user and developer confusion when users encounter errors in their own scripts or when using structure building plugins and asking for help from developers or from other users.
- solvate: Modified solvate to print estimated completion times at most only once every 30 seconds rather than during generation of every replica. This saves a lot of console output and/or text log file space.
- solvate: Updated the solvate plugin to better handle generation of very large solvent boxes. We now check the maximum segment name length that will result from the required number of replicas, and use either decimal, hexadecimal, or base-26 alphanumeric segment naming schemes depending on which one will most easily fit into the 4-character field width limit of the oldest PSF/PDB file formats. Improved the diagnostic messages.
- multiseq: fixed a bug where part of the code for 'calculating non-redundant sets with only marked sequence' was using all of the sequences, and not just the marked one.
- multiseq: made MultiSeq a bit smarter when doing profile/sequence alignments based on a sequence that has unknown residues in it. Before, when feeding that to clustalw it was failing due to clustal eliminating unknown residues. Now they are replaced by residue X, which translates to a 'generic amino acid'
- runante: Fixed bug when checking for Antechamber executable
- runsqm: Fixed bug in ::runsqm::sqminit when checking for sqm executable
- added info about release key location (which is NOT stored in CVS)
- remote: updated the Android package to the latest rev (much smaller)
- remote: replaced the debug version with the release version of the APK so we start getting the one out there with the official crypto key
- remote: removed a bunch of (currently) unused images that were making the compiled Android package larger than needed.
- remote: removed default athine.ks binding for the IP
- remote: Updated Android remote control app to the latest version.
- remote: added icon, removed various debugging items that don't need to be there for a release to the android market
- Revised descriptions of the VdW and Points representations in the User's Guide, to take into account new GLSL rendering features and control behaviors with different rendering modes and external renderers.
- Added documentation for the new QuickSurf representation in the VMD User's Guide.
- Updated all of the VMD User's Guide sections to replace the old "forms" terminology with "window". The old "forms" phrasing dates back to the original versions of VMD that were based on the Forms library for SGIs, and later to the XForms compatibility library. Since FLTK and other more modern GUI toolkits have dropped that terminology in favor of modern convention, we have too.
- cionize: Updated cionize Makefile for latest CUDA and Intel C/C++ compilers for the iccgpuionize64 target.
- qmtool: fixed reading of ESP charges for later versions of Gaussian
- Updated the CUDA and OpenCL device stat printing routines to be somewhat more compact, particularly for GPUs or other devices that contain more than 10GB of on-board RAM. Increased the maximum device string length for OpenCL since the MacOS X OpenCL devices for Intel x86 CPUs use a very long name string.
- Minor correction to display of OpenCL device hardware properties.
- rmsdtt: Added missing doc image
- optimization: fixed bug with annealing and boundaries
- multiplot: fixed a small bug and added a "clear" option
- Cranked version
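The solvate entry in this release describes choosing between decimal, hexadecimal, and base-26 segment names so that the longest name still fits in a 4-character field. A minimal sketch of that kind of selection follows. It assumes replicas are numbered from 1, that the first scheme that fits is preferred, and that "WT" is just an example prefix; none of these details are confirmed by the changelog, and the actual plugin is written in Tcl.

```python
import string


def to_base(n, digits):
    """Encode a non-negative integer using the given digit alphabet."""
    if n == 0:
        return digits[0]
    base = len(digits)
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(digits[r])
    return "".join(reversed(out))


# Candidate naming schemes, tried in this order (an assumption).
SCHEMES = (
    ("decimal", "0123456789"),
    ("hexadecimal", "0123456789ABCDEF"),
    ("base-26", string.ascii_uppercase),
)


def pick_scheme(prefix, n_replicas, max_len=4):
    """Return (name, digits) of the first scheme whose longest name fits."""
    for name, digits in SCHEMES:
        # The largest replica index produces the longest encoded suffix.
        if len(prefix) + len(to_base(n_replicas, digits)) <= max_len:
            return name, digits
    raise ValueError("no naming scheme fits in %d characters" % max_len)


name, digits = pick_scheme("WT", 300)
print(name, "WT" + to_base(300, digits))
```

With a two-character prefix, two characters remain for the replica index, so decimal covers up to 99 replicas, hexadecimal up to 255, and base-26 up to 675 in this sketch.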
- VMD 1.9.1 beta 1 (December 29, 2011)
- Turn on CUDA for 64-bit MacOS X again.
- Many README updates
- autopsf: Added feature requests to TODO comment section, submitted by users.
- Added floating point assignments to msmsplugin and stlplugin to prevent aggressive MSVC compiler optimizations from leaving out linkage to the floating point library.
- Updated NMWiz plugin by Ahmet Bakan
- Updated Win32 registry query routines for VMD 1.9.1
- remote: Added the remote package to the plugin distribution
- timeline: Fixed horizontal label display spacing for large numbers of frames. Fixed overlapping data rects on the same original pixel (this was a major, previously undetected problem when zoomed on trajectories with large frame counts). Allowed for mouse jitter (2 pixels) on a single click for zoom-out, to eliminate the unintended zoom-ins when trying to zoom out. Now deals properly with a calculation that returns no results (for example, for salt bridges in alanin.dcd). Added needed checks at marquee draws and elsewhere. Now properly redraws the analysis method name at the color scale and selection info window. Adds a "[NO RESULTS]" message to the selection info window when there are no results; this is on-screen only, and the postscript printout ends with a dialog box if there is no data.
- nmwiz: Updated to latest version of NMWiz plugin
- Cranked version
- VMD 1.9.1a18 (December 23, 2011)
- Updated VS2005 project file with new CUDADispCmds source file, and updated the VS2005 preprocessor defines for all CUDA source files.
- Avoid MSVC compilation issues for CUDA sources, since they don't have their include paths set identically to the other source files.
- propka: Updated to beta 6 version of Michal's propka plugin.
- dtrplugin: Applied a series of small portability fixes for the DESRES dtrplugin to allow it to compile for MSVC etc.
- dmsplugin: Updated DESRES dmsplugin with their latest version.
- removed old C version of dmsplugin now that the C++ version is added to the CVS tree and the revision history has been copied over.
- Reworked the makefile since the DESRES dmsplugin is now a C++ source file rather than plain C.
- maeffplugin: Updated DESRES maeffplugin with latest version.
- dtrplugin: Updated the DESRES dtrplugin with their latest version.
- put the current app in the doc directory and pointed the html to it
- remote: Added VMD GUI plugin for Android phone/tablet remote control app
- Added Android phone/tablet VMD client to the plugin tree.
- Added callbacks, code revisions, etc necessary to provide internal VMD support to the mobile device remote control plugin.
- mol2plugin: modified write_mol2_timestep() to align '.' in charge column
- molefacture: fixed atom coloring bug
- Updated OFF plugin with the latest version from Francois-Xavier Coudert
- hoomdplugin: Support writing resid as body tag to match reading behavior. Fix off-by-one error when writing the 4th dihedral atom. Prepare parser for upcoming orientation tag (currently ignored).
- Don't clear out angle/dihedral/improper typename list when reading structure from a molfile plugin, but only when doing a total reset through Tcl script code.
- Moved the loop that preprocesses atom coordinates and radii down into the code right before the host-to-GPU copy, thereby simplifying error cleanup and getting ready to replace the CPU code with a new CUDA kernel that takes care of this for itself.
- Added a fast QuickSurf code path that queries the minimum and maximum atom radii for the whole molecule. We only compute the min/max radii for the actual group of selected atoms if there is a wide range of radius values for the whole molecule; otherwise the performance impact is assumed to be negligible, since any given selection is likely to span most of the range of radii found in the entire molecule.
- Fix CUDA vertex buffer allocations to be float3 now rather than float4 since we fixed the marching cubes code some time ago already.
- When the user manipulates the atom radii via atom selections, we must call BaseMolecule::set_radii_changed() so that min/max are recalculated.
- Call BaseMolecule::set_radii_changed() upon loading or adding any files that have their MOLFILE_RADIUS optflag set.
- Added BaseMolecule methods for querying min/max atom radii and forcing updates.
- topotools: teach readlammpsdata to ignore class2 force field coefficients; don't output topology information that is incompatible with atom style
- topotools: change topogromacs atom numbering scheme for better compatibility with grompp.
- Added FileRenderer export base class methods to support the new triangle mesh display commands.
- Implemented both per-vertex and non-per-vertex coloring versions of the new triangle mesh primitives, for better performance on QuickSurf trajectory animations.
- Further compaction of QuickSurf diagnostic mesgs.
- condensed QuickSurf diagnostic output
- Eliminate a few info messages from CUDAQuickSurf::calc_surf()
- silenced old debugging output from DrawMolItemQuickSurf
- Corrected vertex indexing for DispCmdTriMesh::getpointers()
- Revised the QuickSurf and CUDAQuickSurf objects to enable them to directly access the VMDDisplayList.
- Minor cleanup of the OpenGL rendering code for non-indexed triangle meshes
- Various corrections and improvements for the non-indexed triangle mesh display command primitive, both on the host and for CUDA.
- Added CUDADispCmds to the build
- Added a CUDA-specific variation of the triangle mesh display command to allow the triangle mesh to be transferred directly from device memory into the display list.
- Eliminated unused display command objects.
- Use a forward declaration for the CUDAQuickSurf class. Since the member is a pointer, the CUDAQuickSurf class doesn't have to be fully specified, and thus we keep all of the CUDA-specific types compartmentalized.
- Updated the triangle mesh method name to match the others.
- Added a new ::draw_trimesh() method to the QuickSurf class so that the CUDAQuickSurf implementation can directly manipulate the display list since we need to be able to handle OpenGL rendering straight from device memory.
- Added initial support for the new DTRIMESH_C3F_N3F_V3F display command which will be used for the CUDA QuickSurf rep.
- Added scene export support for the new non-indexed triangle mesh format.
- Added a new display command for non-indexed triangle meshes for use by QuickSurf when rendering straight from GPU memory
- Completed the first running implementation of QuickSurf that uses a persistent QuickSurf object for faster trajectory animation. The QuickSurf object contains a CUDAQuickSurf object, which in turn contains a CUDAMarchingCubes object, so the use of a persistent object to service all QuickSurf rep generation operations gives a significant performance boost, particularly for rendering smaller structures where the repeated setup/init/calc/teardown times would otherwise contain significant overhead.
- Added a persistent QuickSurf object to the DrawMolItem class, so it can be reused repeatedly.
- More rewriting of the QuickSurf class to allow the object to be reused repeatedly.
- Added code to ensure that when we reuse an existing QuickSurf object, the various transient memory buffers used by the host CPU are cleared and/or reallocated as needed.
- Made the CUDA marching cubes memory allocation estimation routine static so it can be called without having an existing MC object. Added a max gridsize query to allow the caller to check if an existing MC object can process a grid or if it has to be destroyed and recreated.
- Cranked version
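The persistent QuickSurf object and the max gridsize query described in this release follow a common reuse pattern: keep an expensive worker alive between frames, and rebuild it only when a new request exceeds what it was sized for. The sketch below illustrates that pattern in isolation. The class and method names are hypothetical stand-ins, not the VMD C++ API.

```python
class MarchingCubesStub:
    """Stand-in for a worker that is expensive to create (hypothetical)."""

    instances_created = 0

    def __init__(self, max_gridsize):
        self.max_gridsize = tuple(max_gridsize)
        MarchingCubesStub.instances_created += 1

    def can_process(self, gridsize):
        """True if every grid dimension fits within the allocated maximum."""
        return all(g <= m for g, m in zip(gridsize, self.max_gridsize))


class SurfaceWorker:
    """Owns one persistent worker and reuses it across calls when possible."""

    def __init__(self):
        self._mc = None

    def compute(self, gridsize):
        # Rebuild only when there is no worker yet or the grid is too large.
        if self._mc is None or not self._mc.can_process(gridsize):
            self._mc = MarchingCubesStub(gridsize)
        return self._mc
```

In this sketch, calling `compute((64, 64, 64))` and then `compute((32, 32, 32))` on the same `SurfaceWorker` returns the same stub object, while a later `compute((128, 64, 64))` forces a rebuild.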
- VMD 1.9.1a17 (December 3, 2011)
- Improved behaviour of the QuickSurf resolution slider for coarse resolutions.
- Added header file for refactored version of CUDA QuickSurf code.
- More updates to Visual Studio 2005 project files
- namdenergy: Cranked version of the NAMDenergy plugin due to recent updates.
- namdenergy: Fixed the behavior of the "-updatesel" flag, by passing the $currentMol parameter to the call to "animate write".
- namdenergy: Applied Chris Mayne's improvements to namdenergy that improve the efficacy of the silence option. Cleaned up tab chars and various other cruft while doing a quick audit of the new version.
- Continued refactoring the QuickSurf and CUDAQuickSurf classes to allow the objects to be reused multiple times, thereby eliminating various sources of overhead that can be amortized across multiple surface calculations, particularly in the case of the CUDA implementation where there's a relatively high cost associated with repeated memory allocation/deallocation operations.
- networkview: added a link to the tutorial that describes usage of the NetworkView plugin
- Started refactoring the QuickSurf class to allow a QuickSurf object to be reused multiple times, thereby eliminating various sources of overhead that could be amortized across multiple calculations, particularly in the case of the CUDA implementation where there's a relatively high cost associated with repeated memory allocation/deallocation operations.
- multiplot: Further improvements to multiplot "-callback" implementation, to associate different callback functions with different datasets in the same multiplot, which should avoid unintended triggering of callbacks and improve the flexibility and behavior of the feature in general.
- nmwiz: Further NMWiz tweak for coloring.
- nmwiz: Updated to latest version of Ahmet Bakan's NMWiz plugin, which improves the usage of exectool to automatically find python and other packages, and updates it to use the new multiplot "-callback" flag rather than the "-nmwiz" flag we used in the first implementation.
- multiplot: Changed the multiplot plugin to use "-callback" rather than "-nmwiz" for user-defined callback procs to be triggered upon picking displayed data points.
- Updated the QuickSurf graphical representation GUI callback functions to implement the same type of half-resolution-while-dragging behavior that has previously been implemented for the Isosurface and Orbital representations.
- Cranked version
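The multiplot entries in this release describe associating different callback functions with different datasets in the same plot, so that picking a point only triggers the callback registered for that dataset. A minimal sketch of that idea is shown below. It is a generic illustration with hypothetical names, and it does not reproduce the multiplot Tcl interface or the exact arguments that "-callback" procs receive.

```python
class Plot:
    """Holds several datasets, each with its own optional pick callback."""

    def __init__(self):
        self._datasets = []  # list of (points, callback) pairs

    def add(self, points, callback=None):
        """Register a dataset and return its index."""
        self._datasets.append((list(points), callback))
        return len(self._datasets) - 1

    def pick(self, dataset_index, point_index):
        """Invoke only the callback belonging to the picked dataset."""
        points, callback = self._datasets[dataset_index]
        if callback is None:
            return None
        x, y = points[point_index]
        return callback(x, y)


plot = Plot()
modes = plot.add([(1, 0.5), (2, 0.8)], callback=lambda x, y: ("mode", x))
plain = plot.add([(1, 3.0), (2, 4.0)])  # no callback: picks are ignored
print(plot.pick(modes, 1), plot.pick(plain, 0))
```

Keeping the callback next to its dataset, rather than in one plot-wide slot, is what avoids unintended triggering when several datasets share a plot.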
- VMD 1.9.1a16 (November 30, 2011)
- Added running implementation of per-residue-bead generation logic for QuickSurf rep.
- Added more code to support residue-based beads for generation of coarse-grained QuickSurf reps of very large structures.
- Eliminated boilerplate QuickSurf rep callback code
- Added some prototype residue bead generation loops for QuickSurf, but needs much more to make it fully functional.
- Use a larger default marching cubes vertex buffer, allocating space for one triangle per voxel by default.
- Eliminated chatty info messages from CUDA QuickSurf, and added an error/warn message when the maximum number of vertices has been hit by the marching cubes code.
- Updated CUDAMarchingCubes::MemUsageMC() to better account for the memory used by the 3-D texture map and by internal allocations by thrust.
- Updated the CUDA QuickSurf implementation to call CUDAMarchingCubes::MemUsageMC() to compute the memory allocation footprint of the marching cubes phases of the algorithm. Changed the error handling strategy so that we check for CUDA errors right before mass-deallocation of working buffers so that resources are returned even if unrecoverable errors happened during the surface calculation. We now return errors to the caller only after all of the CUDA device memory deallocations have been completed. Removed a few bits of debugging code from earlier implementations.
- Updated the CUDAMarchingCubes::MemUsageMC() method to account for memory required by thrust scan calls, and copies resulting from creation of a CUDA 3-D texture
- Commented out unused scan calls to save compile time for their associated tree of templates.
- Redesigned the CUDAMarchingCubes class to better support the multi-pass QuickSurf algorithm, greatly reducing the number of memory allocation/deallocation operations performed for each pass. This will also allow the CUDA QuickSurf code to be further refactored so that a persistent CUDA QuickSurf object can be instantiated and re-used over and over for fast trajectory rendering. The contained CUDAMarchingCubes object setup/teardown will be amortized as well since it will exist for as long as the CUDA QuickSurf object does.
- Eliminated use of the "uint" type in the CUDA QuickSurf and marching cubes codes since it is not very portable. We still use the uint2/uint3 types. Since they are defined in the CUDA headers we can expect them to be available everywhere we'd try to compile our code.
- Misc cleanup of CUDA marching cubes code.
- Cranked version
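The revised CUDA QuickSurf error handling described in this release checks for errors before the working buffers are released, but reports them to the caller only after every buffer has been freed. The sketch below shows that ordering with plain status codes. The step functions and the buffer dictionary are hypothetical stand-ins for kernel launches and device allocations.

```python
def compute_surface(steps, buffers):
    """Run steps in order, then free all buffers, then report the status.

    Each step returns 0 on success or a nonzero error code. A failure
    stops further work but never skips the cleanup phase.
    """
    status = 0
    for step in steps:
        status = step()
        if status != 0:
            break  # stop computing, but fall through to cleanup

    # Mass deallocation: runs whether or not an error occurred.
    for name in list(buffers):
        buffers.pop(name)  # stand-in for freeing one device buffer

    # Only now is the error, if any, handed back to the caller.
    return status
```

Deferring the return until after the cleanup loop means a failed calculation cannot leave allocations behind for the next one to trip over.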
- VMD 1.9.1a15 (November 21, 2011)
- Rewrote CUDA QuickSurf implementation to use the new CUDA Marching Cubes isosurface extraction code rather than using the host code. Much cleanup remains, but the code is pretty stable now.
- Added the new CUDA Marching Cubes implementation to the build.
- Changed subvolume indexing arithmetic to operate in terms of input map dimensions rather than voxels.
- Added code to the CUDA marching cubes implementation to allow isosurface extraction for a caller-specified sub-volume. This is useful in general, but is needed in particular for use by the QuickSurf rep. When the QuickSurf density map is too large to fit in GPU memory, we must compute it in multiple passes by computing the density map and the associated isosurface for sub-regions of the entire volume, looping until we have processed the entire volume. In order to get correct surface normals at the two boundary planes, we must have two planes of overlap with the previous density map chunk, which requires that we extract a density map region that extends from the second plane to the next to last plane.
- Added code to the CUDA marching cubes algorithm to check the compute capability of the target device and launch either a 2-D or 3-D CUDA grid depending on the device. Merged the 2-D and 3-D versions of classifyVoxel() using a C++ template trick that evaluates at compile-time, thereby optimizing out the code that tests whether to use 2-D or 3-D thread indexing logic.
- Added open source CUDA marching cubes implementation written by Michael Krone and John Stone.
- Use zero-cleared memory for various strings to reduce spurious uninitialized value complaints from valgrind when compiling with the Intel C++ compiler with its SSE optimized string functions.
- Moved temporary string buffer into Inform class, to make it easier to eliminate spurious valgrind uninitialized read warnings when compiling with the latest Intel C++ compiler. The Intel compiler doesn't initialize stack allocated array variables to zero, so valgrind goes berserk finding various uninitialized reads when SSE optimizations are enabled, since they touch data past the end of strings.
- Tweak to improve stringtoupper()
- Updated the marching cubes lookup tables with a new numVertsTable used by the GPU marching cubes implementation. The CPU code needs edgeTable and triTable. The GPU code needs triTable and numVertsTable. The GPU code has no use for the edgeTable because it computes all 12 edge vertices regardless of whether they will be used, due to SIMD branch divergence.
- added NetworkView to the analysis extension menu
- networkview: added networkview plugin from the Luthey-Schulten group
- Cranked version
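The multi-pass sub-volume entry in this release describes processing a density map in chunks that overlap the previous chunk by two planes. The sketch below only computes such overlapping plane ranges along one axis. The function name, the half-open range convention, and the way the overlap is applied are assumptions made for illustration; the exact indexing used inside VMD is not given in the changelog.

```python
def chunk_ranges(n_planes, max_planes, overlap=2):
    """Yield half-open (start, stop) plane ranges covering [0, n_planes).

    Each range after the first begins `overlap` planes before the end of
    the previous one, so neighboring chunks share those planes.
    """
    if max_planes <= overlap:
        raise ValueError("max_planes must exceed the overlap")
    start = 0
    while True:
        stop = min(start + max_planes, n_planes)
        yield start, stop
        if stop == n_planes:
            break
        start = stop - overlap


# Ten planes, at most four per pass.
print(list(chunk_ranges(10, 4)))
```

The shared planes give each pass the neighboring density values it needs to estimate normals at a chunk boundary.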
- VMD 1.9.1a14 (November 11, 2011)
- rmsdtt: Updated RMSD Trajectory Tool GUI to report 4 digits for molecule IDs.
- rmsdtt: Updated to version 3.0 of Luis Gracia's RMSDTT plugin.
- changed default touchpad event to 'non touch' and let the fct change it as needed. changed reset view button #.
- Renamed FileRenderer class trimesh() virtual method to trimesh_c4n3v3() in preparation for adding new triangle mesh primitives using CUDA- and OpenCL-friendly interleaved vertex array packing formats.
- multiseq: localized a global variable (w) that was needed for printing error messages. Should fix bug reported by Patricio Oyarzun.
- multiseq: added waitfor's on structure loads, fixed a typo in a MAFFT name that was causing an erroneous error message. Still an underlying bug, though, that is being investigated.
- Renamed DTRIMESH display token to DTRIMESH_C4F_N3F_V3F to clearly indicate the particular interleaved vertex format it uses. We are about to add one or more new triangle mesh primitives that are highly optimized for the specific cases needed for the new QuickSurf representation. The existing DTRIMESH_C4F_N3F_V3F primitive uses one of the built-in OpenGL interleaved vertex array packing formats, but this interleaved vertex format is inefficient for use within CUDA or OpenCL kernels, so we will need to create one or two new triangle mesh formats that are specifically optimized for use by CUDA and OpenCL kernels.
- Further simplification of the CUDA QuickSurf loops in prep for integration of the marching cubes calls.
- Wrote a greatly streamlined and simplified variation of the density chunk processing loop so that when we pull in the CUDA-accelerated marching cubes code the code doesn't get unmanageably complicated.
- Added alternative loop for faster trimesh display command data packing
- Added timer for display command processing since this has been a source of significant overhead for the QuickSurf rep in initial testing...
- Added in a nickname that is being sent to identify each device
- Tweaks to get heartbeat working better, and added in functionality for a button that will reset the scene view
- Cranked version
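The DTRIMESH_C4F_N3F_V3F entry in this release names an interleaved vertex layout: four color floats, three normal floats, and three position floats per vertex, stored back to back. The sketch below packs data into that layout to make the memory arrangement concrete. The function name is hypothetical, and little-endian 32-bit floats are assumed.

```python
import struct

FLOATS_PER_VERTEX = 4 + 3 + 3  # RGBA color, normal, position


def pack_c4f_n3f_v3f(colors, normals, vertices):
    """Interleave per-vertex color, normal, and position into one buffer."""
    if not (len(colors) == len(normals) == len(vertices)):
        raise ValueError("all attribute lists must have the same length")
    out = bytearray()
    for c, n, v in zip(colors, normals, vertices):
        # One vertex = 10 consecutive floats: c0 c1 c2 c3 n0 n1 n2 v0 v1 v2
        out += struct.pack("<10f", *c, *n, *v)
    return bytes(out)


buf = pack_c4f_n3f_v3f(
    colors=[(1.0, 0.0, 0.0, 1.0)],
    normals=[(0.0, 0.0, 1.0)],
    vertices=[(0.5, 0.25, 0.0)],
)
print(len(buf))  # bytes used by one vertex
```

Each vertex occupies 40 bytes in this layout, which shows the per-vertex cost of carrying a full RGBA color even when a whole mesh shares one color.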
- VMD 1.9.1a13 (November 4, 2011)
- Refactored the QuickSurf class to accommodate the new CUDA-accelerated marching cubes implementation. Since we want to keep the density map data on the GPU throughout the entire calculation, it is much simpler to move the marching cubes isosurface extraction steps into the QuickSurf class rather than having this done by DrawMolItem::draw_quicksurf(). The QuickSurf class now produces a vertex array as its final product, so the density and texture maps are now hidden internal objects that the caller never sees. This greatly simplifies the modifications necessary to allow the CUDA implementation to keep all of the data on the GPU throughout the computation.
- Absorbed the body of DrawMolItem::draw_volume_isosurface_trimesh() into DrawMolItem::draw_quicksurf() to enable us to use the new GPU-based marching cubes implementation rather than the host code, so that we can eliminate copying the volumetric data from the GPU back to the host. Soon only the resulting triangle mesh will need to be transferred from the GPU back to the host since all of the other algorithm steps take place entirely on the GPU.
- Made CPU marching cubes triangle facet table static to prevent linkage conflicts with CUDA marching cubes triangle tables which contain the same information, but encoded differently.
- Replaced CUDA QuickSurf calls to __expf() with pre-scaled coefficients and use of exp2f(), for a nice little 6% speed gain.
- Updated to version 7 of the phone/tablet network API. Not currently doing anything useful with the heartbeat or the button events, but the code is, at least, working with the current version of the device app.
- Updated NMWiz docs with latest version.
- vmdmovie: misc cleanup of movie duration callback handler
- Disable safety check that throttles Linux-based ATI/AMD graphics to avoid problems with the drivers. Current ATI/AMD graphics drivers are greatly improved, so it should no longer be necessary to disable the advanced OpenGL APIs on these drivers.
- Updated NMWiz plugin with latest version, which completely eliminates the duplication of the multiplot functionality, using the new callback mechanism instead of maintaining a separate instance of the multiplot code.
- Added initial NMWiz docs
- Applied an initial multiplot modification to allow Ahmet Bakan's NMWiz plugin to receive callbacks when multiplot points are selected. The initial implementation isn't as general as it should be yet, but it should eliminate the need for NMWiz to contain its own modified version of multiplot, and with a small amount of work we can make this fully general.
- Added Ahmet Bakan's "NMWiz" normal mode wizard plugin to the builds.
- Updated to version 5 of the propka GUI
- Updated Win32 compilation flags to fully enable the new QuickSurf rep.
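
The exp2f() entry in the 1.9.1a13 list above swaps a natural-base exponential for a base-2 one with a pre-scaled coefficient. A minimal host-side sketch of that idea follows. It assumes a gaussian of the form exp(-r^2 / (2 sigma^2)), which the log does not spell out, and every function name here is hypothetical rather than taken from the VMD sources.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: fold the negation, the 1/(2*sigma^2) factor, and the
// change of base (log2(e)) into one per-atom coefficient, computed once
// outside the inner loop.
inline float gaussian_coeff_base2(float sigma) {
  assert(sigma > 0.0f);
  const float log2e = 1.4426950408889634f; // log2(e)
  return -log2e / (2.0f * sigma * sigma);
}

// Inner-loop form: one multiply plus a base-2 exponential. Mathematically
// equal to exp(-r2 / (2*sigma*sigma)); results differ only by rounding.
inline float gaussian_density_base2(float r2, float coeff) {
  return std::exp2(coeff * r2);
}

// Reference form with the natural-base exponential, for comparison.
inline float gaussian_density_ref(float r2, float sigma) {
  return std::exp(-r2 / (2.0f * sigma * sigma));
}
```

The coefficient is computed once per atom, so the per-lattice-point cost drops to a multiply and a base-2 exponential.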
- VMD 1.9.1a12 (October 24, 2011)
- Added QuickSurf safety checks for cases with more than 16 million atoms, so we use a larger thread block size to prevent exceeding the 65535-block grid dimension limits in CUDA 4.x. Added safety code to limit the total number of thread blocks in a single kernel launch to prevent us from running longer than the display GPU kernel timeout limit on Linux and Windows.
- Fix small array indexing buglet in the new QuickSurf atom packing code.
- Reorganized all of the QuickSurf implementations to use packed 4-component vectors for coordinates and radii (xyzr) and colors, since this is the data format that is natively used by the fast GPU-accelerated CUDA code path. By making both the host and GPU code paths use the same data format, redundant packing code is eliminated, and we eliminate per-frame overhead by spending more effort on a single highly-optimized atom selection traversal and data packing loop. The GPU code path should eventually perform all of the per-atom preprocessing for itself by launching a custom kernel, but that isn't implemented yet.
- Allow a larger maximum QuickSurf isovalue for making highly smoothed reps
- Ensure that the QuickSurf acceleration grid spacing is never finer than the density grid spacing.
- Reduced selection/atom/radius/color preprocessing overhead for the fast GPU-accelerated QuickSurf code path.
- Significantly reduced overhead for array packing by pre-sizing the destination ResizeArray buffers.
- Misc cleanup and optimization of the QuickSurf atom preprocessing loops since they are going to end up being very important for interactive rendering performance for the fully GPU-accelerated code path.
- Prevent timers from being leaked when we have to early-exit during various error handling scenarios
- The top level QuickSurf code now passes in the maxrad parameter rather than requiring the CUDA implementation to compute this redundantly for itself
- Eliminated independent gaussian radius parameter from the density kernels since it should always match the acceleration grid spacing parameter.
- Simplified the CUDA QuickSurf code that handles computation of grid sizes and looping over kernel launches depending on whether the target GPU is SM 1.x or SM 2.x.
- Updated the CUDA QuickSurf implementation with basic support for older GT200 (SM 1.3) GPU hardware, by looping over planar grids of thread blocks. Performance is fairly good since GT200 devices have very close to the same number of special function units (which run the __expf() calls) as are found on the current Fermi GPU hardware. The current implementation will reject anything below compute capability 1.3. We depend on hardware broadcasts of global memory reads to all threads accessing the same data element in the innermost loop over atoms. Devices with SM 1.3 and 2.x all support hardware broadcasts for the access pattern we use.
- Reorder CUDA QuickSurf memory deallocation and chunk size and thickness threshold tests so that no leaks can occur if the minimum slab thickness is reached.
- Rearchitected the CUDA QuickSurf top level driver code to more gracefully handle memory allocation failures. The new code loops attempting to allocate required memory buffers, halving the size of the GPU-resident working buffers in each pass until we find a subvolume size that fits on the GPU, or we reach a lower bound and trigger CPU fallback instead. The new code also protects against memory allocation exceptions inside calls to Thrust, by creating a temporary padding allocation which reserves some memory for use by the Thrust sorting method. Once we have succeeded in allocating all of the required GPU working buffers, the volumetric density map and texture map are computed progressively in a loop over "slabs" that fit within the GPU memory buffers, copying results back to the host, until the entire volume has been computed. This implementation is also the first step toward support for the older GT200 hardware, which is incapable of launching a 3-D grid of thread blocks and must instead process a single 2-D plane of thread blocks per kernel launch. Once the GPU marching cubes implementation is incorporated, we must account for the GPU memory required by the subsequent marching cubes processing and the final array of triangles.
- Updated win32 builds with QuickSurf code.
- Compute the CUDA QuickSurf atom bin acceleration grid dimensions and spacing based on the gaussian truncation radius set by the user's graphical representation parameters.
- Merged in various tweaks from the experimental version of the QuickSurf code. Added improved error handling behavior when the GPU runs out of memory and similar scenarios.
- Reorganized parameters to the top level GPU QuickSurf routine. Reordered computation blocks in the GPU density kernels to encourage better register allocation by the compiler.
- Replace independent reads of uint2 components of the cell start/end info with a single uint2 read.
- Merged QuickSurf cellStart and cellEnd arrays into a single array of uint2 type, to reduce the number of allocation/deallocation calls we have to make per frame, and to potentially save a few GPU registers used for 64-bit pointers on Fermi.
- Output less info to the console when running on the GPU
- Increased GPU timer details for benchmarking purposes.
- Added runtime checks to cause CPU fallback for GPU hardware with compute capability less than 2.0. Added additional compile-time checks to ensure that the new CUDA QuickSurf kernels are compiled against CUDA 4.0 or later.
- cranked version
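
The allocation strategy described in the 1.9.1a12 list above (halve the GPU-resident working set until it fits, otherwise fall back to the CPU) can be sketched on the host as follows. This is only an illustration under stated assumptions: the try_alloc callback is a hypothetical stand-in for the real GPU buffer allocations, and the slab size is measured in grid planes purely for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Outcome of the sizing loop: the slab size that fit, or a CPU fallback.
struct SlabPlan {
  std::size_t slab_planes; // planes processed per pass (0 on fallback)
  bool use_cpu_fallback;
};

// try_alloc is a hypothetical callback; it returns true when working buffers
// for `planes` planes of the volume fit in GPU memory.
inline SlabPlan plan_slabs(std::size_t total_planes, std::size_t min_planes,
                           const std::function<bool(std::size_t)> &try_alloc) {
  assert(min_planes > 0);
  std::size_t planes = total_planes;
  while (planes >= min_planes) {
    if (try_alloc(planes)) return {planes, false};
    planes /= 2; // halve the GPU-resident working set and retry
  }
  return {0, true}; // lower bound reached: let the caller use the CPU path
}

// Number of slab passes needed to cover the whole volume.
inline std::size_t slab_passes(std::size_t total_planes, std::size_t slab_planes) {
  assert(slab_planes > 0);
  return (total_planes + slab_planes - 1) / slab_planes;
}
```

With a 512-plane volume and an allocator that only accepts 100 planes or fewer, the loop settles on 64-plane slabs and the volume is covered in 8 passes.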
- VMD 1.9.1a11 (October 17, 2011)
- Corrected pick point tag index calculations for sparse atom selections with the new optimized atom selection implementation
- Implemented further unrolling of the textured gaussian density kernel, thereby increasing performance by roughly 30% (on a GeForce 560M).
- Separated loop unrolling macros for textured and non-textured gaussian kernels
- Unrolled the linear-time CUDA density kernels so each thread produces two lattice points rather than one, and played some tricks with shared memory to accommodate the resulting increase in register use.
- Enable CUDA QuickSurf density calculation by default for the time being.
- Wrote initial CUDA versions of the linear-time gaussian density calculation kernels. The current code only computes one density/color per thread, and uses an overly large atom bin size, but it already outperforms the fastest multithreaded CPU code by a factor of 3 to 10 depending on the test case.
- Added an implementation of direct unbuffered disk I/O for Win32, enabling peak I/O rates for SSDs and fast RAID arrays.
- Fixed QuickSurf thread tile processing loop bounds
- Changed representation order so QuickSurf is listed before MSMS
- Improved naming and default values of QuickSurf controls
- Increased QuickSurf grid padding so we don't clip surfaces for small isovalues
- Added isovalue control for QuickSurf rep based on Johan's feedback
- Reduce QuickSurf console output since the code is now multithreaded...
- Added code to the QuickSurf rep to limit CPU core counts so we don't use more than 2GB of memory for grid generation.
- Limit QuickSurf rep to 8 CPU cores so we don't run out of memory and to prevent hyperthreading from negatively impacting performance.
- Added multithreading to the CPU QuickSurf implementation.
- gamessplugin: Enabled reading of PCGAMESS/FIREFLY output files with a warning. Added code to have_gamess to recognise Firefly output and version number. Modified get_scfdata to deal with dos-formatted \LF\CR Firefly output files.
- Optimized the gaussian density computation loops even further by incorporating the shift to base 2 and negation of the exponential parameter into the invrad precomputations. Began reorganizing the code for the multithreaded CPU implementation.
- viewchangerender: minor bug fix
- Added isovalue parameter to QuickSurf CUDA kernel call.
- Added infrastructure to allow user-determined isovalue controls useful for making smoother surfaces.
- Rewrote the CUDA direct gaussian summation loops to allow processing of multiple planes per grid, specifically targeting the 3-D grids feature of Fermi and CUDA 4.x. Simplified the innermost loops somewhat and improved multi-GPU performance.
- Added prototype CUDA direct gaussian summation kernels for both density-only and density with volumetric texturing calculations for the QuickSurf representation. Reformulated the host code and CPU algorithm to use atom coordinates relative to the grid origin for improved efficiency and reduced GPU register consumption. The prototype CUDA kernels are simple quadratic time implementations currently, but they are useful both for development of the innermost loop of the linear-time approach that will be implemented next, and to determine peak throughput for the arithmetic heavy parts of the algorithm.
- Improved the QuickSurf bounding box computation heuristic significantly
- cranked version
- VMD 1.9.1a10 (October 7, 2011)
- Enable point sprite shader by default, for beta testing
- Fix uninitialized variable in file loader GUI update handler, needs further testing
- Use unit scaling for atom radii by default in QuickSurf rep
- Enabled color texturing of QuickSurf reps
- Completed density-based coloring routine for QuickSurf rep
- Misc corrections to arithmetic in isosurface per-vertex texturing routines
- Implemented interpolation for per-vertex coloring of isosurfaces from volumetric RGB textures.
- Added new prototype isosurface vertex texturing APIs
- Added prototype of density-based coloring method for the QuickSurf rep.
- Started rearranging density computation in preparation for adding color calculations based on fractional density values.
- Simplify conditional compilation tests
- Don't draw QuickSurf bounding box anymore, since we've got it debugged now.
- Added a new GUI control to set the quality of the QuickSurf density map calculation, by selecting the default gaussian window size over a range from 2r for the lowest quality mode, up to 4r for the maximum quality mode. This prevents visible discontinuities from appearing when using a very fine grid spacing with very large radius multipliers
- Corrected QuickSurf calculation of gaussian window sizes with varying grid spacing
- Added post-adjustment correction of QuickSurf grid dimensions, so that the final grid side lengths are correct despite any required padding/rounding/etc that has to be done to accommodate CUDA, SSE, etc.
- Added index-based variation of the gradient lookup macro
- Pulled outer grid loop into the special-case isosurface routine to reduce overhead a bit.
- Added special case handling for axis-aligned grids with unit vector cell directions, since this simplifies the normal computations considerably.
- misc cleanup of isosurface extraction code before doing some specialization to give higher performance in some common cases
- Improved QuickSurf CPU performance further with two optimizations. Added new test to see if cutoff radius is exceeded in the Y-Z plane, to allow early-exit before entering the innermost loop. Removed the zero clamping behavior of the exponential approximation since we should never encounter this case due to the range-limited per-atom loops in the density map calculation code.
- Use half-Angstrom spacing as the lower bound for the QuickSurf rep for now.
- Use fully-inlined exponential approximation rather than calling a function
- further optimization of density map loop in QuickSurf rep
- Use fast expf() approximation for QuickSurf density map calculation
- changed timers and memory allocation/deallocation approach for QuickSurf
- further tuning of basic CPU gaussian surface generator loops
- Change default grid spacing for QuickSurf to 1.0A so 1M atom structures display in about a second by default. Once various optimizations are in place, we may be able to use a finer resolution 0.5A grid by default.
- Added first draft of CPU-based version of fast gaussian surface rep
- autopsf: Fix ifdefs so they work with Visual Studio 2010 as well as older revs
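
The Y-Z plane early-exit test mentioned in the 1.9.1a10 list above can be illustrated with a small CPU loop. This is a sketch under assumptions, not the QuickSurf code: the gaussian form, the grid layout (x fastest), and all names are hypothetical, and it walks the whole grid for clarity where the log describes range-limited per-atom loops.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Accumulate one atom's gaussian into a dense grid, skipping entire x rows
// whose Y-Z distance from the atom already exceeds the cutoff radius.
// Returns the number of rows skipped by the early-exit test.
inline int add_atom_density(std::vector<float> &grid, int nx, int ny, int nz,
                            float spacing, float ax, float ay, float az,
                            float sigma, float cutoff) {
  assert(grid.size() == static_cast<std::size_t>(nx) * ny * nz);
  const float cutoff2 = cutoff * cutoff;
  const float coeff = -1.0f / (2.0f * sigma * sigma);
  int rows_skipped = 0;
  for (int z = 0; z < nz; z++) {
    const float dz = z * spacing - az;
    for (int y = 0; y < ny; y++) {
      const float dy = y * spacing - ay;
      const float dyz2 = dy * dy + dz * dz;
      if (dyz2 >= cutoff2) { // no point in this row can be within the cutoff
        rows_skipped++;
        continue;
      }
      float *row = &grid[(static_cast<std::size_t>(z) * ny + y) * nx];
      for (int x = 0; x < nx; x++) {
        const float dx = x * spacing - ax;
        const float r2 = dx * dx + dyz2;
        if (r2 < cutoff2) row[x] += std::exp(coeff * r2);
      }
    }
  }
  return rows_skipped;
}
```

The test costs one compare per row and avoids the innermost loop entirely for rows that cannot contribute.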
- VMD 1.9.1a9 (October 4, 2011)
- FreeVR: Fix default FreeVR wand position offset and coordinate scaling
- Query Euler angles from FreeVR wand and use them for pointer orientation
- propka: Added new propKa GUI plugin developed by Michal Rostkowski et al., in Jan Jensen's group.
- Added phone/tablet control classes to Windows build.
- offplugin: Added OFF reader plugin contributed by Francois-Xavier Coudert.
- MacOS X 32-bit/64-bit plugin compilation flag fixup to ensure correct binaries are compiled regardless of ABI of the host machine.
- readcharmmtop: fixed issue with comments in topology file
- multiplot: changed the behaviour of mouse-over-points in order to report exact coordinates in the original data points, rather than "free-hand" ones derived by the current pointer position.
- readcharmmpar: fixed annoying bug when there are no remarks in the parameter file
- Changed the behavior of VMDApp::molecule_new() so that it can optionally skip generation of callback events, for the specific case where we call molecule_new() immediately prior to filling the molecule with structure data we have loaded. This eliminates a redundant MOL_NEW command from propagating to the GUI code where it can cause quadratic runtime behavior when loading tens of thousands of molecules, by making the GUI completely regenerate the molecule choosers at each file load. This will need extensive testing to ensure that the callback sequence is still correct with this optimization in place.
- Enabled the first half of the GUI optimizations for loading huge numbers of molecules at a time.
- Added prototype code to optimize regeneration of VMD molecule choosers
- Replaced old SaveTrajectoryFltkMenu GUI code that was looping over molecules to find the MoleculeList index containing the requested molecule ID. This loop was causing quadratic runtime while loading tens of thousands of molecules with the standard windowed display device mode (i.e. not -dispdev text). The new code calls VMDApp::molecule_index_from_id() which uses the hash table built into MoleculeList to avoid poor scaling.
- Added new VMDApp::molecule_index_from_id() method to eliminate another place where old GUI code was looping over molecules to find the index matching an ID, rather than using the MoleculeList hash table acceleration structure. The new method is easy for the GUI code to use, and makes direct use of the MoleculeList hash table.
- Changed the behavior of SaveTrajectoryMenu::update_molchooser() to avoid regenerating molecule list on rep add/del/rename events that don't actually change the status of any of the molecules.
- The "mol new" and "mol new atoms" commands now use the more efficient molecule auto-naming functionality built into VMDApp::molecule_new() which avoids excessive GUI chooser regeneration.
- Improved behavior of "mol new atoms" with respect to GUI update callbacks, by performing the molecule rename during construction rather than as a separate step. This allows a revised fill_fltk_molchooser() implementation to avoid quadratic time complexity for the case where a large number of molecules are added.
- Eliminated dynamic memory allocation from the molecule menu generation code, and added code to truncate molecule names to at most 25 characters as shown in the GUI menus.
- Started rewrite of fill_fltk_molchooser(), beginning by absorbing the caller's clear() calls and generation of any extra menu labels. By moving the logic for these into the routine, we can subsequently rewrite the regeneration code to optimize the "add one molecule" case and eliminate the quadratic scaling behavior currently caused by regenerating the entire molecule list every time. The quadratic behavior slows down loading of tens of thousands of molecules (e.g. loading the whole PDB). Although this is an unusual use case, it would be nice to be able to do it with the GUI fully enabled and without any environment variable tricks.
- Fixed the tool menu so that it also updates/regenerates the molecule list chooser when molecules are renamed.
- prep for rewrite of fill_fltk_molchooser()
- Tweaks to stencil-based stereo modes
- Added in the benchmarking environment variable hack for measuring VMD trajectory I/O rates without interference from the viewing volume calculation, so that one can do the benchmarks more conveniently on large structures without having to first load coordinates before loading a trajectory.
- jsplugin: Updated with latest bandwidth number for SSD RAID test with direct I/O
- irspecgui: correct bug when reading charges from external file. Bug reported by Andreas Kukol. Step plugin version number to 1.2.
- jsplugin: Added GPU benchmarking code to jsplugin for use in benchmarks for the OOC immersive viz paper
- hesstrans: fix Makefile rules for tcl_hesstrans.C
- fix python material change tuple argument format list
- paratool: rename hessiantransform to hesstrans to match the new tcl interface.
- hesstrans: Implemented proper TCL interface in order to get rid of the SWIG wrapper which seems to cause segfaults from time to time.
- misc formatting cleanup while checking on an old bug
- lammpsplugin update part 3: update documentation for the plugin to match the current file format and describe supported features.
- move comment about writing out PBC info to correct location.
- Added a new API and associated script command to set the incoming network port VMD listens on for mobile phone/tablet devices.
- Prevent the animate mode from also scaling or rotating. Fixed another nit with handling failed UDP listener initialization.
- Handle cases where two VMD instances both try to open the same mobile port, or the port is already in use.
- Allow phones and tablets to control trajectory animation using the touchpad
- Added new "mobile" script commands to control whether or not VMD listens for incoming mobile phone/tablet/touchpad device input, and to control sensitivity etc.
- lammpsplugin update part 2: add support for element, mass and diameter field. improve handling of box size output and flag as periodic for box length greater than 0 or else flag as shrinkwrap BC and use min/max coordinates for box. update output format to the current style.
- lammpsplugin: update part 1: optional support for dipolar particles by translating them into two atoms around the center with the distance scaled by the value stored in LAMMPSDIPOLE2ATOMS, which also enables this feature. recognize more field types in preparation for the second part of the update.
- Replaced old smartphone tracker/button subclasses with new WiFi mobile device subclass that links into the VMDApp MobileInterface object.
- Added mobile device tracker/button subclasses based on access to the MobileInterface object in VMDApp.
- Eliminate old Mobile WiFi UDP packet receiver code in favor of calling VMDApp to query orientation data, as is done for the Spaceball.
- Tweak multitouch distance threshold to determine when we have a pinch vs. a drag, reducing cutoff distance for drag to a separation of 0.65 inches or more.
- Added draft VMDApp::mobile_get_tracker_status() implementation.
- Misc cleanup of WiFi mobile input device code. Deleted various testing code, updated comments, and begin preparing for further code refactoring now that the network protocol isn't changing as rapidly.
- Fixed compiler warnings reported by the LLVM/Clang team and updated plugin minor version numbers. These warnings were of the form: warning: format string is not a string literal (potentially insecure) [-Wformat-security] printf(msg);
- Added some mobile event filtering code to prevent visible jitter from tiny changes in scaling and Z rotation angle when using "pinch" gestures.
- Updated mobile interface to new protocol, and now making use of dot pitch so we interpret touch events for scaling and translating in absolute distances so we can behave better when the same code runs on both a tablet device and a phone.
- Added mobile device WiFi interface class for smartphones, tablets, and other wireless devices with touchpads, 6DOF sensors, and displays.
- Use float4 types for the built-in CUDA memory bandwidth benchmark
- Updated comments in the JS plugin header to reflect the current state of the design and implementation, along with comments about opportunities for future use of lio_listio() for discontiguous gather operations on multiple timesteps, etc.
- moldenplugin: MOLDEN writes the basis set coefficients using Fortran style notation where the exponential character is 'D' instead of 'E'. Other packages adhere to C-style notation. Unfortunately sscanf() won't recognize Fortran-style numbers. Therefore we have to read the line as string first, convert the numbers by replacing the 'D' and then extract the floats using sscanf().
- moldenplugin: Print warning instead of error when no basis set was read from a QM logfile. Just coordinates are fine.
- moldenplugin: Gracefully handle files containing spherical harmonic functions: We just read the coordinates instead of bailing out.
- paratool: Removed unnecessary output.
- Changed MolFilePlugin::write_structure() to write bonds not only when the BaseMolecule::BONDS flag is set, but also if the user has modified or validated the bond order and/or bondtype info, causing the BaseMolecule::BONDORDERS or BaseMolecule::BONDTYPES flags to get set.
- Implemented a special-case fast-path for rapid trajectory animation for the "points" representation. The fast-path builds the rep logic into a customized DispCmdPointArray::putdata() method that eliminates two memcpy() calls. This optimization only matters for very large structures.
- Simplify the workload for the pick point generation code by taking advantage of firstsel/lastsel.
- Updated molecule browser GUI to avoid truncating atom count info for 100 million atom structures.
- Check return state of Antechamber runs and pop up message if there was an error
- molefacture: Fixed calculation of formal charge for R1-(PO4-)-R2
- paratool: Include VMD version and paratool version into Paratool project files.
- paratool: Removed unused variable.
- paratool: Fixed the atom property indexing bug reported by Brian Bennion.
- cranked version
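
The moldenplugin entry in the 1.9.1a9 list above describes reading a line as a string, replacing the Fortran 'D' exponent marker, and then extracting floats with sscanf(). A minimal sketch of that sequence is shown below. The function names are hypothetical, and the sketch assumes the line contains only numeric fields, so every 'D' it sees is an exponent marker.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Replace Fortran-style exponent markers ('D' or 'd') with 'E' so that the
// C library number parser accepts them. Assumes a purely numeric line.
inline std::string fortran_to_c_exponents(std::string line) {
  for (char &c : line) {
    if (c == 'D' || c == 'd') c = 'E';
  }
  return line;
}

// Parse an "exponent coefficient" pair from one basis set line.
// Returns true only when both values were read.
inline bool parse_basis_pair(const std::string &line, float &expo, float &coef) {
  const std::string fixed = fortran_to_c_exponents(line);
  return std::sscanf(fixed.c_str(), "%f %f", &expo, &coef) == 2;
}
```

A line such as " 0.1307093214D+03  0.1543289673D+00" then parses to roughly 130.709 and 0.154329.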
- VMD 1.9.1a8 (June 16, 2011)
- Eliminated the DDATABLOCK display command used by the old indexed display commands (DSPHERE_I, DPICKPOINT_I, DPICKPOINT_IARRAY).
- Eliminated the old DPICKPOINT_IARRAY display command, replacing it with DPICKPOINT_ARRAY, which copies both the pick point coordinates and indices into the display command, rather than referencing coordinates stored in a prior DDATABLOCK command. Eliminated the old DSPHERE_I display command. Eliminated the old (and currently unused) DPICKPOINT_I display command. The indexed display commands are now responsible for a performance bottleneck when rendering 100 million atom trajectories, because they require the whole coordinate block for a timestep to get copied into a matching DDATABLOCK array. This approach was convenient many years ago when system sizes were small and it was common for representations to show much or all of a structure, but with the huge structures of today, the exact opposite is now the case. To this end, all of the indexed display commands now need to be eliminated and/or replaced by versions that copy in only the coordinates they need. This copy-as-needed approach has much better scaling behavior in typical use cases, and the overhead is tied directly to the atom selections the user makes, so they can easily adjust to achieve the desired performance.
- runante: Fix to make sure bonds are written to antechamber input files
- Fixed bug in trajectory smoothing introduced during firstsel/lastsel updates
- Updated VolMapCreate to use firstsel/lastsel indices to accelerate atom selection processing
- Updated Tcl atom selection commands to use the firstsel and lastsel indices to accelerate selection get/set operations. This gives a huge performance boost when working with large structures in cases where the selected atoms occupy a relatively small, compact range of atom indices.
- runante: Improved handling of antechamber errors for poor structures
- molefacture: Cleaned up debug code
- molefacture: Fixed update_openvalence code for calculating formal charges
- Updated measure commands to use const AtomSel pointers where possible.
- Marked AtomSel::coordinates() and AtomSel::timestep() as const methods. Although the coordinates and timestep data could potentially be changed by the caller, the atom selection object itself is constant. This should make the code more readable and safer.
- Updated MeasureSurface to use new atom selection range indices and did various cleanup work while I was at it.
- First pass of updates for measure commands to take advantage of the atomselect firstsel and lastsel indices to speed up the loops that evaluate selections.
- cranked version
- VMD 1.9.1a7 (June 14, 2011)
- Updated the FreeVR arena allocation code to try and loop over arena sizes if the initial allocation fails.
- vtfplugin: Updated URL for espresso MD package
- return an error if the user executes an unrecognized parallel subcommand
- Fix "parallel allgather" argument parsing for the non-MPI builds so that errors are caught and reported in the 1-node case the same way they are for the MPI case.
- jsplugin: Prevent string tables from being leaked during read_js_structure() call.
- Increased default CAVElib and FreeVR shared memory allocation to 2 GB.
- Duplicated some of the OpenGL renderer member variable initializations normally handled by the OpenGLRenderer superclass due to differences in the order of FreeVR initialization vs. regular windowed OpenGL.
- Misc cleanup of the button superclass constructor. Fixed a problem with initialization of the button array caught by a detailed run of valgrind with a FreeVR build.
- Fixed uninitialized multisample antialiasing flag in the OpenGLRenderer class, found by valgrind.
- cranked version
- VMD 1.9.1a6 (June 9, 2011)
- Updated Linux and MacOS X builds to use CUDA 4.0 by default.
- Updated FreeVR shared memory initialization code for new versions of the API
- Enabled point sprite sphere shader code, but added a runtime environment variable check to make testing easy while preserving historical behavior until GUI code is updated.
- cranked version
- VMD 1.9.1a5 (June 9, 2011)
- Some miscellaneous reorganization to prepare for changing the atom selection loops to use the firstsel/lastsel indices.
- Ensure that FreeVR builds pack the atom coordinate array into shared memory the same way that we do for CAVElib.
- paratool: Fix call to non-existent "hesstrans" proc so it calls "hessiantransform".
- Fixed button indices for recent revs of FreeVR
- Fix atom secondary index arithmetic for new firstsel/lastsel loop logic.
- Updated the trajectory coordinate smoothing loops in DrawMolItem::do_create_cmdlist() to take advantage of the firstsel and lastsel indices to avoid smoothing atom coordinates that aren't included in the selection associated with a representation. The code is in place but currently disabled, because there's one more catch: we have to ensure that the active graphical representation will not reference any atom coordinates other than those required for the selected atoms. In cases where a representation may need access to arbitrary atom coordinates, we have to perform the smoothing loop for all atoms, and not just the selected subset. This will require some further checks when setting the loop bounds.
- Rewrote the DrawMolItem atom selection processing loops to take advantage of the firstsel and lastsel indices to avoid needlessly traversing the whole flag array. This gives a large performance boost in typical usage when working with very large solvated structures with tens of millions of atoms.
- The atom selection class now computes the indices of the first and last selected atoms, in addition to the count of selected atoms, as the last step in AtomSel::change(). With this little bit of additional information, we can greatly improve the average case performance of many of the graphical representation drawing methods and "measure" commands when working with large structures with millions of atoms. With this information, loops that walk over individually selected atoms can avoid traversing the whole flags array and instead walk only the contiguous span that includes the first and last selected atom(s). This avoids thrashing the L1/L2 caches when working with large structures and eliminates a big portion of the linear-time selection loop in most cases. We could go much further and handle small-sized sparse atom selections as a special case, by storing a list of selected atom indices, to be used rather than the traditional search loop, when the list of selected atoms is small (say 1,000 atoms or so). The cost of such an approach is that every part of the code that traverses the atom selections would have to be modified with logic to take advantage of the optional index list acceleration scheme, and this could get to be quite ugly unless we come up with some C++ tricks to abstract the details and make it look more like a generic iterator of one kind or another. The trick is to make such a scheme performance competitive with a plain for loop.
- Don't use the pthreads versions of FreeVR until the Scene subclasses have been made bulletproof against race conditions.
- Updated default vmdmovie assumption for VideoMach installation directory to match the current VideoMach installer default, as of VideoMach 5.8.5.
- topotools: use smarter heuristics to determine charge groups and avoid the gromacs limit of 32 atoms per charge group.
- Updated extensions test code with geometry shader feature test.
- Eliminated fallback ifdefs for various extension macros, since they are in the latest Khronos headers.
- Updated OpenGL extension headers with latest revs from Khronos
- Added conditional compilation and runtime checks for the availability of the GL_ARB_point_sprite extension used by the new point sprite sphere rendering code.
- Added code to support scaling and perspective correction of point sprite based sphere renderings.
- Latest version of parseFEP from Chris Chipot.
- Enabled faster point rendering loops for use in displaying very large systems with many millions of atoms. This is particularly beneficial for the out-of-core trajectory display feature and the new GLSL point rendering mode.
- Added shader code for the billboarded point sprite sphere rendering code.
- Added test version of point-sprite based sphere rendering code to the existing "Points" representation. This implementation is based on a billboard style point sprite implementation that is roughly 10 times faster than the existing GLSL sphere implementation, but does not currently try to generate perspective-correct spheres. This approach doesn't require geometry shaders, tessellation, or other advanced GLSL features, so it should still work on old hardware.
- Increased display list memory allocation alignment to 16 bytes, to support types like long long, and the various 128-bit SSE types.
- Changed VMDDisplayList class to use long ints for block sizes. Now that we're working with structures with hundreds of millions of atoms, the old code could encounter integer overflow when rendering single display commands containing hundreds of millions of atoms, e.g. the point array and sphere array display commands.
- namdplot: Handle memory usage in MB (from NAMD 2.8) or GB correctly.
- Changed representation generation code to early-exit the selection test loop as soon as it has found the expected number of selected atoms. This, combined with matching changes in the pick point code, gives us a factor of 12 or more performance benefit when we are doing trajectory animation of the 116M atom BAR domain system (the old Blue Waters benchmark prototype system).
- Changed pick point generation code to early-exit the selection test loop as soon as it has found the expected number of selected atoms. This, combined with matching changes in the DrawMolItem methods, gives us a factor of 12 or more performance benefit when we are doing trajectory animation of the 116M atom BAR domain system (the old Blue Waters benchmark prototype system).
- cranked version
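The first/last selected atom bookkeeping and the early-exit selection loops described in the entries above can be illustrated with a minimal C++ sketch. The names SelectionSketch, update_span(), and for_each_selected() are hypothetical and are not taken from the VMD source; the sketch only assumes a per-atom flags array like the one the entries mention.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an atom selection; NOT VMD's actual AtomSel class.
struct SelectionSketch {
  std::vector<int> on;  // one flag per atom, nonzero if the atom is selected
  int selected = 0;     // count of selected atoms
  int firstsel = -1;    // index of the first selected atom, -1 if none
  int lastsel = -1;     // index of the last selected atom, -1 if none

  // Recompute the count and the first/last indices after the flags change.
  void update_span() {
    selected = 0;
    firstsel = -1;
    lastsel = -1;
    for (std::size_t i = 0; i < on.size(); i++) {
      if (on[i]) {
        if (firstsel < 0) firstsel = static_cast<int>(i);
        lastsel = static_cast<int>(i);
        selected++;
      }
    }
  }
};

// Visit each selected atom, walking only the span [firstsel, lastsel] and
// stopping as soon as the expected number of selected atoms has been found.
// Returns the number of atoms visited.
template <typename Visit>
int for_each_selected(const SelectionSketch &sel, Visit visit) {
  if (sel.selected == 0) return 0;
  assert(sel.lastsel < static_cast<int>(sel.on.size()));
  int found = 0;
  for (int i = sel.firstsel; i <= sel.lastsel && found < sel.selected; i++) {
    if (sel.on[i]) {
      visit(i);
      found++;
    }
  }
  return found;
}
```

When the span is known, the count-based early exit adds little, since the last selected atom sits at lastsel by construction. It matters for loops that only have the count available, which is the case the two early-exit entries describe.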
- VMD 1.9.1a4 (May 23, 2011)
- jsplugin: Added code to test selective per-timestep reads, for benchmarking of VMD test code that can skip past bulk solvent when loading trajectories, if requested by the user.
- Print frame load rate with more precision for easier benchmarking.
- jsplugin: Updated jsplugin header with latest benchmark results: 2006 MB/sec on the SSD RAID0 setup.
- Enabled block-aligned Timestep coordinate memory allocations so we can use direct I/O in the newest version of jsplugin.
- jsplugin: Enabled direct I/O for Linux builds using O_DIRECT for files that have block-based timestep structure. This raises our best I/O bandwidth from 1203 MB/sec up to 1864 MB/sec for a million atom STMV structure.
- jsplugin: use block padding logic in all cases, even when block size is 1
- jsplugin: Added the necessary code to enable writing block-based trajectory files that support direct I/O techniques.
- Enable the use of block-based direct I/O for trajectory reader plugins. This requires allocating timestep buffers that are padded to an even multiple of the block size, and aligning the starting address of the timestep pointer so it points to a block address boundary. This is necessary because most operating systems require both I/O transaction sizes and the source/target memory buffers to be sized and aligned in this way. The benefit of the direct I/O approach is that the trajectory plugins will completely bypass the host OS kernel filesystem buffer cache, which makes it a zero-copy I/O operation, thereby enabling us to achieve MUCH higher I/O rates than are otherwise possible. A secondary benefit of bypassing the OS kernel buffer cache is that on systems like Linux that are hyper-aggressive in using physical memory to cache filesystem accesses, direct I/O prevents the machine from going into a frenzy paging out other applications to make space for the cached disk blocks. This is a big win for out-of-core visualization scenarios where we may trivially read through tens or hundreds of gigabytes as we watch a trajectory being read from a fast RAID or an SSD. This should enable us to exceed 2 GB/sec I/O rates in VMD for the first time.
- jsplugin: Implemented block-size aware trajectory reading code to allow us to begin using O_DIRECT style kernel buffer cache bypassing I/O ops in jsplugin.
- fastio: Prevent multiple inclusion from creating problems with symbol redefinition
- jsplugin: Began adding block-based timestep I/O code to enable higher performance on fast SSDs and SSD RAIDs
- dcdplugin: Moved the include of fastio.h forward so that the O_DIRECT flags get defined on Linux.
- fastio: Added new file mode constants and code to enable support for direct I/O when reading or writing large trajectory files. This enables carefully written plugins to be able to bypass the host OS kernel buffer cache completely, and can result in a 1.5x to 2x performance boost when reading from fast SSDs and SSD RAIDs.
- fastio: Added a new FIO_DIRECT flag to enable support for O_DIRECT style I/O that bypasses the kernel buffer cache for higher performance on super fast SSD RAIDs and the like.
- jsplugin: Updated comments regarding block-based direct I/O in prep for new code.
- jsplugin: Make latest jsplugin behave correctly in tools like catdcd or the out-of-core VMD prototype, which don't read structure data before pulling in timesteps.
- jsplugin: Added a "skip-past" version of the structure parsing code to enable tools like catdcd and the out-of-core VMD prototype to skip calling read_js_structure() and still get to the correct timestep file offset on the first call to read_js_timestep(). The automatic behavior isn't enabled yet, but the seeking code is now in place.
- jsplugin: More cleanup in preparation for adding support for block-based trajectory reads using direct I/O (bypasses kernel buffer cache), for roughly a factor of two performance gain on SSD RAIDs (2020 MB/sec!).
- jsplugin: Miscellaneous cleanup needed to compile this code within a heavily modified VMD IMD subclass for testing asynchronous I/O approaches for out-of-core rendering and analysis.
- cispeptide: Added citation now that the paper describing this plugin is in press.
- chirality: Added citation now that the paper describing this plugin is in press.
- vmdmovie: Updated vmdmovie version number
- vmdmovie: Fix VideoMach frame counters
- topotools: improve selection strings to not break on unusual atom names.
- topotools: do better checking of .xyz format conformant first line. avoids spurious empty frames and other problems.
- Added pope36 membrane template to the membrane plugin.
- Compacted molfile plugin fast I/O macros in prep for adding new APIs to support direct I/O (bypass kernel fs page cache) on various platforms.
- Check for GL_ARB_sample_shading extension, used for antialiasing of the interior of fragments textured by GLSL shaders, e.g. the VMD sphere shader.
- Added comments about GLSL version requirements that arise with the future use of the GLSL GL_ARB_sample_shading extension.
- parseFEP: Updated parseFEP plugin with latest version from Chris Chipot.
- Updated MacOS X and Windows builds to use CUDA 4.0 RC2.
- cranked version
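The buffer requirements for block-based direct I/O described in the entries above (sizes padded to a multiple of the block size, start addresses aligned to a block boundary) can be sketched in C++ as follows. This is a minimal illustration, not the VMD or jsplugin code: padded_size() and alloc_timestep_coords() are hypothetical names, the block size is assumed to be a power of two, and the buffer is assumed to hold three floats per atom. posix_memalign() is the standard POSIX allocator for aligned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign(), free() (POSIX)

// Round nbytes up to the next multiple of blocksz.
// With blocksz == 1 this is a no-op, so the same path works in all cases.
std::size_t padded_size(std::size_t nbytes, std::size_t blocksz) {
  assert(blocksz > 0);
  return ((nbytes + blocksz - 1) / blocksz) * blocksz;
}

// Allocate a coordinate buffer for natoms atoms (assumed x, y, z floats)
// whose size is padded to a multiple of blocksz and whose start address
// lies on a blocksz boundary. Assumes blocksz is a power of two.
// Returns nullptr on failure; release the buffer with free().
float *alloc_timestep_coords(std::size_t natoms, std::size_t blocksz) {
  std::size_t sz = padded_size(3 * natoms * sizeof(float), blocksz);
  // posix_memalign() requires an alignment that is a power of two and
  // a multiple of sizeof(void *), so tiny block sizes are bumped up.
  std::size_t align = blocksz < sizeof(void *) ? sizeof(void *) : blocksz;
  void *p = nullptr;
  if (posix_memalign(&p, align, sz) != 0) return nullptr;
  return static_cast<float *>(p);
}
```

A reader using O_DIRECT would then issue reads of padded_size(...) bytes into such a buffer, from file offsets that are themselves block-aligned.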
- VMD 1.9.1a3 (April 14, 2011)
- Fix default temporary file directory, file load/save dialog, and window placement for 64-bit MacOS X versions.
- Fix default web browser for 64-bit MacOS X builds.
- Continued cleanup of SDL and FLTKOPENGL flags
- More CUDA 4.0 updates for MacOS X builds, adding NVCC compilation flags that weren't supported in the older 2.x revs of CUDA. Updated handling of the FLTKOPENGL option so it works more cleanly with both Linux and MacOS X builds.
- Did some testing and updating of the FLTK Fl_Gl_Window subclass we use for the OpenGL display window on platforms such as MacOS X. As of FLTK 1.1.10 multisample antialiasing is still unimplemented for Fl_Gl_Window subclasses on MacOS X and Windows.
- molefacture: Fixed a bug to correctly find C* atom types in the OPLS parameter file.
- runante: Fixed a bug that caused antechamber to crash when using Windows with GAFF atomtyping.
- runsqm, runante: Alternative way of checking for env(AMBERHOME) that works with Windows
- Enabled CUDA 4.0RC for 64-bit MacOS X builds.
- topotools: fix bug in 'topo guessatom' reported by xue yang.
- Updated Windows builds of VMD to CUDA 4.0RC.
- cranked version
- VMD 1.9.1a2 (April 4, 2011)
- Updated MacOS X builds of VMD to CUDA 4.0RC, required for 64-bit support.
- Enable Intel C/C++ v12.0 for Linux VMD compiles.
- topotools: updates to inline documentation, some cleanups, stepping version number.
- topotools: bugfix for topo guessatom. guessing element from mass was off. added guess name from type. and made element lookups case insensitive.
- molefacture: Increased menubar widths to display correctly in 64-bit OS X builds. It seems the number of characters + 4 makes it large enough.
- molefacture: Made runsqm and runante silently check for antechamber and sqm instead of interactively.
- Add a checkbutton for the new rotation-based reverse coarse graining
- Updated the ICC compiler flags for the latest Intel C/C++ "XE" compilers version 12.
- Added ability to do additional rotations of groups to optimize reverse coarse graining
- cranked version
- VMD 1.9.1a1 (March 31, 2011)
- Enabled compilation of plugins that use sqlite and libexpat for 64-bit MacOS X builds of VMD.
- Fixed a bug to allow the path to antechamber OPLS_ATOMTYPE.DEF to contain spaces
- mutator: Fix inconsistencies in mutator version numbers, and cranked to next version for recent updates.
- Enabled LIBTACHYON for 64-bit MacOS X builds.
- Enabled Tk and ACTC for 64-bit MacOS X builds
- Enable Tk for 64-bit MacOS X builds.
- When compiling for 64-bit MacOS X, we must initialize Tcl/Tk prior to initializing FLTK. The Cocoa-based FLTK and Tk implementations required for 64-bit MacOS X versions of VMD can conflict with each other due to their subclassing of NSApplication. With a patched FLTK, this problem can be avoided, but in order for that to work, FLTK must be initialized after Tcl/Tk. The new code reverses the order of FLTK and Tk initialization for 64-bit MacOS X builds, but leaves it alone for all other platforms.
- Cranked molefacture version forward to 1.3 since it has been changed since the last release.
- molefacture: Added +4 oxidation state for sulphur
- updated comments regarding fall-back initialization of the scene and display objects.
- Changed text interpreter constructors to make it much easier to reverse the order of FLTK and Tk initialization. This is currently required for 64-bit builds of VMD using Cocoa-based FLTK and Tk toolkits. With a special FLTK patch (hopefully going into the trunk soon), Cocoa-based implementations of FLTK and Tk for 64-bit MacOS X can be made to coexist. One complication with making them cooperate with each other is that Tcl/Tk must be initialized before FLTK, otherwise bad things happen deep inside the Cocoa subclasses that manage the "NSApplication" subclasses. At present VMD likes to wait to initialize Tcl/Tk until after the display device has been created, but this won't be the case for 64-bit MacOS X builds of VMD using Cocoa, so we need to move the display device query out of the Tk initialization code and pass in the GUI state as a parameter, since we'll have to determine this outside of the display device code and in a different order than before.
- Fix FLTK size_range() calls.
- Recognize "(AU)" in units strings emitted by Molcas. Molden itself is extremely permissive and apparently accepts any string containing "AU", so we may want to change the parser to use the same string matching method used by Molden itself.
- mutator: fixed issue with comment in GUI
- Make GeometryMol::calculate_all() iterate over the maximum number of frames of any of the molecules involved in a label, rather than looping over the number of frames in the first molecule (associated with the first atom in the label). This is a must in order to get correct behavior for multi-molecule labels.
- Applied Justin's patch to TkCon to prevent it from sourcing command line args as script files at startup
- Fix a place where the ring finding code was leaking a temporary hash table
- Eliminated a memory leak in the carbohydrate ring finding code
- Cache the current text label offset in FileRenderer base class member variables, so that subclasses can easily get at these values later.
- viewchangerender: Updated the version number of ViewChangeRender since the latest changes go beyond the final 1.9 release.
- molefacture: The spheres in the documentation image are now shown with new (higher) default resolution.
- viewchangerender: Force variables passed to animate goto commands to be integers
- cranked version
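The multi-molecule label fix noted above for GeometryMol::calculate_all() amounts to choosing the loop bound from the molecule with the most frames instead of from the first molecule. A minimal C++ sketch of that choice follows. label_frame_count() is a hypothetical helper, not a VMD function, and the sketch does not cover how a molecule with fewer frames is handled past its last frame, which the entry does not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: given the frame count of each molecule involved in a
// label, return the number of frames the label calculation should loop over.
// The old behavior corresponds to returning frames_per_molecule[0].
int label_frame_count(const std::vector<int> &frames_per_molecule) {
  int maxframes = 0;
  for (int n : frames_per_molecule) {
    assert(n >= 0);  // frame counts are never negative
    maxframes = std::max(maxframes, n);
  }
  return maxframes;
}
```

For a label spanning molecules with 10 and 250 frames, the first-molecule bound would stop after 10 frames, while the maximum covers all 250.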
- VMD 1.9 Final Release (March 14, 2011)
Please email any questions to vmd@ks.uiuc.edu.