Parallel Execution for Large Problems
APBS requires approximately 200 bytes of memory per grid point. Memory usage can be predicted before running a calculation with the psize.py Python script provided with APBS. If your calculation looks likely to require more memory than is currently available on your system, you have a few options:
- APBS calculations can be performed in parallel across multiple processors (ideally with distributed memory, so that each processor contributes its own). This functionality is provided by the mg-para keyword, which is described in more detail in the APBS user guide and below.
- APBS calculations can be broken into a series of smaller, asynchronous runs, each of which requires less memory. This functionality is provided by the mg-para async keyword, which is described in more detail in the APBS user guide and below.
- Submit your calculations through the APBS Opal client as described in the APBS user guide to use external computational resources.
- Submit your calculations through a web interface to use external computational resources as described in the “How do I run my calculation on someone else’s computer?” section.
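As a quick check before choosing among these options, the 200 bytes per grid point figure quoted above can be turned into a rough estimate. The following Python sketch is not the script shipped with APBS; the function names are hypothetical, and the result is only as accurate as the approximate per-point figure.

```python
# Rough memory estimate from the approximate 200 bytes/grid point figure.
# The function names here are illustrative and are not part of APBS.
BYTES_PER_GRID_POINT = 200


def estimate_memory_bytes(dime):
    """Approximate memory in bytes for a grid with dime = (nx, ny, nz) points."""
    nx, ny, nz = dime
    return nx * ny * nz * BYTES_PER_GRID_POINT


def fits_in_memory(dime, available_bytes):
    """True if the estimated requirement does not exceed available_bytes."""
    return estimate_memory_bytes(dime) <= available_bytes


if __name__ == "__main__":
    dime = (97, 97, 97)
    mib = estimate_memory_bytes(dime) / 2**20
    print(f"grid {dime}: roughly {mib:.0f} MiB")
```

In a parallel focusing run, the estimate applies to each processor's own grid rather than to the whole problem.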
Synchronous Parallel Calculations
The actin dimer example provided with the APBS distribution (examples/actin-dimer/complex.pqr) is a fairly large system that often requires more memory than is available on some machines. As such, it is a good example for parallel focusing calculations. This example will use the actin dimer complex PQR file complex.pqr.
We’re going to use an 8-processor parallel calculation to write out the electrostatic potential map for this complex. Each processor will solve a portion of the overall problem using the parallel focusing method on a 97 × 97 × 97 mesh with 20% overlap between meshes for neighboring processors. An example input file for this calculation might look like:
where the pdime 2 2 2 specifies the 8-processor array dimensions, the ofrac 0.1 specifies the 20% overlap between processor calculations, and the dime 97 97 97 specifies the size of each processor’s calculation. The write pot dx pot instructs APBS to write out OpenDX-format maps of the potential to 8 files pot#.dx, where # is the number of the particular processor.
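The keywords just described can also be assembled into an input file programmatically. The Python sketch below is not part of APBS and is only one plausible layout: the helper name is hypothetical, the cglen and fglen values in the usage example are placeholders to be replaced with lengths appropriate for your system, and the remaining physical parameters (dielectric constants, surface and charge discretization, temperature) are assumed values rather than settings taken from the actin dimer example.

```python
def render_mg_para_input(pqr_path, dime, pdime, ofrac, cglen, fglen, stem="pot"):
    """Return APBS input text for a parallel focusing (mg-para) calculation.

    dime, pdime, cglen and fglen are 3-tuples; ofrac is the overlap fraction.
    """
    def triple(values):
        return " ".join(str(v) for v in values)

    lines = [
        "read",
        f"    mol pqr {pqr_path}",
        "end",
        "elec name complex",
        "    mg-para",
        f"    dime {triple(dime)}",    # grid points per processor
        f"    pdime {triple(pdime)}",  # processor array dimensions
        f"    ofrac {ofrac}",          # overlap between neighboring meshes
        f"    cglen {triple(cglen)}",  # coarse grid lengths (placeholder values)
        f"    fglen {triple(fglen)}",  # fine grid lengths (placeholder values)
        "    cgcent mol 1",
        "    fgcent mol 1",
        "    mol 1",
        # Everything from here to calcforce is an assumed, illustrative setting.
        "    lpbe",
        "    bcfl sdh",
        "    pdie 2.0",
        "    sdie 78.54",
        "    srfm smol",
        "    chgm spl2",
        "    sdens 10.0",
        "    srad 1.4",
        "    swin 0.3",
        "    temp 298.15",
        "    calcenergy total",
        "    calcforce no",
        f"    write pot dx {stem}",    # one OpenDX potential map per processor
        "end",
        "quit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_mg_para_input(
        "complex.pqr",
        dime=(97, 97, 97),
        pdime=(2, 2, 2),
        ofrac=0.1,
        cglen=(156, 121, 162),  # placeholder, not verified for this system
        fglen=(150, 115, 160),  # placeholder, not verified for this system
    )
    print(text)
```

Keeping the grid and processor settings as function arguments makes it easy to regenerate the file when the memory estimate suggests a different decomposition.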
Running this input file with an MPI-compiled version of APBS runs 8 parallel focusing calculations, with each calculation generating fine-scale solutions on a different region of the (fglen) problem domain. Note that 8 separate OpenDX files are written by the 8 processors used to perform the calculation. Writing separate OpenDX files allows us to avoid communication in the parallel run and keeps individual file sizes (relatively) small. Additionally, if a user is interested in a specific portion of the problem domain, only a few files are needed to get local potential information.
However, most users are interested in global potentials. For some programs (OpenDX, DataTank), the individual potential files can simply be read into the program separately and the program will reconstruct the global map. Most other programs will require the user to reassemble the global map first; APBS provides the mergedx program for this purpose. mergedx is a simple program that allows users to combine several OpenDX files from a parallel focusing calculation into a single map. This map can be down-sampled from the original resolution to provide coarser datasets for fast visualization, etc. For example, the command
will generate a file gridmerged.dx, downsampling the much larger dataset contained in the 8 OpenDX files to a 65 × 65 × 65 grid suitable for rough visualization. An example of mergedx output visualization is shown in the attached figure (see the “How do I visualize the electrostatic potential around my biomolecule?” section for more information about visualization). Note that downsampling is not necessary, and is often undesirable for high-quality visualization or quantitative analysis.
Asynchronous Parallel Calculations
The steps described in the previous section can also be performed on systems or with binaries that are not equipped for parallel calculations via MPI. In particular, you can add async n to the ELEC mg-para section of the APBS input file to make the single-processor calculation masquerade as processor n of a parallel calculation.
Scalar maps from asynchronous APBS calculations can be combined using the mergedx program as described above. Currently, energies and forces from asynchronous APBS calculations need to be merged manually (e.g., summed) from the individual asynchronous calculation output. This can be accomplished with simple shell scripts.
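The manual summation can be sketched in Python as well as in a shell script. The example below assumes that each run's output contains lines of the form "Total electrostatic energy = <value> kJ/mol" and that the last such line in each output is the one to sum; both are assumptions about the output format, so check them against your own APBS output and adjust the pattern if necessary. The function names are hypothetical.

```python
import re

# Assumed format of the energy line in APBS output; adjust to match your files.
DEFAULT_PATTERN = r"Total electrostatic energy\s*=\s*([-+0-9.eE]+)\s*kJ/mol"


def last_energy(output_text, pattern=DEFAULT_PATTERN):
    """Return the last energy (kJ/mol) matched by pattern in one run's output."""
    matches = re.findall(pattern, output_text)
    if not matches:
        raise ValueError("no energy line matched the pattern")
    return float(matches[-1])


def sum_async_energies(output_texts, pattern=DEFAULT_PATTERN):
    """Sum the last reported energy from each asynchronous run's output."""
    return sum(last_energy(text, pattern) for text in output_texts)
```

A typical use would read the captured output of each async run into a string and pass the list of strings to sum_async_energies.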
As a specific example, we can modify the input file above to include an async 0 command in the ELEC statement and thus cause APBS to perform the operations of the first processor in the parallel focusing calculation. The modified input file should look like:
This should create an OpenDX-format potential map called pot.dx, corresponding to the output from processor 0 in a parallel focusing calculation. Performing additional APBS calculations with async 1, async 2, …, async 7 will generate the corresponding OpenDX maps for the remaining processors of the parallel focusing calculations. These can then be reassembled with mergedx as described above.
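Preparing the async 0 through async 7 inputs by hand is repetitive, so the edit can be scripted. The Python sketch below is not part of APBS: it takes the text of an existing mg-para input file and inserts an async line directly after each mg-para line, which is one placement consistent with the description above. The function names are hypothetical.

```python
def add_async(input_text, rank):
    """Return input_text with 'async <rank>' inserted after each mg-para line."""
    out = []
    inserted = False
    for line in input_text.splitlines():
        out.append(line)
        if line.strip().lower() == "mg-para":
            # Reuse the indentation of the mg-para line for the new keyword.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f"{indent}async {rank}")
            inserted = True
    if not inserted:
        raise ValueError("no mg-para line found in the input text")
    return "\n".join(out) + "\n"


def async_inputs(input_text, pdime):
    """Return {rank: input text}, one entry per processor in the pdime array."""
    count = pdime[0] * pdime[1] * pdime[2]
    return {rank: add_async(input_text, rank) for rank in range(count)}
```

For the pdime 2 2 2 example, async_inputs yields eight input texts, which can each be written to a file and run with a single-processor APBS binary.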