The CUBE file is a common format for storing molecular geometric and volumetric field data from quantum/computational chemical calculations. It originates from the Gaussian software package.[1Gaussian 09] The official specification of the CUBE file format, sanctioned by Gaussian, Inc., is described on the Gaussian webpage[L1Website of the Gaussian, Inc.] as part of the document on the “cubegen” utility.[L2“cubegen”]
Format specification can be found on Paul Bourke's webpage,[L3Bourke, P. “Gaussian Cube Files”] on the webpage of the VMD visualization program,[L4“Cube Plugin, Version 1.1.”] and as part[L5Skinn, B. “Gaussian CUBE File Format”] of the online document on the binary “h5cube” file[L6Skinn, B. “h5cube File Specification”] based on the HDF5 format.[L7“HDF5: a data model, library, and file format for storing and managing data”] These webpages provide a cleaner layout of possible arrangements of CUBE file contents. In particular, the Gaussian specification is ambiguous about whitespace requirements, so parsing of CUBE files should accommodate some variation in the format, including (i) variable amounts/types of whitespace between the values on a given line, and (ii) the presence of leading and/or trailing whitespace on a given line.
A description of the contents of a representative subset of CUBE files in circulation are reported here. The extensions that are specific to PAMoC are highlighted. Files formatted to PAMoC specification may not be compatible with all software supporting CUBE file input.
The file contains plain text (human readable) and consists of a header which includes the atom information and the size as well as orientation of the volumetric data. This is followed by the volumetric data, one or more values per voxel element.
The first two lines of the header are comments, they are generally ignored by parsing packages or used as two default labels. In VMD,[L4“Cube Plugin, Version 1.1.”] by convention the first line is typically the title of the system and the second line is a description of the property/content stored in the file. For robustness, both of these lines should not be zero-length. As well, while there is no defined maximum length for either of these lines, both should not exceed 80 characters in length.
The third line has the number of atoms (NAtoms) included in the file followed by the position of the origin (X_{0}, Y_{0}, Z_{0}) of the volumetric data and by the number of values (NVal) per voxel.
A negative value of the number of atoms here indicates that the CUBE file contains one or more rows of integers representing identifiers associated with multiple data values at each voxel. A positive value indicates that the file does not contains these lines. The absolute value of the number of atoms defines the number of rows of molecular geometry data. The CUBE specification is silent as to whether a zero value is permitted for the number of atoms; most applications likely do not support CUBE files with no atoms.
If the number of atoms is positive, NVal indicates how many data values
are recorded at each point in the voxel grid; it may be omitted, in which
case a value of one is assumed. If the number of atoms is negative, the
value of NVal is irrelevant and can be omitted.
Densities, the norm of the density gradient, the Laplacian of the density,
and the potential are scalars (i.e. one value per point, NVal=1).
A gradient cube contains the density plus the vector gradient of the density,
so it has four values per point (NVal=4): i.e. the value of the density plus
the X, Y, and Z components of its gradient. Similarly, an electric field
cube contains the potential plus the electric field vector, so it has
four values per point (NVal=4).
The next three lines give the number of voxels along each axis (N_{x}, N_{y}, N_{z}) followed by the axis vectors X = (X_{x}, X_{y}, X_{z}), Y = (Y_{x}, Y_{y}, Y_{z}). Z = (Z_{x}, Z_{y}, Z_{z}). Note this means the volume need not be aligned with the coordinate axis, indeed it also means it may be sheared although most volumetric packages won't support that. Really, many tools like PAMoC only support voxel axes that are aligned with the geometry axes (and thus are also orthogonal). In this case the component values X_{x}, Y_{y}, and Z_{z} will be positive and the other six (X_{y}, X_{z}, Y_{x}, Y_{z}, Z_{x}, and Z_{y}) will be identically zero.
The length of each vector is the length of the side of the voxel thus
allowing non cubic volumes.
If the sign of the number of voxels in a dimension is positive then the units
are Bohr, if negative then Angstroms. As noted in the official Gaussian
documentation,[L2“cubegen.”] in a CUBE file
distance units must always be in Bohrs, and thus the ‘units flag’
function of a negative sign is superfluous. It is prudent to design applications
to handle gracefully a negative value here, however.
And this is what is done in PAMoC.
The penultimate section in the header is one line for each atom,
consisting of 5 numbers, the first is the atomic number, the second
is the nuclear charge of the atom, the last three are the X_{i},
Y_{i}, Z_{i} coordinates of the atom in the geometric
frame of reference (X_{0}, Y_{0}, Z_{0}).
As many applications do not use the nuclear charge of the atom, the
corresponding field could be omitted. So, PAMoC counts the fields on
the line and assigns the values contained in the last three fields to
the atom coordinates.
The last section in the header is only present if NAtoms is negative. This section comprises one or more rows of integers (written with format 10I5), representing identifiers associated with multiple data values at each voxel, with a total of M + 1 values present. The most common meaning of these identifiers is orbital indices, in CUBE files containing wavefunction data. The first value must be positive and equal to M, to indicate the length of the rest of the list. Each of these M values may be any integer, with the constraint that all values should be unique. Further, all M values should be non-negative, as unpredictable behavior may result in some applications if negative integers are provided. It is clear that when this section is present the value of M replacess that of NVal (i.e. NVal = M).
The original Gaussian format arranged numerical integer values in a 5-digit wide fields (format I5) and floating point values in a 12-character wide field with 6 decimal places (format F12.6). For example, schematically, the header section of CUBE files will look like this:
1st comment line 2nd comment line N_{Atoms}, X_{0}, Y_{0}, Z_{0}, N_{Val} N_{x}, X_{x}, Y_{x}, Z_{x} N_{y}, X_{y}, Y_{y}, Z_{y} N_{z}, X_{z}, Y_{z}, Z_{z} IA_{1}, Chg_{1}, X_{1}, Y_{1}, Z_{1} ... ... ... ... ... IA_{n}, Chg_{n}, X_{n}, Y_{n}, Z_{n} M, m_{1}, m_{2}, m_{3}, m_{4}, m_{5}, m_{6}, m_{7}, m_{8}, m_{9} m_{10},m_{11},m_{12}, ..., m_{M } |
(A) (A) _{ }(I5,3F12.6,I5) _{ }(I5,3F12.6) _{ }(I5,3F12.6) _{ }(I5,3F12.6) _{ }(I5,4F12.6) (I5,4F12.6) _{ }(I5,4F12.6) _{ }(10I5) _{ }(10I5) |
_{ }number of atoms, origin of the cube, number of voxels _{ }number of increments in the slowest running direction _{ }number of increments in the intermediate running direction _{ }number of increments in the fastest running direction _{ }atomic number, charge, and coordinates of the first atom ... ... ... ... ... _{ }atomic number, charge, and coordinates of the last atom _{ }number of MOs and their sequence numbers _{ } |
This section consists of one floating point number for each volumetric element. The original Gaussian format arranged the values one row per record (i.e., N_{x} × N_{y} records each of length N_{z} × NVal). Each row is written out in format (6E13.5). In this case, if N_{z} × NVal is not a multiple of six, then there may be blank space in some lines. However, most parsing programs, including PAMoC, can read any white space separated format. Traditionally the grid is arranged with the X axis as the outer loop and the Z axis as the inner loop. A total of NX × NY × NZ × NVAL values should be present, flattened as follows (in the below Python pseudocode the for-loop variables are iterated starting from zero):[L5Skinn, B. “Gaussian CUBE File Format”]
for i in range(NX): for j in range(NY): for k in range(NZ): for l in range({NVAL}): write(data_array[i, j, k, l]) if (k*{NVAL} + l) mod 6 == 5: write('\n') write('\n') |
As illustrated above a newline is typically inserted after the block of data corresponding to each (X_{i},Y_{j}) pair.
Instead, the following Fortran loop is used by the Gaussian utility “cubegen” to store the values of an array dimensioned F(NVal,NZ,NY,NX) into a CUBE file:[L2“cubegen.”]
Do 10 IX = 1, NX Do 10 IY = 1, NY Write(n,'(6E13.5)') ((F(IVal,IZ,IY,IX),IVal=1,NVal),IZ=1,NZ) 10 Continue |
where n is the unit number corresponding to the CUBE file. It is clear from this example that each record is NVal * NZ long and has the values for all properties (e.g. orbitals) at each point together.
Older versions of the Gaussian XX manual (e.g. XX = 94 and 98)[L8“Gaussian 98 Cube Keyword”] report that “the output is similar if the gradient or gradient and Laplacian of the charge density are also requested, except that in these cases there are two or three records, respectively, written for each pair of IX, IY values. Thus, if the density D(IZ,IY,IX), gradient G(3,IZ,IY,IX), and laplacian D2(IZ,IY,IX) are to be written out on CUBE file, a correct set of Fortran loops would be”:
Do 10 IX = 1, NX Do 10 IY = 1, NY Write(n,'(6F13.5)') (D(IZ,IY,IX),IZ=1,NZ) Write(n,'(6F13.5)') ((G(IXYZ,IZ,IY,IX),IXYZ=1,3), IZ=1,NZ) Write(n,'(6F13.5)') (D2(IZ,IY,IX),IZ=1,NZ) 10 Continue |
where again n is the unit number corresponding to the CUBE file.
If the origin is (X_{0},Y_{0},Z_{0}), and the increments (X_{x},X_{y},X_{z}), (Y_{x},Y_{y},Y_{z}), and (Z_{x},Z_{y},Z_{z}), then point (IX,IY,IZ) has the coordinates:
|P_{x}| |X_{0}| |X_{x} X_{y} X_{z}| |IX−1| P_{x} = X_{0} + (IX-1)*X_{x} + (IY-1)*X_{y} + (IZ-1)*X_{z} |P_{y}| = |Y_{0}| + |Y_{x} Y_{y} Y_{z}| * |IY−1| or P_{y} = Y_{0} + (IX-1)*Y_{x} + (IY-1)*Y_{y} + (IZ-1)*Y_{z} |P_{z}| |Z_{0}| |Z_{x} z_{y} Z_{z}| |IZ−1| P_{z} = Z_{0} + (IX-1)*Z_{x} + (IY-1)*z_{y} + (IZ-1)*Z_{z} |
Although it has not become a standard, the Gaussian CUBE file format is recognized by many applications and therefore several quantomechanical codes offer the option of generating a CUBE file in this format. However most CUBE file generators adopt a simplified structure, which does not allow multi-property files and implies NVal = 1. In addition, some of them only generate grids with orthogonal axes (i.e. rectangular parallelepipeds or cuboids). This is the case for instance of NWCHEM-DPLOT.
As far as the format of a gradient or gradient and Laplacian Gaussian CUBE file is concerned, there seems to be a problem of incompatibility between the format described in the latest version of the Gaussian XX manual (XX = 09),[L2“cubegen.”] which requires writing a single record long 4×NZ or 5×NZ for each pair of IX, IY values, and older ones (XX = 94 and XX = 98),[L8“Gaussian 98 Cube Keyword”] which instead require writing two or three records long NZ, 3×NZ, and NZ, respectively. In agreement with the most recent official specification,[L2“cubegen.”] a correct set of Fortran loops would be:
Do 10 IX = 1, NX Do 10 IY = 1, NY Write(n,'(6F13.5)') (D(IZ,IY,IX), (G(IXYZ,IZ,IY,IX),IXYZ=1,3), $ D2(IZ,IY,IX), IZ=1,NZ) 10 Continue |
Gaussian includes a standalone utility for generating cubes from the data in a formatted checkpoint file. The utility is named cubegen, and it is described in the website of the Gaussian, Inc.[L2“cubegen”]
NWChem[2M. Valiev, E.J. Bylaska, N. Govind,
K. Kowalski, T.P. Straatsma, H.J.J. van Dam,
D. Wang, J. Nieplocha,
E. Apra, T.L. Windus, W.A. de Jong
Comput. Phys. Commun.
2010, 181, 1477-1489., L9NWChem: Open Source High-Performance Computational
Chemistry] provides the “task DPLOT” directive which,
combined with the “GAUSSIAN” sub-directive, generates volumetric
data in the Gaussian Cube format.[L10NWChem: DPLOT]
GAMESS[3M.W.Schmidt, K.K.Baldridge,
J.A.Boatz, S.T.Elbert, M.S.Gordon, J.H.Jensen,
S.Koseki, N.Matsunaga,
K.A.Nguyen, S.Su, T.L.Windus, M.Dupuis, J.A.Montgomery
J. Comput. Chem. 1993, 14, 1347-1363.,
4M.S.Gordon, M.W.Schmidt
in
“Theory and Applications of Computational Chemistry: the first forty
years”
pp. 1167-1189, Elsevier, Amsterdam, 2005.,
L11GAMESS: General Atomic and Molecular
Electronic Structure System] can generate a cube file by means of
the “$GRID” group of instructions in the input section:[L12GAMESS online documentation:
Input
Description]
$GRID group (not required) This group is used to input a grid (plane or cube) on which properties will be calculated. This group should be given if WHERE=GRID in $ELPOT or $ELDENS. This output will be in the PUNCH file whenever OUTPUT=PUNCH or BOTH. MODGRD = 0 orthonormalize the grid vectors = 1 normalize the grid vectors ORIGIN(i) = coordinates of one corner of the grid/cube. XVEC(i) = vector from ORIGIN to an adjacent corner "X" of the grid (or cube). The XVEC direction need not be parallel to the X-axis of the molecule. YVEC(i) = vector to the adjacent corner "Y" of grid/cube. ZVEC(i) = vector to the adjacent corner "Z" of the cube, given if and only if MODGRD=1. SIZE = grid increment in all directions (default 0.25) UNITS = units of the above five values, it can be either ANGS (the default) or BOHR. GRDPAD = grid padding, like GRDPAD in $FMOPRP, but applied to non-FMO runs. Default: 0, which means padding is not used so one must specify ORIGIN. Two dimensional grids may be drawn with the graphics program MEPMAP provided with GAMESS. Several programs will accept the 3-dimensional CUBE format. Note that MacMolPlt draws 2D and 3D density maps without any need to pre-compute them inside GAMESS by this group. User *must* input orthogonal XVEC/YVEC/ZVEC directions! |
If run through GAMESS, you'll get the usual output file, but more
importantly in your user scratch space you'll get a “.dat” file too.
This file contains the CUBE data you requested. Just search for START
to locate where the CUBE file starts, extract that and change the
file extension to “.cub” or “.cube”. It is important to
note that GAMESS writes out the volumetric data as a single record, so
that it does not conform to the standard format of the Gaussian utility
“cubegen”.[L2“cubegen”] It follows, for example,
that a GAMESS cube file can be
read by VESTA[5K. Momma and F. Izumi
J. Appl. Crystallogr. 2011, 44, 1272–1276.,
L13VESTA: Visualization for Electronic and
STructural Analysis] and PAMoC, but not by Multiwfn.[6Tian Lu, Feiwu Chen
J. Comp. Chem.
2012, 33, 580-592., L14Multiwfn: A Multifunctional Wavefunction
Analyzer]
An example of input instructions to generate a cube file with GAMESS is
reported in the scheme below.
$ELDENS IEDEN=1 MORB=0 WHERE=GRID IEDINT=0 $END $GRID MODGRD=1 SIZE=0.21549 UNITS=BOHR ORIGIN(1)= -5.13431, -3.16585, -7.56302 XVEC(1)= 8.44175 -3.16585 -7.56302 YVEC(1)= -5.13431, 8.47077, -7.56302 ZVEC(1)= -5.13431 -3.16585 5.36656 $END |
Running through GAMESS the input instructions reported in the left box, yields
the cube definition shown in the right box.
Please, note that the axis vectors in the left box were defined as
XVEC(1) = Origin(1) + XLen XVEC(2) = Origin(2) XVEC(3) = Origin(3) YVEC(1) = Origin(1) YVEC(2) = Origin(2) + YLen YVEC(3) = Origin(3) ZVEC(1) = Origin(1) ZVEC(2) = Origin(2) ZVEC(3) = Origin(3) + ZLenwhere XLen, YLen, and ZLen are the lengths of the edges of the cuboid. |
GAMESS CUBE FORMAT: ELECTRON DENSITY OUTER LOOP: X, MIDDLE LOOP: Y, INNER LOOP: Z 10 -5.134310 -3.165850 -7.563020 64 0.215490 0.000000 0.000000 55 0.000000 0.215490 0.000000 61 0.000000 0.000000 0.215490 |
The example above requires calculation of total electron density values over a three dimensional grid, because “MORB=0” has been set in “$ELDENS”. A value of “MORB” greater than zero specifies the molecular orbital whose electron density is to be computed. Similarly, a cube of electrostatic potential values can be obtained by “$ELPOT IEPOT=1 WHERE=GRID END”.
ORCA[7Neese, F.
Wiley Interdiscip.
Rev.: Comput. Mol. Sci. 2012, 2, 73-78.,
L15ORCA: An ab initio, DFT and semiempirical
SCF-MO package.] can create a “.cube” file of the
electron density by means of the “%plots” input block,[L16ORCA manual - Version 4.0.1]
as shown in the example below:
%plots #*** resolution of the cube dim1 80 # resolution in x-direction dim2 80 # resolution in y-direction dim3 80 # resolution in z-direction #*** minimum and maximum values along axis directions min1 0.0 # x-min value in bohr max1 0.0 # x-min value in bohr min2 0.0 # y-min value in bohr max2 0.0 # y-max value in bohr min3 0.0 # z-min value in bohr max3 0.0 # z-max value in bohr #*** the format of the output file Format Gaussian_Cube # Gaussian-cube format # (an ASCII file) #*** the quantities to output MO("gly-RHF-HOMO.cub",19,0); # Molecular orbital nr. 19 # (counting starts at orbital 0) ElDens("gly-RHF-density.cub"); # Electron density SpinDens("gly-RHF-spindensity.cub"); # Spin density end |
In this example the dimensions of the cube were not given explicitly (i.e. min1 = max1 = min2 = max2 = min3 = max3=0), so that the program will try to be smart and figure out a good cube size by itself. It will look at the minimum and maximum values of the coordinates and then add 7 bohrs to each dimension in the hope to properly catch all wavefunction tails.
The quantities that can be calculated are the atomic orbitals, molecular
orbitals, natural orbitals, the total electron density or the total spin
density. However, an electrostatic potential “.cube” file can
be created by using a python script[L17Marius
Retegan
mep.py: Create a .cube file of the electrostatic potential using
ORCA] written by Marius Retegan.[L18Marius Retegan's personal web page]
In this case the keyword “KeepDens” must be specified in
the input deck in order to preserve the “.scfp” file from being
deleted, together with the keyword “XYZFile” to create
the “.xyz” file. Both the “.scfp” and the
“.xyz” files are required by the python script, in addition to the
“.gbw” file.
PSI4[8R. M. Parrish, et al.
J. Chem. Theory Comput., 2017, 13(7),
3185–3197.,
9J. M. Turney, et al.
WIREs Comput. Mol. Sci. 2912, 2, 556.,
L19PSI4: open-source quantum chemistry.
] can evaluate properties on a grid and generate cube files by
“cubeprop()”.[L20PSI4 manual:
Generation of Cube Files − cubeprop()]
The CRYSTAL[10Erba, A.; Baima, J.;
Bush, I.; Orlando, R.; Dovesi, R.
J. Chem. Theory Comput.
2017, 13(10), 5019-5027.,
11Dovesi, R.; Saunders, V. R.; Roetti, C.;
Orlando, R.; Zicovich-Wilson, C. M.; Pascale, F.; Civalleri, B.; Doll, K.;
Harrison, N. M.; Bush, I. J.; D’Arco, P.; Llunell, M.; Causà, M.; Noël, Y.;
Maschio, L.; Erba, A.; Rerat, M.; Casassa, S.
CRYSTAL17 User's Manual.
University of Torino: Torino, 2017.,
L21CRYSTAL: a computational tool for solid
state chemistry and physics] code can evaluate cube files of the
electronic charge density (“ECH3”) and the electrostatic
potential (“POT3”), as described by a specific
tutorial[L22F. Chiatti
Isodensity
Surface Colorcoded with the Electrostatic Potential.]
on the CRYSTAL tutorials website.[L22CRYSTAL
tutorials web site.]