Input & Output Files
Overview
There are two main input files, input.nn and input.data, which have to be
present in all running modes of RuNNer. If a fit shall be restarted using a
preoptimized set of weights, additional input files weights.XXX.data and/or
weightse.XXX.data must be present for the short range and the electrostatic NN,
respectively. Further, if Kalman matrices of a previous fit shall be used for a
restart, the files kalman.short.XXX.data and/or kalman.elec.XXX.data must be
available. In the prediction mode, the files scaling.data and weights.XXX.data
(and in case of electrostatics the corresponding files scalinge.data and
weightse.XXX.data) must be in the running directory.
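The file requirements for the prediction mode can be checked before a run. The following Python sketch is only an illustration: the function name `missing_prediction_files` is hypothetical, and it assumes that XXX is the nuclear charge written as a zero-padded three-digit number (for example 030 for Zn), which is not stated explicitly in this section.

```python
from pathlib import Path
from typing import Iterable, List


def missing_prediction_files(run_dir: str,
                             nuclear_charges: Iterable[int],
                             electrostatics: bool = False) -> List[str]:
    """Return the names of files needed for prediction that are absent in run_dir.

    Assumption: XXX in weights.XXX.data is the zero-padded nuclear charge.
    """
    charges = list(nuclear_charges)
    names = ["input.nn", "input.data", "scaling.data"]
    names += [f"weights.{z:03d}.data" for z in charges]
    if electrostatics:
        # Additional files listed above for the electrostatic NN.
        names.append("scalinge.data")
        names += [f"weightse.{z:03d}.data" for z in charges]
    base = Path(run_dir)
    return [name for name in names if not (base / name).is_file()]
```

Calling `missing_prediction_files(".", [30, 8])` in a running directory returns an empty list when all short range files for Zn and O are in place.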
The main output of RuNNer in all modes is sent to standard output (the screen).
It can be written to a file at the same time by typing
./RuNNer.x | tee runner.out
It is customary but not obligatory to call these files mode1.out, mode2.out,
and mode3.out.
Depending on the mode, a number of output files are generated. In mode 1, the
construction of the symmetry functions, the files function.data, testing.data,
trainstruct.data, teststruct.data, trainforces.data, and testforces.data are
written for the short range NN. For the electrostatic NN, the files
functione.data, testinge.data, trainforcese.data, and testforcese.data are
generated in addition. In the fitting mode, the files scaling.data and
scalinge.data are written, as are the weight files YYYYYY.short.XXX.out and
YYYYYY.ewald.XXX.out in each epoch. Here YYYYYY stands for the epoch number and
XXX for the nuclear charge of the element; the per-file descriptions further
down write the same names as XXXXXX.short.YYY.out, with the two placeholders
swapped.
The files optweights.XXX.out and optweightse.XXX.out contain the sets of weights
with the lowest overall testing error. Further, if requested, the files
kalman.short.XXX.data and/or kalman.elec.XXX.data, trainpoints.YYYYYY.out,
testpoints.YYYYYY.out, traincharges.YYYYYY.out, testcharges.YYYYYY.out,
trainforces.YYYYYY.out, and testforces.YYYYYY.out are also written.
In the case of 4G-HDNNPs, not only the weight files but also the optimized
hardness values (YYYYYY.hardness.XXX.out) are written. The file
opthardness.XXX.out contains the values with the lowest overall testing error.
Friendly reminder:
Weight and hardness files that are used together should come from the same epoch, for example the epoch with the lowest overall testing error.
Input and Output Files
XXXXXX.short.YYY.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file is written in the fitting mode. It contains the short range NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY.
XXXXXX.ewald.YYY.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file is written in the fitting mode. It contains the electrostatic NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY. In the 4G-HDNNP case, it contains the electronegativity NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY.
energy.out
Mode 1: --- • Mode 2: --- • Mode 3: Output
This file contains the total energy of the system in Hartree. In case only the short range part or only the electrostatic part is used, the total energy just contains this part.
opthardness.XXX.out
Mode 1: --- • Mode 2: Output • Mode 3: Input
This file contains a single hardness value of the epoch with the smallest error
of the test set. The value is identical to the hardness in the corresponding
XXXXXX.hardness.YYY.out file.
function.data
Mode 1: Output • Mode 2: Input • Mode 3: ---
This file contains the short range symmetry function values of all structures in
the training set and is written by RuNNer in runner_mode 1. It is a mandatory
input file in runner_mode 2 in case of a short-range fit.
For each structure, the first line gives the number of atoms in the structure.
Then, for each atom there is one line starting with the nuclear charge,
followed by all symmetry function values characterizing this atom's environment.
For each structure, the final line contains the total charge, the total energy,
the short range energy, and the electrostatic energy. Please note that all energy
contributions, the total energy, and the total charge are normalized per atom
here. This is required because, for unnormalized target quantities, larger
systems would get a higher fitting weight, since they typically have larger errors.
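The layout described above can be read with a few lines of code. The following Python sketch is illustrative only: the function name `parse_function_data` and the dictionary keys are hypothetical, and it assumes whitespace-separated numbers with no header or blank-line conventions beyond what is described here.

```python
from typing import Dict, Iterable, List


def parse_function_data(lines: Iterable[str]) -> List[Dict]:
    """Parse structures laid out as described for function.data (assumed format)."""
    rows = [ln.split() for ln in lines if ln.strip()]
    structures = []
    i = 0
    while i < len(rows):
        natoms = int(rows[i][0])  # first line: number of atoms
        atoms = []
        for row in rows[i + 1:i + 1 + natoms]:
            # first entry: nuclear charge, remaining entries: symmetry function values
            atoms.append((int(float(row[0])), [float(v) for v in row[1:]]))
        # final line: total charge and energies, all normalized per atom
        charge, e_total, e_short, e_elec = (float(v) for v in rows[i + 1 + natoms][:4])
        structures.append({
            "atoms": atoms,
            "total_charge_per_atom": charge,
            "total_energy_per_atom": e_total,
            "short_range_energy_per_atom": e_short,
            "electrostatic_energy_per_atom": e_elec,
        })
        i += natoms + 2
    return structures
```

Because the stored values are per atom, multiplying `total_energy_per_atom` by the number of atoms recovers the total energy of the structure.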
functione.data
Mode 1: Output • Mode 2: Input • Mode 3: ---
This file contains the electrostatic symmetry function values of all
structures in the training set and is written by RuNNer in runner_mode 1. It
is a mandatory input file in runner_mode 2 in case of a charge fit. For each
structure, the first line gives the number of atoms in the structure. Then,
for each atom there is one line starting with the nuclear charge, followed by
all symmetry function values characterizing this atom's environment. For each
structure, the final line contains the total charge, the total energy, the
short range energy, and the electrostatic energy. Please note that all energy
contributions, the total energy, and the total charge are normalized per atom
here. This is required because, for unnormalized target quantities, larger
systems would get a higher fitting weight, since they typically have larger
errors.
input.data
Mode 1: Mandatory Input • Mode 2: --- • Mode 3: Mandatory Input
The input.data file contains one or more structures. In runner_mode 1 the full
reference data set is provided. In runner_mode 3 all structures destined for
prediction are provided.
Each structure in the input.data file is framed by a pair of a begin and an end
keyword. There can be an arbitrary number of structures one after the other in
a single input.data file. The order of the lines in between begin and end is
arbitrary and each line is free-formatted.
The following information can be provided:
- For each structure it is possible to add comment lines starting with c or comment. They can be used, for instance, to label the data, to give information about the settings of the electronic structure calculations (DFT code, basis set etc.), and to name the author of the data.
- For periodic structures there must be three lines starting with the keyword lattice followed by the x, y, and z coordinates of the respective lattice vectors. The unit is Bohr.
- For each atom in the system there is one line starting with the keyword atom, followed by three numbers specifying the Cartesian coordinates (in Bohr). Then the element symbol is given, followed by a number for the atomic charge (e.g. a Mulliken or Hirshfeld charge), the atomic energy (this is not used at the moment, please always put 0.0), and three numbers giving the x, y, and z components of the atomic forces in Hartree/Bohr.
- For each structure there must be a line starting with the keyword energy specifying the total energy of the system in Hartree.
- For each structure there must be a line starting with the keyword charge specifying the total charge (in most cases 0.0, but RuNNer can also handle systems with net charge) in units of the proton charge.
Example
begin
comment This is an arbitrary comment line
lattice 10.00 0.00 0.00
lattice 0.00 10.00 0.00
lattice 0.00 0.00 10.00
atom 0.000 0.000 0.000 Zn 0.32171 0.00000 0.00000 0.00000 0.02218
atom 0.000 0.000 5.499 O -0.32172 0.00000 0.00000 -0.00000 -0.02218
energy -1854.16937000
charge 0.00000000
end
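The keyword layout shown in the example can be read back into plain data structures. The following Python sketch is one possible reader, not part of RuNNer: the function name `parse_input_data` and the dictionary keys are hypothetical, and it assumes every atom line carries all nine entries in the order described above.

```python
from typing import Dict, List


def parse_input_data(text: str) -> List[Dict]:
    """Collect the structures found between begin/end pairs (assumed layout)."""
    structures = []
    current = None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        key = fields[0]
        if key == "begin":
            current = {"comments": [], "lattice": [], "atoms": [],
                       "energy": None, "charge": None}
        elif current is None:
            continue  # ignore anything outside a begin/end pair
        elif key == "end":
            structures.append(current)
            current = None
        elif key in ("c", "comment"):
            current["comments"].append(" ".join(fields[1:]))
        elif key == "lattice":
            current["lattice"].append([float(v) for v in fields[1:4]])  # Bohr
        elif key == "atom":
            current["atoms"].append({
                "position": [float(v) for v in fields[1:4]],  # Bohr
                "element": fields[4],
                "charge": float(fields[5]),
                "energy": float(fields[6]),                   # currently unused, 0.0
                "force": [float(v) for v in fields[7:10]],    # Hartree/Bohr
            })
        elif key == "energy":
            current["energy"] = float(fields[1])  # Hartree
        elif key == "charge":
            current["charge"] = float(fields[1])  # units of the proton charge
    return structures
```

Since the lines between begin and end may appear in any order, the sketch dispatches on the leading keyword of each line rather than on its position.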
input.nn
Mode 1: Input • Mode 2: Input • Mode 3: Input
The input.nn file is the main control file of RuNNer. The keywords can be
given in arbitrary order, and blank lines and commented lines (starting with #)
are permitted. If keywords are not specified, reasonable defaults are assumed
where possible and written to the output for information. If an essential
keyword is missing, RuNNer will stop with an error message and ask the user to
specify the keyword.
All keywords are documented in the reference section.
The file input.nn is read twice: first by the subroutine getdimensions.f90
to get the dimensions of some arrays, then by the subroutine readinput.f90,
which reads all input options. It contains a set of mandatory and optional
keywords, which are listed in the reference section.
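The rules for blank and commented lines can be mirrored when inspecting an input.nn file outside of RuNNer. The following Python sketch is an illustration only: the function name `read_keywords` is hypothetical, and it assumes each remaining line consists of a keyword followed by whitespace-separated values. It does not reproduce RuNNer's own parsing, defaults, or error handling.

```python
from typing import List, Tuple


def read_keywords(text: str) -> List[Tuple[str, List[str]]]:
    """Return (keyword, values) pairs, skipping blank lines and # comment lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Assumption: the first token is the keyword, the rest are its values.
        entries.append((fields[0], fields[1:]))
    return entries
```

A list of pairs is used instead of a dictionary so that the order of the lines and any repeated keywords are preserved.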
nnatoms.out
Mode 1: --- • Mode 2: --- • Mode 3: Output
This file contains the atomic charges (in e) and atomic energies (in Hartree) of the system from the neural network potential and from the reference method. The file contains 7 columns: configuration (Conf.), atom id (atom), element type (element), reference charge (Ref. charge), reference energy (Ref. energy), neural network charge (NN charge), and neural network energy (NN energy).
nnforces.out
Mode 1: --- • Mode 2: --- • Mode 3: Optional Output
This file is written in runner_mode 3 if the keyword calculate_forces is used.
It contains the force vectors acting on all atoms in Ha/Bohr from the neural
network potential and from the reference method. The file contains 8 columns:
configuration (Conf.), atom id, the reference atomic force along the x, y, and
z directions (Ref. \(F_{\mathrm{x}}\), Ref. \(F_{\mathrm{y}}\), Ref.
\(F_{\mathrm{z}}\)), and the neural network atomic force along the x, y, and
z directions (NN \(F_{\mathrm{x}}\), NN \(F_{\mathrm{y}}\), NN \(F_{\mathrm{z}}\)).
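With reference and NN force components side by side, a force error can be computed directly from this file. The following Python sketch is a hedged illustration: the function name `force_rmse` is hypothetical, and it assumes the 8 columns are whitespace-separated in the order listed above. Lines that do not fit this pattern, such as a possible header line, are skipped.

```python
import math
from typing import Iterable


def force_rmse(lines: Iterable[str]) -> float:
    """Root mean squared error over all force components, in Ha/Bohr."""
    sq_sum = 0.0
    n = 0
    for ln in lines:
        fields = ln.split()
        if len(fields) != 8:
            continue  # not a data row under the assumed layout
        try:
            values = [float(v) for v in fields[2:]]
        except ValueError:
            continue  # e.g. a header line with column labels
        ref, nn = values[:3], values[3:]
        sq_sum += sum((a - b) ** 2 for a, b in zip(nn, ref))
        n += 3
    if n == 0:
        raise ValueError("no force rows found")
    return math.sqrt(sq_sum / n)
```

The error is averaged over individual components rather than over atoms; other conventions are possible and give different numbers.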
nnstress.out
Mode 1: --- • Mode 2: --- • Mode 3: Optional Output
This file is written in runner_mode 3 if the keyword calculate_stress is used.
It contains the short range stress only; the electrostatic contribution to the
stress tensor is currently not implemented. The stress tensor can only be
calculated for periodic systems. The file contains 4 columns: configuration
(Conf.) and the neural network stress along the x, y, and z directions
(NN \(P_{\mathrm{x}}\), NN \(P_{\mathrm{y}}\), NN \(P_{\mathrm{z}}\)).
optweights.XXX.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file contains the short range weights of the epoch with the
smallest error of the test set. The weights are identical to the weights
in the corresponding XXXXXX.short.YYY.out file.
optweightse.XXX.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file contains the electrostatic weights of the epoch with the
smallest error of the test set. The weights are identical to the weights
in the corresponding XXXXXX.ewald.YYY.out file.
output.data
Mode 1: --- • Mode 2: --- • Mode 3: Output
This file is written in the prediction mode and contains all data
predicted by the NN. The format is the same as that of the file input.data.
runner.out/standard out
Mode 1: Output • Mode 2: Output • Mode 3: Output
This is the recommended name for the main output file of RuNNer. By
default, the output is written to standard output and needs to be
piped to runner.out by the command
RuNNer.serial.x | tee runner.out
Alternatively, the output files of runner_mode 1, 2, and 3 can be named
mode1.out, mode2.out, and mode3.out.
scaling.data
Mode 1: --- • Mode 2: Output & Optional Input • Mode 3: Input
This file is written during the fitting process. It contains the
minimum, maximum and average value for each symmetry function for the
short range NN. It is a mandatory input file for the prediction of
energies for new structures. In runner_mode 2 a scaling.data file
can be read using the keyword use_old_scaling. This can be required to
keep exactly the same fit (the file scaling.data is part of the fit)
when restarting runner_mode 2 with a modified training set.
scalinge.data
Mode 1: --- • Mode 2: Output & Optional Input • Mode 3: Input
This file is written during the fitting process. It contains the
minimum, maximum and average value for each symmetry function for the
electrostatic NN. It is a mandatory input file for the prediction of
energies for new structures. In runner_mode 2 a scalinge.data file
can be read using the keyword use_old_scaling. This can be required to
keep exactly the same fit (the file scalinge.data is part of the fit)
when restarting runner_mode 2 with a modified training set.
testcharges.XXXXXX.out
Mode 1: --- • Mode 2: Optional Output • Mode 3: ---
This file is written in runner_mode 2 for electrostatic fits (keyword
electrostatic_type 1) and contains a comparison of the atomic charges
for DFT and the NN for each structure in the test set. A separate file
is written in each epoch.
testing.data
Mode 1: Output • Mode 2: Mandatory Input • Mode 3: ---
This file contains the symmetry function values for the test set for the
short range NN. The file is written in RuNNer mode 1, and is a mandatory
input file in the fitting mode (mode 2). The content has the same
structure as the file function.data.
testinge.data
Mode 1: Output • Mode 2: Mandatory Input • Mode 3: ---
This file contains the symmetry function values for the test set for the
electrostatic NN. The file is written in RuNNer mode 1, and is a
mandatory input file in the fitting mode (mode 2). The content has the
same structure as the file functione.data.
testpoints.out
Mode 1: --- • Mode 2: Optional Output • Mode 3: ---
This file is written in mode 2 and contains a comparison of the energies for DFT and the NN for each point in the test set. The file is updated in each epoch.
teststruct.data
Mode 1: Output • Mode 2: Mandatory Input • Mode 3: ---
This file contains the structures of the test set. The structures are needed for the calculation of the electrostatic energies. It is written in RuNNer mode 1, while the symmetry functions are calculated. In the fitting mode it is a mandatory input file. The file contains the following information:
For each structure in the test set, the first line gives the number
of that structure in the test set and a logical variable specifying
if the structure is periodic (T) or non-periodic (F). For periodic
structures the following three lines contain the lattice vectors.
Further, for each atom in the structure there is one line containing the
nuclear charge, the x, y, and z positions of the atom, the atomic
partial charge, the atomic energy, and finally the x, y, and z
components of the forces. If an electrostatic NN is used (or has been
used in mode 1), then the forces are not identical to the total
reference forces, but contain only the short range forces.
traincharges.XXXXXX.out
Mode 1: --- • Mode 2: Optional Output • Mode 3: ---
This file is written in runner_mode 2 for electrostatic fits (keyword
electrostatic_type 1) and contains a comparison of the atomic charges
for DFT and the NN for each structure in the training set. A separate
file is written in each epoch.
trainpoints.XXXXXX.out
Mode 1: --- • Mode 2: Optional Output • Mode 3: ---
This file is written in runner_mode 2 and contains a comparison of the
energies for DFT and the NN for each point in the training set. A
separate file is written in each epoch.
trainstruct.data
Mode 1: Output • Mode 2: Mandatory Input • Mode 3: ---
This file contains the structures of the training set. The structures
are needed for the calculation of the forces and of the electrostatic
energy. It is written in runner_mode 1, when the symmetry functions
are calculated. In runner_mode 2 it is a mandatory input file. The
file contains the following information:
For each structure in the training set, the first line gives the number
of that structure in the training set and a logical variable specifying
if the structure is periodic (T) or non-periodic (F). For periodic
structures the following three lines contain the lattice vectors.
Further, for each atom in the structure there is one line containing the
nuclear charge, the x, y, and z positions of the atom, the atomic
partial charge, the atomic energy, and finally the x, y, and z
components of the forces. If an electrostatic NN is used (or has been
used in mode 1), then the forces are not identical to the total
reference forces, but contain only the short range forces.
weights.XXX.data
Mode 1: --- • Mode 2: Optional Input • Mode 3: Input
This file contains the weight parameters for the short-range NN. It has
the same format as the XXXXXX.short.YYY.out file and is usually a copy of
that file. If in runner_mode 2 a short range fit is restarted by using
the keyword use_old_weights_short, this file must be present. In
runner_mode 3 this is a mandatory input file if a short range NN is
used.
weightse.XXX.data
Mode 1: --- • Mode 2: Optional Input • Mode 3: Input
This file contains the weight parameters for the electrostatic NN. It
has the same format as the XXXXXX.ewald.YYY.out file and is usually a copy
of that file. If in runner_mode 2 a charge fit is restarted by using
the keyword use_old_weights_charge, this file must be present. In
runner_mode 3 this is a mandatory input file if an electrostatic NN is
used.
XXXXXX.short.YYY.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file is written in runner_mode 2 for the short range fit. It
contains the short range NN weight parameters of epoch XXXXXX for the
element of nuclear charge YYY. For readability the file contains a lot
of additional information. Only the first column, which contains the
weight values, is relevant for RuNNer. The remaining columns contain
the following information:
- a or b for weights connecting two nodes or a bias weight, respectively
- A counter for the number of the weight
- Information on the role of the weight in the NN. In case of a weight connecting two nodes, four numbers are given specifying the source layer and node as well as the target layer and node. In case of a bias weight, only the target layer and node are given.
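Since only the first column carries the weight values, a weight file can be reduced to a plain list of numbers. The following Python sketch is illustrative: the function name `read_weights` is hypothetical, and it assumes the first column is written in a notation that Python's float() accepts. Lines whose first entry is not a number are skipped.

```python
from typing import Iterable, List


def read_weights(lines: Iterable[str]) -> List[float]:
    """Return the first-column values; all remaining columns are ignored."""
    weights = []
    for ln in lines:
        fields = ln.split()
        if not fields:
            continue
        try:
            weights.append(float(fields[0]))
        except ValueError:
            continue  # assumption: non-numeric first entries are not weights
    return weights
```

The number of values returned can serve as a quick consistency check against the expected number of weights and biases of the network.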
XXXXXX.ewald.YYY.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file is written in runner_mode 2 for charge fits (keyword
electrostatic_type 1, 3, or 4). It contains the electrostatic NN weight
parameters of epoch XXXXXX for the element of nuclear charge YYY.
For readability the file contains a lot of additional information. Only
the first column, which contains the weight values, is relevant for
RuNNer. The remaining columns contain the following information:
- a or b for weights connecting two nodes or a bias weight, respectively
- A counter for the number of the weight
- Information on the role of the weight in the NN. In case of a weight connecting two nodes, four numbers are given specifying the source layer and node as well as the target layer and node. In case of a bias weight, only the target layer and node are given.
XXXXXX.hardness.YYY.out
Mode 1: --- • Mode 2: Output • Mode 3: ---
This file is written in runner_mode 2 for charge fits (keyword
electrostatic_type 4). It contains a single hardness value of epoch XXXXXX
for the element of nuclear charge YYY.