4G-HDNNPs

Overview

The total energy expression of 4G-HDNNP consists of two parts, the short-range energy and long-range electrostatic energy,

\[ \begin{align} E_{\mathrm{total}} = E_{\mathrm{elec}} + E_{\mathrm{short}} \quad . \label{eq:4GTotalEnergy} \end{align} \]

The electrostatic energy is calculated from the atomic charges, which are obtained from a charge equilibration scheme (Qeq) based on environment-dependent electronegativities that are predicted by atomic neural networks. The short-range energy is a sum of atomic energies computed by atomic neural networks, which compared to 2D-HDNNPs have an additional input node providing the atomic charge. With this additional information the short-range atomic neural networks are able to accurately predict energetic changes due to modifications in the local electronic structure resulting from long-range charge transfer. Atomic forces are calculated by taking the negative analytic derivatives of the total energy in Eq. $\eqref{eq:4GTotalEnergy}$ with respect to the atomic positions.

Charge equilibration

The charge equilibration method is based on the idea that the electrons in a system are distributed in a way that minimizes the total energy. This energy does not only include the electrostatic Coulomb terms but also a term that describes the local energy caused by some amount of charge on an atom. This local energy is usually described using a Taylor series up to second order. The expansion factor for the linear term is called the electronegativity ($\chi_i$), while the second order factor is called the atomic hardness ($J_i$). The charge on each atom is Gaussian distributed with width $\sigma_i$. This results in the energy expression for the charge equilibration scheme

\[ \begin{align} E_{\mathrm{Qeq}} = E_{\mathrm{elec}} + \sum_{i=1}^{N_{\mathrm{at}}} \chi_i Q_i + \frac{1}{2} J_i Q_i^2 \quad . \end{align} \]

$N_{\mathrm{at}}$ is the number of atoms in the system. $E_{\mathrm{elec}}$ is the electrostatic energy resulting from the Gaussian charge distributions,

\[ \begin{align} E_{\mathrm{elec}} = \sum_{\mathrm{i<j}}^{N_{\mathrm{at}}} \frac{\mathrm{erf}{\frac{r_{ij}}{\sqrt{2} \gamma_{ij}}}}{r_{ij}} Q_{i} Q_{i} + \sum_{i=1}^{N_{\mathrm{at}}} \frac{Q_{i}^{2}}{2 \sigma_{i} \sqrt{\pi}} \end{align} \]

and

\[ \begin{align} \gamma_{ij} = \sqrt{\sigma_i^2 + \sigma_j^2} \quad . \end{align} \]

As only linear and quadratic terms in $Q_i$ appear in $E_{\mathrm{elec}}$ and $E_{\mathrm{Qeq}}$, they can be expressed using matrix notation

\[ \begin{align} A_{ij} = \begin{cases} \frac{1}{\sigma_i \sqrt{\pi}}, & \text{if}\ i=j \\ \frac{\mathrm{erf}{\frac{r_{ij}}{\sqrt{2} \gamma_{ij}}}}{r_{ij}}, & \text{otherwise} \end{cases} \end{align} \]

\[ \begin{align} E_{\mathrm{elec}} = \frac{1}{2} \mathbf{Q}^{\top} \mathbf{E} \mathbf{Q} \end{align} \]

and

\[ \begin{align} E_{\mathrm{Qeq}} = \frac{1}{2} \mathbf{Q}^{\top} \mathbf{A} \mathbf{Q} + \mathbf{Q}^\top{\mathbf{\chi}} \quad , \end{align} \]

with $\mathbf{Q}$ being a column vector containing the atomic charges $Q_i$, being a column vector of the electronegativities and $\mathbf{A}$ being the matrix

\[ \begin{align} A_{ij} = \begin{cases} J_i + \frac{1}{\sigma_i \sqrt{\pi}}, & \text{if}\ i=j \\ \frac{\mathrm{erf}{\frac{r_{ij}}{\sqrt{2} \gamma_{ij}}}}{r_{ij}}, & \text{otherwise} \end{cases} \quad . \end{align} \]

It has to be noted that because the total energy of a continuous charge distribution is always positive, the matrix $\mathbf{A}$ is positive definite, if the $J_i>0$. The $Q_i$ are now chosen, so that they minimize the energy $E_{\mathrm{Qeq}}$ under the additional constraint of total charge conservation.

\[ \begin{align} \sum_{i=1}^{N_{\mathrm{at}}} Q_i = Q_{\mathrm{tot}} \quad . \end{align} \]

To solve this minimization problem, we set the derivatives with respect to the charges to zero,

\[ \begin{align} \frac{dE_{\mathrm{Qeq}}}{dQ_i} = 0 \quad. \end{align} \]

Including the constraint of total charge conservation using a Lagrange multiplier $\lambda$ we end up with the system of linear equations

\[ \begin{align} \left( \begin{array}{ccc|c} & & & 1 \\ & A_{ij} & & \vdots \\ & & & 1 \\\hline 1 & ... & 1 & 0 \end{array} \right) \begin{pmatrix} Q_1 \\ \vdots \\ Q_{M} \\\hline \lambda \end{pmatrix} = \begin{pmatrix} -\chi_1 \\ \vdots \\ -\chi_M \\\hline Q_{\mathrm{tot}} \end{pmatrix} \,, \label{eq:qeq} \end{align} \]

which we can rewrite for simplicity as

\[ \begin{align} \mathbf{A'}\mathbf{Q'} = \mathbf{b} \quad , \end{align} \]

where $\mathbf{A'}$ and $\mathbf{Q'}$ represent the ($N_{\mathrm{at}}+1\times N_{\mathrm{at}}+1$) matrix and column vector in the left hand side respectively, while $\mathbf{b}$ is the column vector on the right hand side. The electronegativities $\chi_i$ are predicted by neural networks, for each atom individually, depending on the local chemical environments. The hardness values of $J_i$ are constant for a given element, and they are also optimized during the training of the neural networks.

Derivatives used for the calculation of the forces and for the neural network training

In this section we provide further details about some of the derivatives which are required for the calculation of the atomic forces as well as for the gradient-based optimization of the neural network parameters. The atomic force component $F_{r_\alpha}$ is given as the negative derivative of the energy with respect to the atomic coordinate $r_{\mathrm{\alpha}}$,

\[ \begin{align} F_{r_\alpha} = - \frac{dE_{\mathrm{total}}(\mathbf{R}, \mathbf{Q}(\mathbf{R}))} {dr_{\mathrm{\alpha}}} = \frac{\partial E_{\mathrm{total}}}{\partial r_{\mathrm{\alpha}}} - \sum_i \frac{\partial E_{\mathrm{total}}}{\partial Q_i} \frac{\partial Q_i}{\partial r_{\mathrm{\alpha}}} \quad . \end{align} \]

In this equation the partial derivatives of the atomic charges with respect to the atomic positions appear. As we will show in section , the calculation of these terms can be avoided for the determination of the forces. During the training phase, however, these derivatives as well as the derivatives of the charges w.r.t. the electronegativity and hardness are needed.

Calculation of the charge derivative with respect to the Cartesian coordinates

To calculate the $\frac{\partial Q_i}{\partial r_{\mathrm{\alpha}}}$ we take the derivative with respect to the spatial coordinate $r_{\mathrm{\alpha}}$ of the Qeq in Eq.$\eqref{eq:qeq_prime}. Reordering the terms yields

\[ \begin{align} \mathbf{A} \frac{\partial \mathbf{Q}}{\partial r_{\mathrm{\alpha}}} = \frac{-\partial \mathbf{\mathbf{\chi}}}{\partial r_{\mathrm{\alpha}}} - \frac{\partial \mathbf{A}}{\partial r_{\mathrm{\alpha}}} \mathbf{Q} \quad \end{align} \]

as well as a Lagrange multiplier that ensures $\sum_i \frac{\partial Q_i}{\partial r_\alpha} = 0$. To obtain all the 3$N_{\mathrm{at}}$ required derivatives we will have to solve 3 $N_{\mathrm{at}}$ linear equation systems of size $N_{\mathrm{at}}$+1. However, this can be avoided as explained in the following section, which allows the calculation of the total force by only solving one linear equation system.

## Calculation of the charge derivative with respect to electronegativity and hardness

With a similar procedure we can calculate the equation systems for the derivatives w.r.t. the electronegativity and hardness

\[ \begin{align} \mathbf{A} \frac{d\mathbf{Q}}{d\chi_i} = -\mathbf{\delta}_i \end{align} \]

and

\[ \begin{align} \mathbf{A} \frac{d\mathbf{Q}}{dJ_i} = -\mathbf{Q} \quad . \end{align} \]

$\mathbf{\delta}_i$ is a vector filled with zeros except entry $i$, which is one. As before, a Lagrange multiplier will be necessary, to ensure that the sum of the derivatives adds up to zero.

An efficient method for the force computation

In the last section we showed how the forces can be calculated using the partial derivatives of the charges w.r.t. the atomic coordinates. This calculation is computationally expensive, since 3 $N_{\mathrm{at}}$ linear equation systems need to be solved. This can be avoided by exploiting a method that allows the calculation of the forces by solving only one linear equation system instead.

The total energy of the 4G-HDNNP is a function of the atomic coordinates ($\V{R}$) and the charges ($\mathbf{Q}$), which also depend on the atomic coordinates,

\[ \begin{align} E_{\mathrm{total}} = E_{\mathrm{total}}(\mathbf{R}, \mathbf{Q}(\mathbf{R})) \quad. \end{align} \]

We now define an auxiliary function $L$ with

\[ \begin{align} L = E_{\mathrm{total}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \lambda_i \left( \sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} Q{'}_j - b_i \right) \quad . \end{align} \]

Here $\sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} Q{'}_j - b_i$ are the differences of the left hand side minus the right hand sides of Eq. \eqref{eq:qeq_prime}, which were solved to determine the charges $Q_i$. These terms are therefore always zero, making $L$ equal to $E_{\mathrm{total}}$. We now choose $\mathbf{\lambda}$ such that the partial derivatives $\frac{\partial L}{\partial Q{'}_i}$ are zero,

\[ \begin{align} \frac{\partial L}{\partial Q{'}_i} = \frac{\partial E_{\mathrm{total}}}{\partial Q{'}_i} + \sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} \lambda_j =0 \quad . \end{align} \]

For this we solve the linear equation system

\[ \begin{align} \sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} \lambda_j = \frac{-\partial E_{\mathrm{total}}}{\partial Q{'}_i} \quad . \end{align} \]

Note that $\mathbf{A{'}}$ is a symmetric matrix. We now turn to the derivative $\frac{dL}{dr_{\alpha}}$, which is equal to $\frac{dE_{\mathrm{total}}}{dr_{\alpha}}$.

\[ \begin{align} \frac{dE_{\mathrm{total}}}{dr_{\alpha}} = \frac{dL}{dr_{\alpha}} = \frac{\partial E_{\mathrm{total}}}{\partial r_{\alpha}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \frac{\partial E_{\mathrm{total}}}{\partial Q{'}_i} \frac{\partial Q{'}_i}{\partial r_{\alpha}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \lambda_i \left( \sum_{j=1}^{N_{\mathrm{at}}+1} \frac{\partial {A{'}_{ij}}}{\partial r_{\alpha}} Q{'}_j + \sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} \frac{\partial Q{'}_j}{\partial r_{\alpha}} - \frac{\partial b_i}{\partial r_{\alpha}} \right) \end{align} \]

Rearranging the equation yields

\[ \begin{align} \frac{dE_{\mathrm{total}}}{dr_{\alpha}} = \frac{dL}{dr_{\alpha}} = \frac{\partial E_{\mathrm{total}}}{\partial r_{\alpha}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \left( \frac{\partial E_{\mathrm{total}}}{\partial Q{'}_i} + \sum_{j=1}^{N_{\mathrm{at}}+1} A{'}_{ij} \lambda_j \right) \frac{\partial Q{'}_i}{\partial r_{\mathrm{\alpha}}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \lambda_i \left( \sum_{j=1}^{N_{\mathrm{at}}+1} \frac{\partial A{'}_{ij}}{\partial r_{\alpha}} Q{'}_j - \frac{\partial b_i}{\partial r_{\alpha}} \right) \quad . \end{align} \]

The term $\frac{\partial E_{\mathrm{total}}}{\partial Q{'}_i} + \sum_j A{'}_{ij} \lambda_j$ is zero by definition of $\mathbf{\lambda}$ and can therefore be omitted, which leads to the expression

\[ \begin{align} \frac{dE_{\mathrm{total}}}{dr_{\alpha}} = \frac{\partial E_{\mathrm{total}}}{\partial r_{\alpha}} + \sum_{i=1}^{N_{\mathrm{at}}+1} \lambda_i \left( \sum_{j=1}^{N_{\mathrm{at}}+1} \frac{\partial A{'}_{ij}}{\partial r_{\alpha}} Q{'}_j - \frac{\partial b_i}{\partial r_{\alpha}} \right) \quad . \end{align} \]

Charge equilibration for periodic systems

The Qeq equations for periodic boundary conditions are essentially identical to the corresponding equations for free boundary conditions, and the main difference is the calculation of the matrix $\mathbf{A}$. Because of the periodic boundary conditions we have to resort to an Ewald summation to calculate the electrostatic interaction energy.

The basic idea of Ewald summation is, that by placing Gaussian charges of the opposite sign on each of the point charges, the remaining electrostatic interaction becomes short-ranged. This short-ranged energy can then be calculated in real space ($E_{\mathrm{real}}$). We then subtract the interaction energy of the auxiliary Gaussian charges again to obtain the desired total energy of the point charges. This interaction energy of the Gaussians can be efficiently calculated in reciprocal space, resulting in the energies $E_{\mathrm{recip}}$ and $E_{\mathrm{self}}$. The electrostatic energy of $N_{\mathrm{at}}$ point charges can be hence calculated as

\[ \begin{align} E_{\mathrm{elec}} = E_{\mathrm{real}} + E_{\mathrm{recip}} + E_{\mathrm{self}} \quad . \end{align} \]

The real space part is given by

\[ \begin{align} E_{\mathrm{real}}^{\mathrm{pc}} = \frac{1}{2}\sum_{i=1}^{N_{\mathrm{at}}} \sum_{j \neq i}^{N_{\mathrm{neig}}} Q_i Q_j \frac{\mathrm{erfc} \frac{r_{ij}}{\sqrt{2} \eta}}{r_{ij}} \end{align} \]

Here, $N_{\mathrm{neig}}$ indicates, that the sum goes over all neighbouring atoms withing the real space cutoff radius $r_{\mathrm{cut}}$. $r_{ij}$ is the distance between atoms $i$ and $j$. The reciprocal space part is

\[ \begin{align} E_{\mathrm{recip}}^{\mathrm{pc}} = \frac{2 \pi}{V} \sum_{\mathbf{k}\neq0} \frac{\exp\left( \frac{-\eta^2 {\vert \mathbf{k} \vert}^2}{2} \right)}{{\vert \mathbf{k} \vert}^2} {\vert S(\mathbf{k}) \vert}^2 \end{align} \]

with

\[ \begin{align} S(\mathbf{k}) = \sum_{i=1}^{N_{\mathrm{at}}} Q_i \exp(i \mathbf{k} \cdot \mathbf{r}_i) \end{align} \]

$V$ being the volume of the unit cell and the sum going over all reciprocal lattice points inside reciprocal space cutoff radius $r_{\mathrm{cut}}^{\mathrm{recip}}$. Finally, the self-interaction correction is

\[ \begin{align} E_{\mathrm{self}}^{\mathrm{pc}} = -\sum_{i=1}^{N_{\mathrm{at}}} \frac{Q_i^2}{\sqrt{2\pi}\eta} \end{align} \]

In these equations $\eta$ is the standard deviation of the Gaussian charges, which are placed on the point charges to remove the long-range interactions.

Since we use Gaussian charge distributions for the charge equilibration process, the following terms have to be added that account for the different interaction in the short-range part as well as for the self interaction of the Gaussian charges.

\[ \begin{align} E_{\mathrm{elec}}^{\mathrm{Gauss}} = E_{\mathrm{elec}}^{\mathrm{pc}} - \frac{1}{2}\sum_{i=1}^{N_{\mathrm{at}}} \sum_{j\neq i}^{N_{\mathrm{neig}}} Q_i Q_j \frac{\mathrm{erfc}\frac{r_{ij}}{\sqrt{2} \gamma}}{r_{ij}} + \sum_{i=1}^{N_{\mathrm{at}}} \frac{Q_i^2}{2\sqrt{\pi}\sigma_i} \end{align} \]

Here $E_{\mathrm{elec}}^{\mathrm{pc}}$ is the electrostatic energy of the point charges as given above.

The important observation is that the total energy expression of the Ewald summation contains only terms of the form $\frac{1}{2} e_{ij} Q_i Q_j$. By calculating the individual coefficients $e_{ij}$ we can therefore construct the matrix $\mathbf{E}$, so that

\[ \begin{align} E_{\mathrm{elec}} = \frac{1}{2} \mathbf{Q}^\top \mathbf{E} \mathbf{Q} \quad . \end{align} \]

Including the terms for the hardness and adding the electronegativity results in a formalism equivalent to that of the Qeq method for free boundary condition.

Calculation of of the derivatives of the coefficient matrix

The differentiation of the above equation allows us to calculate the derivatives $\frac{dA_{ij}}{dr_{\alpha}}$. Explicit calculation of these derivatives however can be computationally expensive, as they are quite numerous ($3 N_{\mathrm{at}}^3$). Most of these coefficients however are zero, and the matrix is very sparse. As only the product $\sum_{j=1} \frac{dA_{ij}}{dr_{\alpha}} Q_j$ is ever needed in our computations, explicit calculation can be avoided and only the non zero terms have to be considered.

As $\sum_{j=1} \frac{dA_{ij}}{dr_{\alpha}} Q_j$ has to be calculated anyways for the efficient computations of the forces, we can also use it to calculate the electrostatic forces.

\[ \begin{align} \frac{\partial E_{\mathrm{elec}}}{\partial r_{\alpha}} = \frac{1}{2} \sum_i Q_i \left( \sum_j \frac{dA_{ij}}{dr_{\alpha}} Q_j \right) \label{eq:el-forces} \end{align} \]

Calculation of the electrostatic energies and forces boils down to the following.

Calculate symmetry functions.
Calculate environment depend electronegativities using a first set of neural networks.
Construct the matrix $\mathbf{A}$
Calculate partial charges $Q_i$ by solving the system of $N_{\mathrm{at}}+1$ linear equations (Eq. $\eqref{eq:qeq}$)
Feed the atomic charges into the atomic NNs to calculate the short range energy and forces
Use the efficient method (see above) to calculate the total force by solving one more $N_{\mathrm{at}} + 1$ dimensional linear equation system.

The only difference for the periodic case is in the step 3 and 6, where Ewald summation has to be used.

Short-range part

In the short range neural network, we also include the atomic charge via an additional input neuron, such that the atomic energy contribution also depends on global charge distributions. The expression of atomic energies is very similar to the 2G-HDNNP and it can be expressed as a function of symmetry functions and atomic charges. The atomic forces can be calculated by taking the derivatives of the energy with respect to the atomic positions

\[ \begin{align} F_{\mathrm{\alpha}} = -\sum_{i=1}^{N_{\mathrm{at}}}\frac{dE_i}{d r_{\mathrm{\alpha}}} = -(\sum_{j=1}^{N_{\mathrm{neig},i}} \sum_{k=1}^{N_{\mathrm{SF},j}} \frac{\partial E_j}{\partial G_{j,k}} \cdot \frac{\partial G_{j,k}}{\partial r_{\mathrm{\alpha}}} + \sum_{j=1}^{N_{\mathrm{at}}} \sum_{k=1}^{N_{\mathrm{at}}} \frac{\partial E_j}{\partial Q_{k}} \cdot \frac{\partial Q_{k}}{\partial r_{\mathrm{\alpha}}}) \end{align} \]

where $F_{\mathrm{\alpha}}$ and $G_{j,k}$ represent the force component $\mathrm{\alpha}$ and the $k^{\mathrm{th}}$ symmetry function of atom $j$ respectively. In addition, $N_{\mathrm{SF},i}$, $N_{\mathrm{neig},i}$ equal to number of symmetry functions and neighbors of atom $i$ and $N_{\mathrm{at}}$ the total number of atom. Note that $N_{\mathrm{neig},i}$ includes the atom $i$ itself. If the method described in section is used, the last term, which includes the partial derivatives of the atomic charges w.r.t. the atomic coordinates, can be avoided.