Bayesian Error Estimation in DFT simulations

Motivation

Density functional theory (DFT) has become the most widespread framework for studying materials from a fully quantum-mechanical perspective because of its favourable trade-off between accuracy and computational cost. Although the theory is in principle exact, in practice it is implemented with approximations: some are made for computational reasons, such as the Born–Oppenheimer approximation, and others because the exact term is not known, as is the case for the exchange–correlation (XC) energy. The accuracy of the different approximations has been tested in many fields, but the error they introduce when DFT is applied to new systems remains a concern and limits its predictive power.

Exchange-Correlation Functional

We present the development of a new meta-GGA exchange–correlation functional with uncertainty quantification capabilities, approached from a machine learning point of view and extending the work of Wellendorff et al. [xx]. We determine the regression coefficients with a Bayesian approach based on a relevance vector machine, which automatically selects the most relevant terms and drops the rest. This keeps the number of terms in the linear model small and helps avoid overfitting.
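The selection behaviour described above can be illustrated with a minimal sketch of sparse Bayesian linear regression in the style of a relevance vector machine. The function name `fit_rvm`, the design matrix `Phi`, the pruning threshold and the iteration count are illustrative assumptions, and the update rules are the standard evidence-maximisation ones, not necessarily those used in this work.

```python
import numpy as np

def fit_rvm(Phi, y, n_iter=200, prune_at=1e6):
    """Sparse Bayesian linear regression (RVM-style evidence maximisation).

    Model: y = Phi @ w + noise, with prior w_i ~ N(0, 1 / alpha_i).
    Returns the indices of the retained terms and the posterior mean and
    covariance of their coefficients.
    """
    n_samples, n_terms = Phi.shape
    alpha = np.ones(n_terms)       # prior precisions, one per term
    beta = 1.0 / np.var(y)         # noise precision, initial guess
    keep = np.arange(n_terms)      # indices of terms still in the model
    for _ in range(n_iter):
        P = Phi[:, keep]
        # Gaussian posterior over the retained coefficients
        cov = np.linalg.inv(np.diag(alpha[keep]) + beta * P.T @ P)
        mean = beta * cov @ P.T @ y
        # gamma_i measures how well the data determine coefficient i
        gamma = 1.0 - alpha[keep] * np.diag(cov)
        # Small constant guards against division by zero
        alpha[keep] = gamma / (mean**2 + 1e-12)
        resid = y - P @ mean
        beta = (n_samples - gamma.sum()) / (resid @ resid)
        # A very large precision pins a coefficient to zero: drop that term
        keep = keep[alpha[keep] < prune_at]
    # Final posterior for the terms that survived pruning
    P = Phi[:, keep]
    cov = np.linalg.inv(np.diag(alpha[keep]) + beta * P.T @ P)
    mean = beta * cov @ P.T @ y
    return keep, mean, cov
```

On synthetic data in which only some columns of `Phi` contribute to `y`, `keep` would be expected to retain those columns, although pruning of every irrelevant column is not guaranteed for a given data set. The returned `mean` and `cov` describe a Gaussian distribution over the retained coefficients.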

To specify a DFT XC approximation, we have to provide two models for the exchange and correlation functionals, $E_x[n; \xi^x]$ and $E_c[n; \xi^c]$, where $\xi^x$ and $\xi^c$ are two sets of parameters that determine the XC model [xx]. They can be determined either empirically, by fitting them to experimental data [xx], or from theoretical considerations [xx]. Writing these parameters explicitly, the DFT energy functional is then

$$E_{\mathrm{DFT}}[n; \xi^x, \xi^c] = E_b[n] + E_x[n; \xi^x] + E_c[n; \xi^c]$$

We focus on the exchange contribution only, taking the correlation energy term from existing functionals such as PBE, PBEsol, MS0 or TPSS. To specify our exchange energy model, we use the exchange enhancement factor, which in this work we represent as a linear model in a set of basis functions $\phi_i(s,\alpha)$,

$$F^x(s,\alpha) = \sum_i \xi^x_i \phi_i(s,\alpha) = (\xi^x)^T \phi(s,\alpha)$$
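
As a minimal sketch of this linear model, the snippet below evaluates the enhancement factor as a dot product between a coefficient vector and a basis vector. The basis (products of Legendre polynomials in two transformed variables), the transformations `t_s` and `t_alpha`, the constant `Q` and all function names are illustrative assumptions; the text does not specify which basis functions are used.

```python
import numpy as np
from numpy.polynomial.legendre import legval

Q = 6.5  # hypothetical scale constant for the s transformation (assumption)

def t_s(s):
    """Map s in [0, inf) to [-1, 1) (illustrative choice)."""
    return 2.0 * s**2 / (Q + s**2) - 1.0

def t_alpha(alpha):
    """Map alpha in [0, inf) to (-1, 1] (illustrative choice)."""
    return (1.0 - alpha**2) ** 3 / (1.0 + alpha**3 + alpha**6)

def legendre(k, x):
    """Evaluate the Legendre polynomial P_k at x."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0  # select P_k in the Legendre series
    return legval(x, coeffs)

def basis(s, alpha, order=2):
    """Return phi(s, alpha): products P_m(t_s) * P_n(t_alpha) for m, n <= order."""
    ts, ta = t_s(s), t_alpha(alpha)
    return np.array([legendre(m, ts) * legendre(n, ta)
                     for m in range(order + 1)
                     for n in range(order + 1)])

def enhancement_factor(xi, s, alpha, order=2):
    """Linear model: xi^T phi(s, alpha)."""
    return float(xi @ basis(s, alpha, order))

# Usage: with only the constant term switched on, the factor is 1 everywhere
xi = np.zeros(9)
xi[0] = 1.0
f = enhancement_factor(xi, s=0.7, alpha=1.3)
```

Because the model is linear in the coefficients, fitting them reduces to a linear regression problem once the basis has been evaluated on the training data.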

The first objective is to determine the parameters of this linear model. To do so, we use a Bayesian framework, which yields a full probability distribution for the value of each parameter. To train the model...

Errors in other quantities of interest

DFT simulations are routinely used to predict the values of a number of properties of solids and molecules. However, because of the approximations involved, their predictive power is limited. We aim to improve this situation by providing predictive distributions of the quantities of interest, which give not only a value but also a measure of confidence in the result.
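
One way to obtain such a predictive distribution, sketched here under strong assumptions, is to draw an ensemble of coefficient vectors from a Gaussian posterior and evaluate the quantity of interest for each member. The sketch assumes that the quantity depends linearly on the coefficients through a hypothetical feature vector `g`. The posterior mean and covariance below are made-up numbers for illustration only, and the function name is not taken from any existing code.

```python
import numpy as np

def predictive_distribution(mean, cov, g, n_samples=2000, seed=0):
    """Propagate a Gaussian posterior over coefficients to a scalar quantity.

    mean, cov : posterior mean and covariance of the coefficients (assumed Gaussian)
    g         : hypothetical feature vector, so that quantity = xi @ g
    Returns the ensemble mean and standard deviation of the quantity.
    """
    rng = np.random.default_rng(seed)
    xi_samples = rng.multivariate_normal(mean, cov, size=n_samples)
    values = xi_samples @ g  # one prediction per ensemble member
    return values.mean(), values.std(ddof=1)

# Made-up posterior and feature vector, for illustration only
mean = np.array([1.0, 0.2])
cov = np.array([[0.010, 0.002],
                [0.002, 0.040]])
g = np.array([0.5, -1.0])

estimate, uncertainty = predictive_distribution(mean, cov, g)
```

For a quantity that is linear in the coefficients, the same result follows analytically, with mean `mean @ g` and variance `g @ cov @ g`. Sampling is shown because it carries over to quantities that are not linear in the coefficients.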