Abstract
Are the currently used reference methods to approximately solve the many-electron Schrödinger equation accurate enough? Here, we investigate recently reported discrepancies of noncovalent interaction energies for large molecules predicted by two of the most widely-trusted many-electron theories: diffusion quantum Monte Carlo and coupled-cluster theory. We are able to unequivocally pin down the main source of the puzzling discrepancies and present modifications to widely-used coupled-cluster methods needed for more accurate noncovalent interaction energies of large molecules on the hundred-atom scale. This is of critical impact for a wide range of applications which rely on highly-accurate interaction energies between large and polarizable molecules.
Introduction
The calculation of electronic transition energies for a single hydrogen atom in 1926 marks the beginning of an incredibly successful era of quantum mechanics1,2. Shortly after this breakthrough, Paul Dirac famously noted that the underlying physical laws necessary for much of physics and all of chemistry are now completely known. However, he also pointed out that the exact application of these laws leads to equations that are much too complicated to solve3. This paradigm of theoretical chemistry and physics remains prevalent today. In particular, the exponential growth of the computational complexity of the many-electron problem with system size still makes an exact solution of the electronic Schrödinger equation for more than a few atoms impossible. As a consequence, a hierarchy of increasingly accurate methods has emerged that is capable of producing reference results at a high yet tractable computational cost. These reference results are pivotal for developing, assessing, and further improving computationally more efficient but generally less accurate approximations. In this context, a prime example was the numerical prediction of highly accurate ground state energies for the uniform electron gas using diffusion Monte Carlo methods, which supported the development of approximate exchange-correlation functionals that ultimately led to the breakthrough of density functional theory in computational materials science during the last decades4,5,6.
At present, quantum mechanical many-electron calculations of systems containing more than 100 atoms have become possible thanks to methodological developments and considerable growth in computing power. These methodological improvements are often based on taking advantage of the relative nearsightedness of many-electron correlation effects7,8,9. In this manner, the scaling of the computational complexity with respect to system size can be lowered. However, recently several works10,11,12,13 showed that there exist alarming discrepancies between predicted interaction energies for large molecules when using two of the most widely-trusted highly accurate many-electron theories: DMC and CCSD(T), which stand for diffusion Monte Carlo and coupled-cluster theory using single, double, and perturbative triple particle-hole excitation operators, respectively. These observations are a source of great concern in the electronic structure theory community because, in the case of noncovalent interactions between molecular complexes, both CCSD(T) and DMC are considered highly reliable benchmark methods14,15,16. Furthermore, the observed discrepancies are large enough to cause qualitative differences in calculated properties of materials, which can have scientific and technological implications. For example, accurate crystal structure predictions are crucial in drug design17,18,19. Similarly, reliable reference methods are essential for discovering and designing new functional materials for applications such as renewable energy storage and conversion, including catalysis, or solar cells20,21,22. Finally, as machine learning increasingly pervades all areas of computational first-principles physics, the accuracy of these reference methods, which provide the training data, becomes even more critical23,24,25.
In the following, we analyze a set of large molecular systems where large discrepancies between approximated versions of DMC and CCSD(T) were observed10,11. Importantly, a direct experimental measurement of the computed interaction energies of these systems is complicated and prone to significant uncertainties. Therefore, it is an open challenge to identify the origin of the observed deviations for the employed highly accurate yet approximate theoretical approaches.
Results
We present an approach which allows us to unambiguously test if the employed approximations for DMC and CCSD(T) cause the puzzling discrepancies between their predictions. In particular, our methodology exhibits three striking advantages. Firstly, due to its efficient and massive computational parallelization, we omit any local correlation approximation, as was employed for the CCSD(T) calculations in refs. 11,12,13. Secondly, we use a plane wave basis set to enable an unbiased assessment of the quality of previously employed tabulated atom-centered basis functions. Thirdly, we are able to study the influence of higher-order contributions to the many-electron perturbation expansion beyond CCSD(T) theory for large molecular complexes.
In order to demonstrate the reliability of our plane wave basis approach, we first investigate the parallel displaced benzene dimer as a benchmark. We find that our approach effectively addresses the challenges of noncovalent interactions between large molecules, combining the compactness and systematic improvability of natural orbitals without the near-linear dependencies that plague atom-centered Gaussian basis sets for densely packed structures. As discussed in the supplementary information, our fully converged estimate of the CCSD(T) interaction energy for the parallel displaced benzene dimer is −2.62 kcal/mol, in excellent agreement with the value of −2.70 kcal/mol obtained using Gaussian basis sets. We note that these results have been extrapolated to the complete basis set limit and, in the case of our plane wave basis calculations, also to the infinite box size limit.
Next, we turn to the parallel displaced coronene dimer (C2C2PD) interaction energy, where significant discrepancies between DMC and CCSD(T) have been observed10,11. A comparison to published second-order Møller-Plesset perturbation theory (MP2) interaction energies in Table 1 for C2C2PD reveals that our plane wave approach yields reliable results. Computational details about the basis set convergence are summarized in the supplementary information. As shown in Table 1, our canonical CCSD(T) estimates for the parallel displaced coronene dimer align closely with domain-based local pair-natural orbital (DLPNO-CCSD(T)), explicitly correlated pair-natural orbital (PNO-LCCSD(T)-F12) and local natural orbital (LNO-CCSD(T)) results, ruling out basis set incompleteness and local approximation errors as sources of discrepancies with DMC findings. Although our CCSD(T) calculations require about 100k CPU hours, the main purpose of these calculations is to serve as a valuable reference for computationally faster techniques whose approximations need to be checked carefully. It is noteworthy that the CCSD(T) interaction energy contains a large (T) contribution of about −8 kcal/mol, indicating that the correct treatment of triple particle-hole excitation effects for the electronic correlation plays a crucial role.
All that glitters is not gold: overcorrelation in CCSD(T)
Having ruled out errors from local approximations and incomplete basis sets for the parallel displaced coronene dimer (C2C2PD), we seek to assess the (T) approximation, which contributes significantly to the interaction energy of C2C2PD. In passing we anticipate that the (T) contribution to the interaction energy is also relatively large for all other systems with a significant discrepancy reported in ref. 11 (see supplementary information).
The (T) approximation was introduced in the seminal work by Raghavachari et al.26. Since then, it has become one of the most widely-used benchmark methods for weakly correlated systems, sometimes referred to as the ‘gold standard’ of molecular quantum chemistry. However, we argue that the interaction energies in CCSD(T) theory, which are in part significantly too strong, are caused by the employed truncation in the approximation of the triple particle-hole excitation operator. These shortcomings are comparable to the issue of too strong interaction energies from truncated perturbation theories for systems with large polarizabilities, as discussed by Nguyen et al.27. As can be observed for the coronene dimer in Table 1, second-order Møller-Plesset perturbation theory (MP2), a truncated perturbation theory, exhibits this overestimation of the interaction energy. In the extreme case of an infinite polarizability, as occurs in metallic systems, MP2 and CCSD(T) even yield divergent correlation energies in the thermodynamic limit, which is referred to as the infrared catastrophe28,29. In contrast, a resummation of certain terms to infinite order can yield interaction energies with an accuracy that is less dependent on the polarizability. Prominent examples of such approaches include CCSD theory as well as the random-phase approximation. We have recently presented a method, denoted as CCSD(cT), that averts the infrared catastrophe of CCSD(T) by including selected higher-order terms in the triples amplitude approximation without significantly increasing the computational complexity29.
Understanding the discrepancy
For the present work it is important to note that the main difference between CCSD(cT) and CCSD(T) theory originates from the employed approximation to the triple particle-hole excitation amplitudes. The triple amplitudes of the (cT) approximation are given in diagrammatic and algebraic form by29

where \(\hat{V}\) and \({\hat{T}}_{2}\) stand for the Coulomb interaction and the double particle-hole excitation operator, respectively. For brevity, the contributions from the single excitation operator are not included and only one additional ‘direct’ diagram is depicted. Here, \({\Delta }_{abc}^{ijk}={\varepsilon }_{i}+{\varepsilon }_{j}+{\varepsilon }_{k}-{\varepsilon }_{a}-{\varepsilon }_{b}-{\varepsilon }_{c}\), with the ε’s being one-electron HF energies. The bra and ket states correspond to a triply excited state and the reference state, respectively. The (T) approximation disregards the term \([[\hat{V},{\hat{T}}_{2}],{\hat{T}}_{2}]\), which is included in (cT) and also occurs in full CCSDT theory. This term effectively screens the bare Coulomb interaction of the \([\hat{V},{\hat{T}}_{2}]\) term and has the opposite sign, making it crucial for systems with large polarizability. However, for small and weakly polarizable systems the \([[\hat{V},{\hat{T}}_{2}],{\hat{T}}_{2}]\) contribution is small, so that the (T) and (cT) approximations agree, as was already shown for a set of small molecules29.
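The display equation referred to above is not reproduced in this text. As a hedged sketch only, an algebraic form that is consistent with the symbols defined here could read as follows. The factor 1/2 is an assumption taken from the standard Baker-Campbell-Hausdorff expansion of the similarity-transformed Hamiltonian and should be checked against ref. 29. The symbols \(T_{ijk}^{abc}\), \(\Psi_{ijk}^{abc}\) and \(\Psi_{0}\) are notation introduced here for the triples amplitudes, the triply excited state and the reference state.

```latex
T_{ijk}^{abc}\,\Delta_{abc}^{ijk}
  = \langle \Psi_{ijk}^{abc} |\,
      [\hat{V},\hat{T}_{2}]
      + \tfrac{1}{2}\,[[\hat{V},\hat{T}_{2}],\hat{T}_{2}]
    \,| \Psi_{0} \rangle
```

In this notation, dropping the double commutator on the right-hand side gives the corresponding (T) amplitudes.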
We now demonstrate that using CCSD(cT) instead of CCSD(T) theory restores excellent agreement for noncovalent interaction energies with DMC findings. First, we consider again the coronene dimer. Table 1 shows that the binding energy for the coronene dimer calculated at the level of CCSD(cT) theory is almost 2 kcal/mol closer to the DMC estimate than the CCSD(T) value, achieving chemical accuracy (1 kcal/mol) in comparison to DMC after subtracting error bars. Next, we investigate the accuracy of CCSD(T) and CCSD(cT) compared to DMC for noncovalent interactions in smaller molecules. To this end, we study a set of dimers containing up to 24 atoms that were also investigated in ref. 11. This gives us another opportunity to assess the effect of local approximations at the level of CCSD(T) theory. Figure 1 depicts the deviations of all computed interaction energies from DMC reference values taken from ref. 11. It should be noted that DMC references and differences to LNO-CCSD(T) interaction energies are shown with error bars11. Using our massively parallel computational approach, we are able to add canonical CCSD(T) interaction energies extrapolated to the CBS limit to the comparison with DMC. For these relatively small molecules, we can employ sufficiently large basis sets, reducing the remaining uncertainty to ~0.01 kcal/mol (see supplementary information). Importantly, our canonical CCSD(T) results agree with the LNO-CCSD(T) findings to within the error bars of the latter. The only minor exception is observed for the parallel displaced uracil dimer, where canonical CCSD(T) predicts a slightly stronger interaction. A comparison to DMC reveals that CCSD(T) theory predicts interaction energies that are on average about 0.3 kcal/mol stronger. Based on LNO-CCSD(T) and DMC data alone, such a statement cannot be made due to the relatively large and mostly overlapping error bars. However, our well converged canonical CCSD(T) findings allow us to draw such conclusions.
Only for the T-shaped pyridine and benzene dimers do the DMC and CCSD(T) binding energies agree to within the DMC errors. Note that these systems have a smaller (T) contribution to the interaction energy than the parallel displaced systems. All other systems exhibit small but significant discrepancies between CCSD(T) and DMC results, which is consistent with the even larger discrepancies reported for the larger molecules in ref. 11. Similar to our findings for the coronene dimer reported in Table 1, Fig. 1 shows that CCSD(cT) interaction energies agree significantly better with DMC values than their CCSD(T) counterparts.
The dimers are in parallel displaced (PD) or T-shaped (TS) configurations. The CCSD(T) and CCSD(cT) values are calculated complete basis set (CBS) estimates obtained from basis set extrapolation using aug-cc-pVTZ and aug-cc-pVQZ basis sets46,47 (details in the supplementary information). The LNO-CCSD(T) and DMC results as well as their uncertainties are taken from ref. 11. The uncertainty of LNO-CCSD(T) is shown by the error bars. The uncertainty of DMC is shown by the blue area. Source data are provided as a Source Data file.
Given the good agreement between DMC and CCSD(cT) for the systems studied above, an important question is whether CCSD(cT) is really more accurate than CCSD(T) for noncovalent interaction energies. To answer this question, we compare interaction energies of both approaches to higher-level CC methods for complexes from the S22 data set. As can be observed in Fig. 2, we find that while (T) is in good agreement with T for total energies, it overestimates interaction energies. Here T stands for the triples contribution to the correlation energy, ET = ECCSDT − ECCSD. While (cT) total energies do not match the accuracy of (T) when compared to T, this is a secondary consideration in the present work. Our primary focus is on interaction energies, where we demonstrate that (cT) significantly improves the accuracy compared to (T). As shown in Fig. 2, (cT) closely matches the T interaction energies, indicating its superior accuracy for weakly bound complexes. This effect is particularly strong for interaction energies with large triples correlation contributions. Here, additional triples contributions, neglected in the (T) model, are required to resolve the systematic overestimation of (T). In the supplementary information we present a more detailed analysis, which indicates that the most important contributions indeed come from the fifth-order ring diagram contributions depicted in Eq. (1).
The total triples correlation energy contribution ET on the x-axis is compared to the differences between the (T) and (cT) correlation energy contributions and ET for (a) total energies (ΔE) and (b) interaction energies (ΔEint). The inset in (a) depicts a zoom of the figure. The abscissa is represented by a gray dashed line. c Correlation between the ratio of (T) and (cT) and the ratio of the MP2 and CCSD correlation energy contributions to interaction energies of a set of dispersion-dominated complexes from the S22, L7 and S66 benchmark datasets48,49,50. The blue dashed line is a linear fit of the data points. Selected cases are labeled and visualized: Methane dimer, GCGC (guanine-cytosine tetramer), BeBePD (benzene dimer parallel displaced), C2C2PD (coronene dimer parallel displaced). All the data are available in the supplementary information. Source data are provided as a Source Data file.
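The quantities compared in Fig. 2a, b can be written down in a few lines. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the sign convention "approximate minus full triples" is an assumption about how ΔE and ΔEint are defined, and no energies from the paper are used.

```python
def triples_contribution(e_ccsdt: float, e_ccsd: float) -> float:
    """Full triples correlation energy, E_T = E_CCSDT - E_CCSD."""
    return e_ccsdt - e_ccsd


def interaction_energy(e_dimer: float, e_monomer_a: float, e_monomer_b: float) -> float:
    """Supermolecular interaction energy (or any additive contribution to it)."""
    return e_dimer - e_monomer_a - e_monomer_b


def deviation_from_full_triples(e_approx: float, e_t: float) -> float:
    """Deviation of an approximate triples energy, (T) or (cT), from E_T.

    Assumed sign convention: approximate value minus full-triples value.
    """
    return e_approx - e_t
```

For the interaction-energy panel, `interaction_energy` would be applied separately to the (T), (cT) and full-triples contributions of the dimer and the two monomers before taking the deviation.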
Estimating the overcorrelation of (T) for weak interactions
In summary, we have demonstrated that CCSD(cT) theory achieves excellent agreement for noncovalent interaction energies between molecular complexes compared to DMC and CCSDT theory. However, we stress that the CC series of methods (CCSD, CCSDT and CCSDTQ) is observed to yield monotonic and rapidly converging interaction energies for small and weakly bound complexes30. Based on this knowledge, we emphasize that the Q contribution to the interaction energies can be expected to be smaller than its T counterpart, but could possibly yield a significant contribution. Indeed, this is part of the reason for the success of the CCSD(T) approximation for very small molecules, where CCSD(T) is often fortuitously closer to CCSDTQ than CCSDT30. Here, we argue that this error cancellation no longer functions in the case of large molecular complexes involving strongly polarizable systems such as C2C2PD, C3GC and C60@[6]CPPA. A similar problem is known to occur in Møller-Plesset theory, where the truncation of the perturbation series also leads to significantly too strong interaction energies for systems with large polarizability, although MP2 yields relatively accurate interaction energies for systems with an intermediate polarizability27. To quantify and support the statements above, Fig. 2c illustrates that there exists a correlation between the ratio of (T) and (cT) with the ratio of the MP2 and CCSD correlation energy contributions to the interaction energies of all studied molecules in this work with dispersion-dominated interactions. This demonstrates that (T) exhibits a tendency to overestimate the absolute binding energy in a similar manner as MP2 for more polarizable systems. Although an overestimation of the (T) binding energy contribution compared to its (cT) counterpart by about 10% might yield a fortuitously better agreement between CCSD(T) and CCSDTQ, we argue that 20–30% overestimation is expected to yield significantly too strong interaction energies. 
For example, the values of (T)/(cT) for the Benzene-Benzene PD and coronene dimer are 1.2 and 1.3, respectively.
Having demonstrated and explained the reasons for the overestimation of absolute interaction energies at the level of CCSD(T) theory for small molecules with up to 24 atoms and the C2C2PD system, we now turn to the remaining large molecular complexes where significant absolute discrepancies between DMC and CCSD(T) have been observed. These systems include C3GC from the L7 data set and the C60@[6]CPPA buckyball-ring. Here, substantial differences in the binding energies of 2.2 kcal/mol and 7.6 kcal/mol, respectively, were reported after subtracting error bars. Although CCSD(cT) calculations for systems of that size are currently not feasible using our approach, we now introduce a simplified model that allows us to estimate the change in interaction energies from CCSD(T) to CCSD(cT) in an approximate manner. Given the linear trend between the different correlation energy contributions to the interaction energy depicted in Fig. 2c, it is possible to estimate the (cT) contribution for systems where only MP2, CCSD and (T) are known. These numbers can be calculated using a computationally efficient LNO-CCSD(T) implementation31,32,33,34,35,36. Results computed in this manner are denoted as CCSD(cT)-fit. Details about this procedure and the corresponding error estimates are provided in the supplementary information. Furthermore, the supplementary information includes a detailed justification of the CCSD(cT)-fit approach, which is based on extrapolation methods used to estimate the interaction energy between molecules and substrates modeled by planar molecules.
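The idea behind such a fit can be illustrated with a short sketch. It assumes that the linear trend is modeled as (T)/(cT) = a · (MP2/CCSD) + b and fitted by ordinary least squares. The actual procedure and its error estimates are described in the supplementary information and may differ. The function names are hypothetical and no data from the paper are included.

```python
import numpy as np


def fit_ratio_model(mp2_over_ccsd, t_over_ct):
    """Least-squares line through (MP2/CCSD, (T)/(cT)) points.

    Returns (slope, intercept). Inputs are ratios of correlation-energy
    contributions to the interaction energy for the training complexes.
    """
    x = np.asarray(mp2_over_ccsd, dtype=float)
    y = np.asarray(t_over_ct, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 polynomial fit
    return float(slope), float(intercept)


def estimate_ct(e_mp2, e_ccsd, e_t, slope, intercept):
    """Estimate the (cT) contribution when only MP2, CCSD and (T) are known.

    Assumption of this sketch: the fitted line predicts the ratio (T)/(cT),
    so the (cT) estimate is (T) divided by the predicted ratio.
    """
    predicted_ratio = slope * (e_mp2 / e_ccsd) + intercept
    return e_t / predicted_ratio
```

Adding the estimated (cT) contribution in place of (T) to the CCSD interaction energy would then give a CCSD(cT)-fit-like value in the spirit of the model described above.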
Table 2 gives our estimated CCSD(cT) interaction energies in comparison to CCSD(T) and DMC findings for seven large molecular complexes. A comparison between CCSD(cT)-fit and the explicitly calculated CCSD(cT) results for GGG, GCGC and C2C2PD shows that the linear regression model is sufficiently reliable for the systems studied in this work. Moreover, we have computed basis set converged (cT)-(T) estimates for C3A and PHE, which have been added to the LNO-CCSD(T) estimates to approximate CCSD(cT). These values and the agreement between CCSD(cT) and CCSD(cT)-fit explicitly confirm the CCSD(cT)-fit approach.
For comparison Table 2 also summarizes the DMC interaction energies from refs. 10 and 11, which agree to within at least 1 kcal/mol for GGG, C2C2PD, PHE and C3GC. For the remaining systems the DMC estimates show a larger discrepancy and for C60@[6]CPPA only one DMC estimate is available. Although the DMC binding energies have overlapping error bars, the remaining uncertainties are relatively large, illustrating that obtaining highly accurate interaction energies for these large molecules is also challenging for DMC.
As already discussed in ref. 11, CCSD(T) interaction energies listed in Table 2 exhibit large discrepancies compared to DMC for C2C2PD, C3GC and C60@[6]CPPA. In contrast, CCSD(cT)-fit resolves these discrepancies for all systems on the hundred-atom scale, achieving excellent agreement with DMC estimates of Al-Hamdani and Nagy et al.11 to within chemical accuracy (1 kcal/mol) after subtracting the error bars. Even for C60@[6]CPPA, which contains 132 atoms, a discrepancy of only 2.1 kcal/mol remains, although the error bar of CCSD(cT)-fit is relatively large in this case. We argue that the remaining discrepancies are potentially caused by uncertainties in DMC, CCSD(cT)-fit and the underlying LNO-CCSD(T) calculations. It should be noted that the error bars of LNO-CCSD(T) interaction energies are in some cases underestimated, as exemplified for the Uracil-Uracil PD dimer by the comparison between canonical CCSD(T) and LNO-CCSD(T) interaction energies shown in Fig. 1. Furthermore, the DMC interaction energy of C60@[6]CPPA has not yet been verified independently using a different DMC implementation as it was done for all other systems listed in Table 2. We also stress that in some cases the differences between the DMC estimates are larger than their respective error bars.
Discussion
Our work unequivocally demonstrates that, due to the employed truncation of the many-body perturbation series expansion, CCSD(T) theory, one of the most widely-used and accurate quantum chemistry approaches, in certain cases binds noncovalently interacting large molecular complexes too strongly. Our findings show that a simple yet efficient modification denoted as CCSD(cT) remedies these shortcomings, enabling highly reliable benchmark calculations of large molecular complexes on the hundred-atom scale that play a crucial role in scientific and technological problems, for example, drug design and surface science. We stress that the more accurate CCSD(cT) approximation can directly be transferred to computationally efficient low-scaling and local correlation approaches, which will substantially advance the applications of theoretical chemistry as well as physics in all areas of computational materials science where highly accurate benchmark results are urgently needed. We are witnessing an unremitting expansion of the frontiers of accurate electronic structure theories to ever larger systems, which, when combined with machine-learning techniques, has the potential to transform the paradigm of modern computational materials science.
Methods
Benchmark systems
We considered representative subsets of the L7, S22, and S66 benchmark sets of noncovalent interactions. These structures can be found in the BEGDB database37. Nine systems from the S66 set were selected to obtain extrapolated complete basis set (CBS) estimates, while 18 systems from the S22 set were studied at the CCSDT level of theory with modest basis sets. Furthermore, two additional groups of systems have been studied: parallel displaced polycyclic aromatic hydrocarbons, and a single adenine on top of a polycyclic aromatic hydrocarbon. The benzene-benzene structures are taken from the S66 set, and the coronene dimer structure is taken from the L7 test database. All other structures not included in the abovementioned data sets were obtained by geometry optimization using NWChem38 version 7.0.0 with the TPSS functional and def2-TZVP basis sets. Furthermore, we include the employed structures in the SI as xyz-files.
Gaussian basis set calculations
For the S66 systems, we employed Dunning’s augmented correlation-consistent basis sets (aug-cc-pVXZ, X = T, Q, 5). CBS estimates were obtained using standard two-point extrapolation, denoted as [34] (triple-ζ and quadruple-ζ) and [45] (quadruple-ζ and quintuple-ζ). Interaction energies were evaluated with and without counterpoise (CP) correction. Canonical MP2, CCSD, and CCSD(T) correlation energies were obtained with the MRCC package, interfaced to our Cc4s code. In all MRCC calculations, the program’s default resolution-of-identity (RI) auxiliary basis sets were employed for both Hartree-Fock and post-HF correlation steps. We use the MRCC version released on March 18, 2022 (refs. 31,32,33). Based on basis set convergence tests, the aug-cc-pVQZ basis set provides essentially converged Hartree-Fock energies, while the [34] extrapolation of MP2 correlation energies closely approximates the [45] CBS estimate. All of these observations hold for the counterpoise-corrected results. Given that CCSD(T) correlation energies typically converge faster with respect to basis set size than MP2, these findings justify using the [34] approach as a reliable estimate of the CBS limit for the S66 systems shown in Fig. 1.
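The text above only states that a standard two-point extrapolation was used. As a hedged illustration, the sketch below assumes the common inverse-cubic form for correlation energies, E(X) = E_CBS + A/X³, together with the usual Boys-Bernardi counterpoise definition. The exact extrapolation formula used for the reported numbers is given in the supplementary information and may differ. Function names are hypothetical.

```python
def cbs_two_point(e_small: float, e_large: float,
                  x_small: int, x_large: int, power: float = 3.0) -> float:
    """Two-point CBS estimate assuming E(X) = E_CBS + A / X**power.

    For a [34] extrapolation use x_small=3, x_large=4; for [45] use 4 and 5.
    """
    w_small = x_small ** power
    w_large = x_large ** power
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


def counterpoise_interaction(e_dimer: float,
                             e_a_in_dimer_basis: float,
                             e_b_in_dimer_basis: float) -> float:
    """Counterpoise-corrected interaction energy.

    Both monomer energies are evaluated in the full dimer basis
    (ghost functions placed on the partner monomer).
    """
    return e_dimer - e_a_in_dimer_basis - e_b_in_dimer_basis
```

The extrapolation would typically be applied to the correlation energy only, with the Hartree-Fock part taken from the largest basis set, in line with the observation above that aug-cc-pVQZ Hartree-Fock energies are essentially converged.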
For the S22 set, Hartree-Fock energies were computed with NWChem38 version 7.0.0 using the cc-pVDZ basis, while post-Hartree-Fock methods (MP2, CCSD, CCSDT) were carried out via Cc4s. In this case, counterpoise corrections were not applied, and the reported interaction energies are based directly on the raw supermolecular results.
For the parallel displaced polycyclic aromatic hydrocarbons and the systems with a single adenine on top of a polycyclic aromatic hydrocarbon, we employed aug-cc-pVTZ basis sets using MRCC. Results are given using the counterpoise correction. For the coupled-cluster calculations we used frozen natural orbitals (FNOs)39 with X = 8–12 virtual orbitals per occupied orbital. CCSD energies are corrected using the Delta-MP2 approach, \(E_{\text{CCSD}}=E_{\text{CCSD}}(X)+E_{\text{MP2}}(\text{aug-cc-pVTZ})-E_{\text{MP2}}(X)\), and the (T) and (cT) energies are corrected using the (T*) method, \(E_{(\text{T}^{*})}=E_{(\text{T})}(X)\cdot E_{\text{MP2}}(\text{aug-cc-pVTZ})/E_{\text{MP2}}(X)\)40,41.
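The two corrections defined above translate directly into code. This is a minimal sketch with hypothetical function and argument names. Here `fno` denotes a quantity computed in the truncated natural-orbital space with ratio X, and `full` denotes the same MP2 quantity in the complete aug-cc-pVTZ virtual space.

```python
def delta_mp2_corrected_ccsd(e_ccsd_fno: float,
                             e_mp2_full: float,
                             e_mp2_fno: float) -> float:
    """Delta-MP2 correction: E_CCSD = E_CCSD(X) + E_MP2(full) - E_MP2(X)."""
    return e_ccsd_fno + e_mp2_full - e_mp2_fno


def t_star_scaled_triples(e_triples_fno: float,
                          e_mp2_full: float,
                          e_mp2_fno: float) -> float:
    """(T*) scaling: E_(T*) = E_(T)(X) * E_MP2(full) / E_MP2(X).

    The same scaling is applied to (cT) energies according to the text.
    """
    return e_triples_fno * e_mp2_full / e_mp2_fno
```

The additive correction assumes that the FNO truncation error of CCSD is close to that of MP2, whereas the multiplicative correction assumes that the triples energy is reduced by the truncation in the same proportion as the MP2 energy.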
Finally, we have calculated CBS estimates for C3A and PHE using Gaussian-type orbitals with MRCC. To this end, we used counterpoise-corrected results obtained with the aug-cc-pVQZ basis set. It was shown that this choice is already sufficient for very accurate CBS estimates11. In these calculations we employ approximate FNOs: 6 and 10 for C3A and 8 and 10 for PHE, respectively. For the final CCSD energies, we once more use the Delta-MP2 corrected values defined above. For these numbers we use MP2 CBS estimates from canonical MP2 calculations, employing the [45] extrapolation. Similarly, the (T*) correction is used, again with [45] extrapolated values for the MP2 CBS estimate.
Plane-wave workflow
For larger molecular complexes, we employed a plane-wave workflow implemented in a development version of VASP 6.3 and Cc4s. Simulation cells were enlarged systematically to remove spurious interactions between periodic images, and plane-wave cutoffs were converged at 700 eV based on MP2 tests. Canonical Hartree-Fock orbitals were computed and subsequently transformed to approximate MP2 natural orbitals39. Truncated sets of natural orbitals, characterized by the ratio of virtual to occupied states (X = Nv/No), were used both for efficient MP2 CBS corrections and as input for coupled-cluster calculations. Coulomb integrals were factorized using a singular-value decomposition of the auxiliary basis42, allowing for substantial reduction of the auxiliary dimension without loss of accuracy.
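To illustrate how a singular-value decomposition can reduce an auxiliary dimension, the following generic NumPy sketch compresses a factor matrix Γ (auxiliary index × orbital-pair index) whose product Γ†Γ represents the Coulomb integrals. This is not the algorithm of ref. 42 or of Cc4s. The matrix layout, the function name and the fixed-rank truncation criterion are assumptions made for illustration.

```python
import numpy as np


def compress_auxiliary_basis(gamma: np.ndarray, n_keep: int) -> np.ndarray:
    """Project gamma (n_aux x n_pairs) onto its n_keep leading left singular vectors.

    The returned matrix has shape (n_keep, n_pairs). Its Gram matrix
    gamma_c^H @ gamma_c approximates gamma^H @ gamma, with an error governed
    by the discarded singular values.
    """
    u, _, _ = np.linalg.svd(gamma, full_matrices=False)
    u_kept = u[:, :n_keep]            # leading left singular vectors
    return u_kept.conj().T @ gamma    # rotated and truncated factor


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic rank-3 factor with 10 auxiliary functions and 6 orbital pairs.
    gamma = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 6))
    gamma_c = compress_auxiliary_basis(gamma, 3)
    print(gamma_c.shape, np.allclose(gamma_c.T @ gamma_c, gamma.T @ gamma))
```

In practice the number of retained vectors would be chosen from the decay of the singular values rather than fixed in advance.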
Final CCSD, CCSD(T), and CCSD(cT) calculations were performed with Cc4s on up to 50 compute nodes (128 cores each), exploiting the code’s massively parallel implementation.
Data availability
Additional data supporting the findings of this study are available within the supplementary information. The supplementary data file contains molecular structures in xyz-format used for calculations presented in S6. Source data are provided with this paper.
Code availability
The Cc4s code43 developed and used for the presented post-HF calculations can be downloaded from https://doi.org/10.5281/zenodo.16762158 (ref. 44). Documentation and more information about the code can be found in ref. 43. For the underlying HF calculations we employed different codes that are referenced in the Methods section.
References
Schrödinger, E. Quantisierung als eigenwertproblem. Ann. der Phys. 384, 361–376 (1926).
Pauli, W. Über das wasserstoffspektrum vom standpunkt der neuen quantenmechanik. Z. fur Phys. 36, 336–363 (1926).
Dirac, P. A. M. Quantum mechanics of many-electron systems. Proc. R. Soc. Lond. Ser. A 123, 714–733 (1929).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965).
Ceperley, D. M. & Alder, B. J. Ground state of the electron gas by a stochastic method. Phys. Rev. Lett. 45, 566–569 (1980).
Jones, R. O. Density functional theory: its origins, rise to prominence, and future. Rev. Mod. Phys. 87, 897–923 (2015).
Riplinger, C., Sandhoefer, B., Hansen, A. & Neese, F. Natural triple excitations in local coupled cluster calculations with pair natural orbitals. J. Chem. Phys. 139, 134101 (2013).
Ma, Q. & Werner, H.-J. Accurate intermolecular interaction energies using explicitly correlated local coupled cluster methods [PNO-LCCSD(T)-F12]. J. Chem. Theory Comput. 15, 1044–1052 (2019).
Szabó, P. B., Csóka, J., Kállay, M. & Nagy, P. R. Linear-scaling local natural orbital CCSD(T) approach for open-shell systems: algorithms, benchmarks, and large-scale applications. J. Chem. Theory Comput. 19, 8166–8188 (2023).
Benali, A., Shin, H. & Heinonen, O. Quantum Monte Carlo benchmarking of large noncovalent complexes in the L7 benchmark set. J. Chem. Phys. 153, 194113 (2020).
Al-Hamdani, Y. S. et al. Interactions between large molecules pose a puzzle for reference quantum mechanical methods. Nat. Commun. 12, 3927 (2021).
Ballesteros, F., Dunivan, S. & Lao, K. U. Coupled cluster benchmarks of large noncovalent complexes: the L7 dataset as well as DNA-ellipticine and buckycatcher-fullerene. J. Chem. Phys. 154, 154104 (2021).
Villot, C., Ballesteros, F., Wang, D. & Lao, K. U. Coupled cluster benchmarking of large noncovalent complexes in L7 and S12L as well as the C60 dimer, DNA-ellipticine, and HIV-indinavir. J. Phys. Chem. A 126, 4326–4341 (2022).
Řezáč, J. & Hobza, P. Benchmark calculations of interaction energies in noncovalent complexes and their applications. Chem. Rev. 116, 5038–5071 (2016).
Dubecký, M., Mitas, L. & Jurečka, P. Noncovalent interactions by quantum Monte Carlo. Chem. Rev. 116, 5188–5215 (2016).
Al-Hamdani, Y. S. & Tkatchenko, A. Understanding non-covalent interactions in larger molecular complexes from first principles. J. Chem. Phys. 150, 010901 (2019).
Firaha, D. et al. Predicting crystal form stability under real-world conditions. Nature 623, 324–328 (2023).
Hoja, J. et al. Reliable and practical computational description of molecular crystal polymorphs. Sci. Adv. 5, eaau3338 (2019).
Reilly, A. M. et al. Report on the sixth blind test of organic crystal structure prediction methods. Acta Crystallogr. Sect. B: Struct. Sci., Cryst. Eng. Mater. 72, 439–459 (2016).
Chanussot, L. et al. Open Catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072 (2021).
Sauer, J. The future of computational catalysis. J. Catal. 433, 115482 (2024).
Bokdam, M., Lahnsteiner, J., Ramberger, B., Schäfer, T. & Kresse, G. Assessing density functionals using many body theory for hybrid perovskites. Phys. Rev. Lett. 119, 145501 (2017).
Donchev, A. G. et al. Quantum chemical benchmark databases of gold-standard dimer interaction energies. Sci. Data 8, 1–9 (2021).
Eastman, P. et al. SPICE, a dataset of drug-like molecules and peptides for training machine learning potentials. Sci. Data 10, 1–11 (2023).
Liu, P. et al. Combining machine learning and many-body calculations: coverage-dependent adsorption of CO on Rh(111). Phys. Rev. Lett. 130, 078001 (2023).
Raghavachari, K., Trucks, G. W., Pople, J. A. & Head-Gordon, M. A fifth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett. 157, 479–483 (1989).
Nguyen, B. D. et al. Divergence of many-body perturbation theory for noncovalent interactions of large molecules. J. Chem. Theory Comput. 16, 2258–2273 (2020).
Shepherd, J. J. & Grüneis, A. Many-body quantum chemistry for the electron gas: convergent perturbative theories. Phys. Rev. Lett. 110, 226401 (2013).
Masios, N., Irmler, A., Schäfer, T. & Grüneis, A. Averting the infrared catastrophe in the gold standard of quantum chemistry. Phys. Rev. Lett. 131, 186401 (2023).
Smith, D. G. A., Jankowski, P., Slawik, M., Witek, H. A. & Patkowski, K. Basis set convergence of the post-CCSD(T) contribution to noncovalent interaction energies. J. Chem. Theory Comput. 10, 3140–3150 (2014).
Kállay, M. et al. The MRCC program system: accurate quantum chemistry from water to proteins. J. Chem. Phys. 152, 074107 (2020).
Mester, D. et al. Overview of developments in the MRCC program system. J. Phys. Chem. A 129, 2086–2107 (2025).
Kállay, M. et al. MRCC, a quantum chemical program suite. www.mrcc.hu (2022).
Nagy, P. R. & Kállay, M. Optimization of the linear-scaling local natural orbital CCSD(T) method: redundancy-free triples correction using Laplace transform. J. Chem. Phys. 146, 214106 (2017).
Nagy, P. R., Samu, G. & Kállay, M. Optimization of the linear-scaling local natural orbital CCSD(T) method: Improved algorithm and benchmark applications. J. Chem. Theory Comput. 14, 4193 (2018).
Nagy, P. R. & Kállay, M. Approaching the basis set limit of CCSD(T) energies for large molecules with local natural orbital coupled-cluster methods. J. Chem. Theory Comput. 15, 5275 (2019).
The Benchmark Energy & Geometry Database (BEGDB). http://www.begdb.org/ (2024).
Valiev, M. et al. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun. 181, 1477–1489 (2010).
Grüneis, A. et al. Natural orbitals for wave function based correlated calculations using a plane wave basis set. J. Chem. Theory Comput. 7, 2780–2785 (2011).
Knizia, G., Adler, T. B. & Werner, H.-J. Simplified CCSD(T)-F12 methods: theory and benchmarks. J. Chem. Phys. 130, 054104 (2009).
Irmler, A., Gallo, A. & Grüneis, A. Focal-point approach with pair-specific cusp correction for coupled-cluster theory. J. Chem. Phys. 154, 234103 (2021).
Hummel, F., Tsatsoulis, T. & Grüneis, A. Low rank factorization of the Coulomb integrals for periodic coupled cluster theory. J. Chem. Phys. 146, 124105 (2017).
CC4S developer team. CC4S user manual. https://manuals.cc4s.org/user-manual/ (2024).
Grüneis, A., Irmler, A., Schäfer, T. & Gallo, A. Coupled Cluster for Solids (CC4S). https://doi.org/10.5281/zenodo.16762159 (2025).
Ma, Q. & Werner, H.-J. Explicitly correlated local coupled-cluster methods using pair natural orbitals. WIREs Comput. Mol. Sci. 8, e1371 (2018).
Dunning Jr, T. H. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 90, 1007–1023 (1989).
Kendall, R. A., Dunning, T. H. & Harrison, R. J. Electron affinities of the first-row atoms revisited. Systematic basis sets and wave functions. J. Chem. Phys. 96, 6796–6806 (1992).
Jurečka, P., Šponer, J., Černý, J. & Hobza, P. Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs. Phys. Chem. Chem. Phys. 8, 1985–1993 (2006).
Sedlak, R. et al. Accuracy of quantum chemical methods for large noncovalent complexes. J. Chem. Theory Comput. 9, 3364–3374 (2013).
Řezáč, J., Riley, K. E. & Hobza, P. S66: a well-balanced database of benchmark interaction energies relevant to biomolecular structures. J. Chem. Theory Comput. 7, 2427–2438 (2011).
Acknowledgements
We gratefully acknowledge fruitful discussions on the post-CCSD(T) contributions to the interaction energies of large molecules with Adrian Daniel Boese. We also thank Péter Nagy for estimating the CCSD(cT)-fit result of C60@[6]CPPA based on raw data of ref. 11. Tobias Schäfer acknowledges support from the Austrian Science Fund (FWF) [DOI: 10.55776/ESP335]. The computational results have been achieved using the Austrian Scientific Computing (ASC) infrastructure. Andreas Irmler and Alejandro Gallo acknowledge support from the European Union’s Horizon 2020 research and innovation program under Grant Agreement No. 951786 (The NOMAD CoE). Support and funding from the European Research Council (ERC) (Grant Agreement No. 101087184) are gratefully acknowledged.
Author information
Contributions
T.S. and A.I. contributed equally to this work. Major numerical investigations were conducted by T.S. and A.I. alike. The work was conceptualized by A.Gr. with additional contributions from A.I., T.S., and A.Ga. Software development was carried out by A.I. and A.Ga. The original draft of the paper was written by A.Gr. Additional review and editing of the paper were undertaken by all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Schäfer, T., Irmler, A., Gallo, A. et al. Understanding discrepancies in noncovalent interaction energies from wavefunction theories for large molecules. Nat Commun 16, 9108 (2025). https://doi.org/10.1038/s41467-025-64104-8
DOI: https://doi.org/10.1038/s41467-025-64104-8




