Fig. 1

Principal component analysis (PCA) and regression (PCR). The formation energy of species \(j\) on metal \(i\) is obtained from DFT and Eqs. (1)-(2). These energies are grouped in thermochemistry matrices (red), which are approximated by PCA or PCR (green). PCR was validated by leaving one metal out of the data matrix (L1O). Variables associated with metals and species are shown in blue and orange, respectively; variables associated with mathematical procedures are shown in black. A data flow diagram, including the sizes of all matrices in this study, is shown in Supplementary Fig. 3.
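
The workflow in this caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the matrix `E` is synthetic (rows as metals, columns as species), the function names `pca_approximation` and `pcr_leave_one_metal_out` are hypothetical, and the choice of two "descriptor" species as regression inputs is an assumption made only for illustration. PCA is computed here through a singular value decomposition of the mean-centered matrix, and the regression is ordinary least squares on the principal component scores.

```python
import numpy as np


def pca_approximation(E, n_components):
    """Rank-k PCA approximation of E (rows: metals, columns: species)."""
    mean = E.mean(axis=0)
    U, s, Vt = np.linalg.svd(E - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one row per metal
    loadings = Vt[:n_components]                      # one column per species
    return mean + scores @ loadings, scores, loadings, mean


def pcr_leave_one_metal_out(E, descriptor_cols, n_components):
    """Predict every metal's row from its descriptor species only,
    using a PCA and regression fitted on the remaining metals (L1O)."""
    predicted = np.empty_like(E)
    for i in range(E.shape[0]):
        train = np.delete(E, i, axis=0)               # leave metal i out
        _, scores, loadings, mean = pca_approximation(train, n_components)
        # Least-squares fit (with intercept) of PC scores on descriptor energies.
        X = np.column_stack([np.ones(len(train)), train[:, descriptor_cols]])
        coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
        x_new = np.concatenate([[1.0], E[i, descriptor_cols]])
        predicted[i] = mean + (x_new @ coef) @ loadings
    return predicted


# Synthetic, noise-free example: 12 metals, 20 species, 2 latent factors.
rng = np.random.default_rng(0)
A = rng.normal(size=(12, 2))
B = rng.normal(size=(2, 20))
B[:, :2] = np.eye(2)          # species 0 and 1 together span both factors
E = rng.normal(size=20) + A @ B

approx, _, _, _ = pca_approximation(E, n_components=2)
predicted = pcr_leave_one_metal_out(E, descriptor_cols=[0, 1], n_components=2)
l1o_mae = np.abs(predicted - E).mean()   # mean absolute L1O error
```

Because the synthetic matrix has exactly two latent factors and no noise, both the PCA approximation and the L1O predictions reproduce `E` to numerical precision. With real DFT energies the L1O error would be finite and would depend on the number of components retained and on the descriptors chosen.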