Introduction

Singlet fission (SF) is the conversion of one photo-generated singlet-state exciton into two triplet-state excitons1,2,3,4,5,6,7,8,9,10,11. Intermolecular SF, where the triplet-state excitons are localized on different chromophores than the singlet exciton, occurs in molecular crystals12,13,14. SF may be utilized in solar cells to exploit the excess energy of high-energy photons and reduce the energy loss due to thermalization. Harvesting two charge carriers from one photon via SF could potentially increase the power conversion efficiency of solar cells beyond the Shockley–Queisser limit15. However, commercial SF-based solar cells have yet to be realized owing to the dearth of suitable materials1,16. Certain classes of molecular materials, such as oligoacenes, oligorylenes, and their derivatives, are experimentally known to undergo SF17,18,19,20,21,22,23,24,25,26. Although 200% quantum yield and ultra-fast SF have been observed experimentally27,28, most of the known SF materials are not practical for use in commercial modules because they are chemically unstable and would degrade under operating conditions. It is therefore imperative to find new SF materials, possibly from different chemical families, in order to expand the available options. Computational exploration of the chemical space may significantly accelerate the discovery of candidates for SF in the solid state and guide experimental efforts in promising directions.

The primary criterion for SF to occur is the thermodynamic driving force. The energy difference between the initial singlet state and final state of two triplets (\(E_{{{\mathrm{S}}}} - 2E_{{{\mathrm{T}}}}\)) must be positive or at most slightly negative1,3,29,30. Organic molecular crystals that meet this requirement are rare, which explains why most known SF materials belong to restricted classes of molecules. Yet, most of the vast chemical space remains largely unexplored. Computationally efficient density functional theory (DFT) based on semi-local exchange-correlation functionals has been used extensively for high-throughput screening of materials31,32,33,34. However, DFT is a ground-state theory. Hence, it cannot directly describe the excited-state properties of chromophores that are of interest for SF. Time-dependent DFT (TDDFT) may be used to calculate the excitation energies of isolated molecules35,36. This relatively low-cost option has been adopted to screen molecules with up to 100 atoms in search of SF candidates37. However, SF-based solar cells utilize solid-state materials, i.e., molecular crystals14, whose performance depends not only on the properties of the molecular constituents but also on crystal packing38. Therefore, it is desirable to screen molecular crystals, rather than isolated molecules, in search of potential SF materials. Many-body perturbation theory (MBPT) within the GW approximation paired with the Bethe–Salpeter equation (GW + BSE) is the state-of-the-art method for predicting the excitonic properties of organic molecular crystals with periodic boundary conditions39,40,41. Using this method, we have already identified several potential candidate materials for intermolecular SF in the solid state16,29,42,43,44,45. However, the high computational cost of GW + BSE calculations is prohibitive for large-scale screening of materials databases. 
Therefore, it is desirable to identify descriptors that are fast to evaluate and yield models that accurately predict GW + BSE results. To this end, machine-learning (ML) algorithms for feature selection may be used.

ML is increasingly employed in conjunction with first-principles simulations for materials discovery46,47,48,49,50,51,52,53,54,55. Typically, large datasets are required to train ML models, making data acquisition the computational bottleneck. The growing availability of datasets and repositories of DFT calculations32,56,57,58,59,60,61,62 has facilitated the application of ML to the ground-state properties of materials. Applications of ML to excited-state properties are still relatively rare, owing to the high cost of data acquisition63,64,65,66. Incorporating physical and chemical knowledge may enable the construction of predictive ML models with small datasets.

Here, we employ the sure-independence-screening-and-sparsifying-operator (SISSO)67 ML algorithm to identify low-cost predictive models for SF. The input of SISSO is a set of primary features, which are physical descriptors that could be correlated with the target property. SISSO generates a huge feature space by iteratively combining the primary features using linear and nonlinear algebraic operations. Subsequently, linear regression is performed to identify the most predictive models. SISSO essentially performs a computer experiment in which hypotheses are systematically generated and tested against reference data. Physical and chemical knowledge is leveraged in the choice of primary features and in the rules for combining them. An important advantage of SISSO is that it can work well with a relatively small amount of data. It has been demonstrated in several applications that SISSO can produce predictive models with as few as a few hundred68,69,70,71, or even a few tens72, of training data points. Moreover, SISSO-generated models are based on interpretable physical descriptors that may provide insight into which features are correlated with the target property69,70,71.

To train SISSO, we compile a purpose-built dataset of GW + BSE calculations of the SF driving force, \(E_{{{\mathrm{S}}}} - 2E_{{{\mathrm{T}}}}\), of 101 molecular crystals of polycyclic aromatic hydrocarbons (PAHs). Most known SF materials are PAHs, in particular acenes, rylenes, and their derivatives17,18,19,20,21,22,23,24,25,26. However, PAHs, broadly defined as compounds comprising carbon and hydrogen atoms and containing multiple aromatic rings, encompass a multitude of chemical families, which have not been explored in the context of SF. To maximize the chances of discovering new classes of SF materials, we have selected a set of PAH crystals, representing diverse chemical families. For the same set of materials, 16 physically motivated primary features are calculated. Because the properties of molecular crystals depend on both the single-molecule properties and the crystal packing in the solid state, the primary features include both single-molecule and crystal features. SISSO produces several predictive models with varying degrees of complexity. The most accurate models generated yield a training set root-mean-square error (RMSE) below 0.2 eV, which is on par with GW + BSE. Moreover, the best-performing models have a near-perfect classification accuracy for determining whether or not a given material is a promising SF candidate. Based on considerations of the model accuracy vs. the computational cost of primary feature evaluation, a hierarchical screening approach is proposed to narrow down the candidate pool. The variance between the predictions of different SISSO-generated models may be used as a measure of uncertainty. Based on the SISSO-generated models, three potential SF candidates are identified: 9-(4-biphenyl) cyclopenta[a]phenalene (BCPP), tetrabenzo[de,hi,op,st]pentacene (TBPT), and 5,6–11,12-diphenylenenaphthacene (DPNP). These compounds belong to chemical families of PAHs that have not been previously explored in the context of SF.

Results and discussion

The PAH101 dataset

Because most known SF materials are PAHs, we focus on this class of compounds to maximize the chances of discovery. In addition, restricting the chemical space means that ML models trained on small data are more likely to succeed in producing accurate predictions. A set of 101 PAH crystal structures was extracted from the Cambridge Structural Database (CSD)73. The systems in the PAH101 set represent diverse chemical families within the larger PAH class. The chromophore size in the PAH101 set ranges from 12 to 136 atoms and the crystal unit cell size ranges from 44 to 544 atoms, as shown in Fig. 1a–b. Figure 1c shows the SF driving force distribution obtained for the PAH101 set with GW + BSE, based on the Perdew–Burke–Ernzerhof (PBE)74 DFT functional, denoted as GW + BSE@PBE. We note that GW + BSE@PBE systematically underestimates the thermodynamic driving force for SF. This is partly owing to the underlying approximations and partly because additional effects, such as electron-phonon coupling75, entropic effects30, and kinetics are not considered. Therefore, we assess prospective SF candidates based on their predicted SF driving force relative to the known SF materials pentacene, tetracene, and rubrene16,29,42,43,44. Pentacene has been observed to undergo rapid SF with a 200% triplet yield27,28. SF in tetracene is slightly endoergic20,76. Rubrene is known to undergo both SF and the reverse process of triplet-triplet annihilation (TTA), where two triplet excitons are converted into one singlet exciton77,78,79,80. Therefore, we consider the GW + BSE@PBE SF driving force of rubrene, −0.62 eV, which is even lower than that of tetracene, as the lower limit for viable SF candidates. Indeed, the SF driving force of anthracene, a well-known TTA material81,82, is below that of rubrene. 
Thus, even if renormalization of the exciton energies due to phonons were considered, which may tilt the energy balance in favor of SF in some cases75, materials with a GW + BSE@PBE SF driving force below that of rubrene would still be unlikely to exhibit SF. The PAH101 set contains materials with a broad range of SF driving force in order for the SISSO-generated models to be able to distinguish between materials that are likely and those that are unlikely to undergo SF.

Fig. 1: Statistics of the PAH101 set.

Distributions of a the number of atoms per molecule, b the number of atoms per unit cell, and c the GW + BSE@PBE SF driving force in the PAH101 dataset. In panel c, SF candidates and non-SF candidates are colored in red and blue, respectively.

Primary features

The primary features are a collection of descriptors that may be physically relevant to the target property62,83, in this case, the SF driving force. The excitonic properties of molecular crystals depend on the single-molecule properties as well as the crystal packing25,38,42,84,85,86,87,88,89,90,91. Therefore, we consider single-molecule descriptors, denoted by an “S” superscript, and crystal descriptors, denoted by a “C” superscript, as primary features. For computational efficiency, the primary features are calculated at the DFT@PBE level, as described in the Methods section. For single-molecule features, we consider properties that could be correlated with the excitation energies of the chromophore, including the DFT HOMO-LUMO gap (\({{{\mathrm{Gap}}}}^{{{\mathrm{S}}}}\)), ionization potential (\({{{\mathrm{IP}}}}^{{{\mathrm{S}}}}\)), electron affinity (\({{{\mathrm{EA}}}}^{{{\mathrm{S}}}}\)), triplet-state formation energy (\(E_{{{\mathrm{T}}}}^{{{\mathrm{S}}}}\)), and the trace of the polarization tensor (\({{{\mathrm{PolarTensor}}}}^{{{\mathrm{S}}}}\)). The \({{{\mathrm{IP}}}}^{{{\mathrm{S}}}}\) and \({{{\mathrm{EA}}}}^{{{\mathrm{S}}}}\) are calculated based on DFT total energy differences between neutral and charged species. Similarly, the triplet-state formation energy is obtained from the DFT total energy difference between the triplet-state and singlet-state systems. \({{{\mathrm{PolarTensor}}}}^{{{\mathrm{S}}}}\) is calculated using the PBE exchange-correlation functional coupled with the many-body dispersion (MBD) method (PBE + MBD)92. In addition, we consider a DFT-based estimation of the thermodynamic driving force for SF, where the singlet excitation energy is approximated by the HOMO-LUMO gap and the triplet excitation energy is approximated by the triplet-state formation energy: \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}} = {{{\mathrm{Gap}}}}^{{{\mathrm{S}}}} - 2E_{{{\mathrm{T}}}}^{{{\mathrm{S}}}}\).
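
The total-energy differences described above can be sketched in a few lines. This is a minimal illustration, not the workflow used to build the dataset: the class `MoleculeEnergies`, its field names, and the function `single_molecule_features` are hypothetical, all energies are assumed to be in eV, and the sign conventions (IP as cation minus neutral, EA as neutral minus anion) are the usual total-energy-difference definitions, assumed here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MoleculeEnergies:
    """DFT quantities for one molecule in eV (illustrative field names)."""
    e_neutral: float  # total energy, neutral singlet ground state
    e_cation: float   # total energy, positively charged species
    e_anion: float    # total energy, negatively charged species
    e_triplet: float  # total energy, lowest triplet state
    homo: float       # HOMO eigenvalue
    lumo: float       # LUMO eigenvalue


def single_molecule_features(m: MoleculeEnergies) -> Dict[str, float]:
    """Return Gap^S, IP^S, EA^S, E_T^S and DF^S = Gap^S - 2 E_T^S."""
    gap = m.lumo - m.homo              # HOMO-LUMO gap
    ip = m.e_cation - m.e_neutral      # ionization potential (assumed sign convention)
    ea = m.e_neutral - m.e_anion       # electron affinity (assumed sign convention)
    e_t = m.e_triplet - m.e_neutral    # triplet-state formation energy
    df = gap - 2.0 * e_t               # DFT estimate of the SF driving force
    return {"Gap_S": gap, "IP_S": ip, "EA_S": ea, "E_T_S": e_t, "DF_S": df}
```

The same arithmetic applies to the crystal estimate \({{{\mathrm{DF}}}}^{{{\mathrm{C}}}}\), with the crystal bandgap and crystal triplet-state formation energy in place of the single-molecule values.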

Crystal features include the DFT bandgap (\({{{\mathrm{Gap}}}}^{{{\mathrm{C}}}}\)), the triplet-state formation energy (\(E_{{{\mathrm{T}}}}^{{{\mathrm{C}}}}\)), as well as the DFT estimate of the SF thermodynamic driving force, \({{{\mathrm{DF}}}}^{{{\mathrm{C}}}} = {{{\mathrm{Gap}}}}^{{{\mathrm{C}}}} - 2E_{{{\mathrm{T}}}}^{{{\mathrm{C}}}}\). In addition, we consider features that reflect the effect of crystal packing and the strength of coupling between neighboring molecules. The fundamental gap of a crystal is narrower than that of a single molecule owing to the combined effect of band dispersion and polarization43. Therefore, the crystal features include the valence-band dispersion (\({{{\mathrm{VB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\)) and conduction-band dispersion (\({{{\mathrm{CB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\))93, as well as the dielectric constant (\({\it{\epsilon }}^{{{\mathrm{C}}}}\)) as descriptors of the screening effect in a crystal. \({\it{\epsilon }}^{{{\mathrm{C}}}}\) is calculated using the Clausius–Mossotti relation, with the static polarizability obtained from PBE + MBD43,94. Because the intermolecular SF process involves charge/energy transfer between neighboring chromophores, we also consider a descriptor of the intermolecular electronic coupling, the transition matrix element, \(H_{{{{\mathrm{ab}}}}} = \left\langle {{{{\mathbf{{\Phi}}}}}_{{{\mathrm{a}}}}|{{{\hat{\mathbf H}}}}|{{{\mathbf{{\Phi}}}}}_{{{\mathrm{b}}}}} \right\rangle\), between the initial state \({{{\mathbf{{\Phi}}}}}_{{{\mathrm{a}}}}\) of molecule a, and the final state \({{{\mathbf{{\Phi}}}}}_{{{\mathrm{b}}}}\) of molecule b, where \({{{\hat{\mathbf H}}}}\) is the Hamiltonian. For hole transport, molecule a is positively charged and molecule b is neutral. The states \({{{\mathbf{{\Phi}}}}}_{{{\mathrm{a}}}}\) and \({{{\mathbf{{\Phi}}}}}_{{{\mathrm{b}}}}\) represent the corresponding HOMOs. 
\(H_{{{{\mathrm{ab}}}}}\) is calculated within the frozen orbital DFT approach95,96,97. Different dimers extracted from the same molecular crystal result in different values of \(H_{{{{\mathrm{ab}}}}}\). Hence, we use the average of the three highest \(H_{{{{\mathrm{ab}}}}}\) values to represent the intermolecular coupling strength in a given crystal. Finally, we consider chemical descriptors, including the molecular weight \({{{\mathrm{MolWt}}}}^{{{\mathrm{S}}}}\), the crystal density \(\rho ^{{{\mathrm{C}}}}\), and the number of atoms in the unit cell \({{{\mathrm{AtomNum}}}}^{{{\mathrm{C}}}}\). A full list of the primary features and their descriptions is provided in Supplementary Table 1.

To evaluate the relative computational cost of calculating different primary features, we used a representative system with 62 atoms per molecule and a total of 248 atoms (four molecules) in the unit cell. The CPU time spent on one single-molecule DFT@PBE calculation is considered the basic unit of computational cost. The computational cost of calculating each primary feature is expressed as multiples of that basic unit. For features whose evaluation requires multiple DFT calculations (for example, \({{{\mathrm{EA}}}}^{{{\mathrm{S}}}}\) requires two DFT@PBE calculations for the neutral and anion) the computational cost of all calculations is summed up. Descriptors such as \(\rho ^{{{\mathrm{C}}}}\) do not require any calculations and therefore have a cost of zero. A full list of the primary features with their relative computational cost is provided in Supplementary Table 2.

Model generation with SISSO

The SISSO training was performed with the SISSO package67, available at https://github.com/rouyang2017/SISSO. SISSO can generate a huge feature space with billions (or even trillions) of elements by iteratively combining the primary features using linear and nonlinear elementary mathematical operations67. To avoid generating unphysical features, addition and subtraction are allowed only for primary features with the same units. Two key parameters of SISSO are the model dimension, i.e., the number of combined features that enter the linear model, and the feature rung, i.e., the number of iterations used to build combined features. Here, the maximal rung (Rung) was set to 3 and the maximal dimension (Dim) was set to 4. These values are found to be sufficient to identify the optimal model complexity, as shown below. The resulting models are denoted as MDim,Rung. The operator set \(H = \left\{ { + , - , \times , \div ,\exp ,\log ,()^{ - 1},()^2,()^3,\sqrt {} ,\root {3} \of {{}},\left| \cdot \right|} \right\}\) was used for feature construction. The maximum complexity, i.e., the maximum number of operators in one combined feature, was set to 10. With these settings, a total of 584, \(5 \times 10^{5}\), and \(5 \times 10^{11}\) features were generated with Rung = 1, 2, and 3, respectively.
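
The unit rule for feature construction can be illustrated with a toy enumeration of first-rung features. This is a minimal sketch and not the SISSO code: the function name `rung1_features`, the string representation of features, and the unit labels are assumptions made for illustration. The count it produces is not expected to reproduce the 584 features reported above, because the package applies further rules that are not modeled here.

```python
from itertools import combinations
from typing import Dict, List

# Unary operators from the operator set H, written as format templates.
UNARY = ["exp({})", "log({})", "({})^-1", "({})^2", "({})^3",
         "sqrt({})", "cbrt({})", "abs({})"]


def rung1_features(primary: Dict[str, str]) -> List[str]:
    """Enumerate symbolic first-rung features.

    primary maps a feature name to a unit label. Addition and subtraction
    are generated only for pairs with identical unit labels; multiplication
    and division are generated for every pair.
    """
    features = [template.format(name) for name in primary for template in UNARY]
    for a, b in combinations(primary, 2):
        if primary[a] == primary[b]:       # same units: sums and differences allowed
            features.append(f"({a}+{b})")
            features.append(f"({a}-{b})")
        features.append(f"({a}*{b})")
        features.append(f"({a}/{b})")
        features.append(f"({b}/{a})")
    return features
```

For example, with `{"Gap_S": "eV", "EA_S": "eV", "rho_C": "g/cm3"}` the difference `(Gap_S-EA_S)` is generated, whereas `(Gap_S+rho_C)` is not.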

After feature generation, linear regression is performed to yield the model prediction (each model is the scalar product of the SISSO-identified descriptor vector with the vector of fitted coefficients), and the models are ranked based on their prediction performance. Optimal subspaces are selected from the huge feature space by sure-independence screening (SIS). The number of features saved after SIS is set to 500. On each such subspace, the sparse solution is determined by \(\ell_0\)-norm regularization (the sparsifying operator, SO). To assess the optimal model complexity (i.e., Rung and Dim), leave-N-out cross-validation (LCV) is performed, i.e., the performance of the trained models is assessed on unseen data. N data points are held out as an unseen validation set and the remaining data points are used for model training. This process is repeated several times. Here, we use N = 10. In standard LCV, data points are typically assigned to the validation set at random. Here, rather than the model with the smallest overall prediction error, we are interested in a regression model with higher prediction accuracy in the high SF driving force range in order to identify promising SF candidates with high confidence. Hence, a modified LCV scheme is used, which prioritizes the selection of PAHs with a higher SF driving force than rubrene for the validation set. The selection probability of materials with \(E_{{{\mathrm{S}}}} - 2E_{{{\mathrm{T}}}} \ge - 0.62\,{{{\mathrm{eV}}}}\) is boosted by a factor of 10 compared to other PAHs. For each combination of Rung and Dim, 40 rounds of LCV are performed. In each round, the model with the lowest RMSE for the validation set is selected. Finally, the model that yields the lowest RMSE of the 40 models for the combined training and validation data is selected as MDim,Rung. We note that the regression coefficients may have units, such that the overall units of the resulting models are eV. 
A subset of 10 PAH crystals of different sizes, with a range of SF driving force values, is completely left out of the SISSO training to serve as the test set of unseen data. The SISSO training is performed using the remaining 91 crystals. As a baseline for assessing the performance of SISSO-generated models, we use our human-generated models, the DFT estimates of the single-molecule and crystal SF driving force, \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}}\) and \({{{\mathrm{DF}}}}^{{{\mathrm{C}}}}\).
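
The biased assignment of data points to the validation set can be sketched as weighted sampling without replacement. This is one possible reading of the scheme, not the authors' implementation: the function name `weighted_validation_split` and its parameters are hypothetical, and the boost is applied here as a relative sampling weight that is renormalized after every draw.

```python
import random
from typing import List, Optional, Sequence, Tuple


def weighted_validation_split(
    driving_force: Sequence[float],
    n_out: int = 10,
    threshold: float = -0.62,
    boost: float = 10.0,
    seed: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Split indices into (training, validation) for one LCV round.

    Materials with a driving force >= threshold (eV) receive a sampling
    weight of `boost`; all others receive a weight of 1.
    """
    rng = random.Random(seed)
    remaining = list(range(len(driving_force)))
    validation: List[int] = []
    for _ in range(n_out):
        weights = [boost if driving_force[i] >= threshold else 1.0
                   for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        validation.append(pick)
        remaining.remove(pick)  # draw without replacement
    return remaining, sorted(validation)
```

Calling this function 40 times with different seeds would produce the 40 train/validation splits for one combination of Rung and Dim.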

Because each SISSO-generated model comprises different primary features, each model has a different computational cost. Here, the computational cost for each model is evaluated by summing over the costs of all the primary features included in the model. The cost of features that appear in the model more than once is counted only once because no additional calculation is required. As mentioned above, SISSO is adopted to train regression models, i.e., predicting the SF driving force, by minimizing the prediction error. However, the same model can also be assessed (without retraining) as a classification model if the two classes of interest are SF vs. non-SF materials. To this end, the materials are classified based on the value of the SF driving force with a threshold of −0.62 eV, corresponding to the SF driving force of rubrene, as explained above. A prediction counts as a true positive or a true negative if the ML model agrees with the GW + BSE reference data on whether the SF driving force of a given material is above or below −0.62 eV, respectively. The classification performance of each model is assessed based on sensitivity, specificity, and accuracy. Sensitivity is the fraction of correctly identified SF candidates, defined as the number of true positives (TP) divided by the total number of positive labels, which includes true positives and false negatives (FN), TP/(TP + FN). Conversely, specificity is the fraction of correctly identified non-SF candidates, defined as the number of true negatives (TN) divided by the total number of negative labels, which includes true negatives and false positives (FP), TN/(TN + FP). Accuracy measures the overall fraction of correct classifications, which is given by the sum of true positives and true negatives divided by the sum of all labels, (TP + TN)/(TP + FN + TN + FP). 
The classification performance of all SISSO-generated models for the test set and training set is reported in Supplementary Tables 3 and 4, respectively.
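
The three classification metrics follow directly from the confusion counts. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and treating a driving force exactly equal to the threshold as positive is an assumption.

```python
from typing import Dict, Sequence


def classification_metrics(
    reference: Sequence[float],
    predicted: Sequence[float],
    threshold: float = -0.62,
) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy of a regression model used as a classifier.

    reference: GW + BSE driving forces (eV); predicted: model predictions (eV).
    A material is labeled positive if its driving force is >= threshold.
    """
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        is_positive = ref >= threshold
        called_positive = pred >= threshold
        if is_positive and called_positive:
            tp += 1
        elif is_positive:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / (tp + fn + tn + fp) if (tp + fn + tn + fp) else nan,
    }
```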

Model selection and performance evaluation

Table 1 summarizes the training set and test set RMSE of the best models produced by SISSO with each combination of Dim and Rung. The training set comprises all data used for training, including both the training and validation data in all cross-validations, and the test set comprises the ten data points unseen by the SISSO training process. The formulas of all models are provided in the Supplementary Notes and some models are selected for further discussion in the main text. Overall, all the SISSO-generated models have a higher prediction accuracy than the baseline models \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}}\) and \({{{\mathrm{DF}}}}^{{{\mathrm{C}}}}\). Both the SISSO-generated models and the baseline DFT estimation models perform significantly better than simply predicting the mean value. The prediction error is expected to decrease with the model complexity, i.e., with increasing Rung and Dim, until a saturation point is reached, beyond which the accuracy deteriorates because of overfitting. For Rung = 1 models, the training set RMSE decreases monotonically with increasing model dimension. The test set RMSE, however, peaks at M2,1, with both three- and four-dimensional models achieving lower RMSE. The better performance of higher-dimensional models indicates that the SISSO training does not saturate at M2,1. Rather, some PAHs may be more sensitive to the descriptors included in M2,1. The improvements from three dimensions to four dimensions for both the training and test sets are marginal, suggesting that the model complexity has saturated. For Rung = 2 models, the training RMSE decreases with increasing dimension, whereas the test RMSE increases slightly for M3,2. The slightly worse performance of M3,2 for the test set, compared to M2,2 and M4,2, is negligible, suggesting that the model complexity is saturating but the optimum has not been reached. For models with Rung = 3, the training RMSE decreases monotonically with the increase in model dimension. 
However, for the test set, the model performance deteriorates significantly from two dimensions to three and four dimensions. This suggests that Dim = 2 is the saturation point. Similarly, increasing the Rung for models with the same dimension improves the accuracy until an optimum is reached. In general, at fixed Dim, the test RMSE shows a minimum at Rung = 3 for one- and two-dimensional models and at Rung = 2 for three- and four-dimensional ones. The overall lowest test RMSE of 0.18 eV is achieved with M2,3, suggesting that this model has the optimal complexity. We note that most of the features included in the low-complexity models are single-molecule properties. These results imply that the SF driving force is heavily dependent on the molecular characteristics. However, because the PAH101 set only contains four sets of polymorphs (rubrene, perylene, diindeno[1,2,3-cd:1′,2′,3′-lm]perylene, and p-quaterphenyl), the effect of crystal packing may be underrepresented.

Table 1 The computational cost and prediction accuracy, represented by the RMSE for the training set and test set, of SISSO-generated models.

To decide which model(s) to use for materials screening, we consider the computational cost in addition to the model accuracy. The relative computational cost of SISSO-generated models is given in Table 1. Figure 2 shows a Pareto chart, in which the model accuracy, represented by the validation and test set RMSE, is plotted against its relative computational cost. The validation set RMSE is calculated using the corresponding train/validation split that produces the final SISSO model. A Pareto chart based on the training RMSE is provided in Supplementary Fig. 3, which leads to similar conclusions. More complex models tend to have a higher computational cost because they require evaluating more primary features. However, some primary features have a higher computational cost than others. In general, crystal features cost more than single-molecule features. Therefore, models with similar complexity may have a different computational cost depending on the specific features they contain. It is worth noting that a GW + BSE@PBE calculation for a mid-sized molecular crystal with 180 atoms per unit cell may consume more than \(10^{6}\) CPU hours, which is higher than the computational cost of all the primary features by a factor of \(10^{4}\). Both M1,1 and M1,2 are on the Pareto front. However, M1,2 yields a lower validation RMSE with the same computational cost. Hence, M1,1 is not considered further for materials screening. M2,3 and M4,3 are on the validation set Pareto front. M2,3 is also on the test set Pareto front. The test set RMSE for M4,3 suggests this model may overfit the training data. Therefore, M2,3 is selected as a second-level screening model after M1,2.
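
The cost accounting and the Pareto selection can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `model_cost` and `pareto_front` are hypothetical, and any numbers passed to them in an example are placeholders rather than values from Table 1. The sketch uses strict Pareto dominance, so of two models with equal cost only the one with the lower RMSE is retained, which matches the decision to drop M1,1 in favor of M1,2.

```python
from typing import Dict, Iterable, List, Tuple


def model_cost(features: Iterable[str], feature_cost: Dict[str, float]) -> float:
    """Sum the cost of the distinct primary features used by a model.

    A feature that appears several times in the model is counted once.
    """
    return sum(feature_cost[name] for name in set(features))


def pareto_front(models: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of non-dominated models, ordered by increasing cost.

    models maps a model name to (relative cost, RMSE). A model is dominated
    if another model is no worse in both criteria and better in at least one.
    """
    front = []
    for name, (cost, rmse) in models.items():
        dominated = any(
            c <= cost and r <= rmse and (c < cost or r < rmse)
            for other, (c, r) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: models[n][0])
```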

Fig. 2: Pareto charts.

Pareto charts of the model accuracy, represented by a the validation set RMSE and b the unseen test set RMSE, vs. the relative computational cost of SISSO-generated models. The dashed line indicates the Pareto front.

In order to evaluate the model performance across the PAH101 training set, in particular for materials in the region of interest for SF, Fig. 3 shows correlation plots between the model prediction and the reference values of the SF driving force obtained with GW + BSE@PBE. A correlation plot for the baseline human-generated model, \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}}\), is also shown for comparison. The correlation plots for the training set and test set for all SISSO-generated models are provided in Supplementary Figs. 1, 2. As shown in Fig. 3a, \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}}\) systematically underestimates the SF driving force. The SISSO-generated models are overall more predictive than the baseline human-generated model. For the models on the Pareto front, the training set RMSE gradually decreases with the model complexity. A few systems, whose molecular structures are shown in Fig. 3, consistently appear as outliers across models. The majority of the outliers comprise benzene rings connected by a single covalent bond, whereas most of the systems in the PAH101 set are conjugated aromatic compounds, in which interconnected rings share extended \({\uppi}\)-orbitals. Hence, the lower prediction accuracy for these systems may be attributed to their somewhat different chemistry. Because most of these outliers are not in the region of interest for SF, they are not a cause for concern. One outlier in the SF candidate range is the zethrene derivative 7,14-Di-n-butyldibenzo[de,mn]naphthacene (CSD reference code KAGGIK)98. Its SF driving force is significantly underestimated by most SISSO-generated models (except for M4,3). Because such errors are not observed for other zethrene derivatives, we attribute this to the long alkyl side chains of KAGGIK, which make it chemically distinct from most other chromophores in the PAH101 set.

Fig. 3: Correlation plots of selected models.

Model prediction as a function of the GW + BSE SF driving force for a the baseline model, \({{{\mathrm{DF}}}}^{{{\mathrm{S}}}}\), and the five models on or close to the Pareto front: b M1,1, c M1,2, d M3,1, e M2,3, and f M4,3. The training set RMSE and molecular structures of four outliers are also shown. The true positive region for SF candidates is colored in yellow.

Hierarchical screening workflow

We propose a hierarchical screening approach based on different SISSO-generated models with increasing cost and accuracy to gradually narrow down the candidate pool. To select models for hierarchical screening we also consider their classification performance, shown in Table 2. Correct classification of candidate materials is important in order for the promising SF candidates to proceed to the next step of screening and the non-promising candidates to be discarded. If a false positive occurs, a material is misclassified as promising, in which case it proceeds to screening with more accurate models and may be discarded subsequently. However, if a false negative occurs, a material is misclassified as non-promising and discarded, which results in the loss of a promising candidate. Therefore, screening thresholds should be set to avoid false negatives and tolerate a small number of false positives. The hierarchical screening workflow is illustrated in Fig. 4 for the PAH101 set. The first stage of screening is performed with the low-cost model M1,2:

$$M_{1,2} = 0.36 \times \left( {{{{\mathrm{Gap}}}}^{{{\mathrm{S}}}} - {{{\mathrm{EA}}}}^{{{\mathrm{S}}}}} \right) \times \left( {{{{\mathrm{DF}}}}^{{{\mathrm{S}}}} \times \rho ^{{{\mathrm{C}}}}} \right) + 0.33$$
(1)

M1,2 only requires three DFT calculations for a single molecule and the crystal density, which requires no calculations, and yields an RMSE of 0.22 eV. As shown in Table 2, similar to the other SISSO-generated models, M1,2 yields 100% sensitivity for the training set. However, one of the three additional SF candidates in the test set is not correctly classified, resulting in a sensitivity of 67%. For both the training set and the test set, the specificity is almost 100%, implying high confidence in the classification of non-SF candidates. In order to correctly classify all SF materials, the selection threshold is adjusted by subtracting the model RMSE of 0.22 eV from the true positive threshold of −0.62 eV, to give a threshold of −0.84 eV. With this threshold, all 24 SF candidates in the PAH101 set and nine non-promising materials pass the first stage of screening. Thus, model M1,2 already eliminates the vast majority of non-SF materials in the dataset.
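
Equation 1 and the relaxed selection threshold can be written out as a short sketch. The function names `m12` and `passes_stage` are hypothetical. The inputs are assumed to be given in the units used for the fit (energies in eV, with the density in the units of the original dataset, which are not stated here), since the fitted coefficients carry units.

```python
RUBRENE_DF = -0.62  # GW + BSE@PBE SF driving force of rubrene (eV)


def m12(gap_s: float, ea_s: float, df_s: float, rho_c: float) -> float:
    """Evaluate Eq. 1: 0.36 * (Gap^S - EA^S) * (DF^S * rho^C) + 0.33, in eV."""
    return 0.36 * (gap_s - ea_s) * (df_s * rho_c) + 0.33


def passes_stage(prediction: float, rmse: float,
                 reference: float = RUBRENE_DF) -> bool:
    """Keep a material if its prediction is above the reference minus the model RMSE.

    For the first stage, rmse = 0.22 eV gives a threshold of -0.84 eV.
    """
    return prediction >= reference - rmse
```

Lowering the threshold by the RMSE trades a few additional false positives for a lower risk of discarding a promising candidate.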

Table 2 Classification performance of the SISSO-generated models in terms of sensitivity, specificity, and accuracy with respect to the SF driving force threshold of −0.62 eV.
Fig. 4: Hierarchical screening workflow.

Schematic of the hierarchical screening workflow based on models M1,2 and M2,3, applied to the PAH101 set.

As shown in Fig. 2a, M2,3 yields a significantly higher accuracy at a computational cost that is about 20 times higher than that of M1,2. Equation 2 shows the features included in the model:

$$\begin{array}{l} {M_{2,3} = - 0.35 \times \frac{{\left(E_{{{\mathrm{T}}}}^{{{\mathrm{C}}}} + {{{\mathrm{EA}}}}^{{{\mathrm{S}}}}\right) \times \left(E_{{{\mathrm{T}}}}^{{{\mathrm{S}}}} \times \rho ^{{{\mathrm{C}}}}\right)}}{{\log \left( {{{{\mathrm{AtomNum}}}}^{{{\mathrm{C}}}}} \right)/\left({{{\mathrm{AtomNum}}}}^{{{\mathrm{C}}}}\right)^{\frac{1}{3}}}}} \\ \qquad\qquad+\,{ 4.25 \times \frac{{{{{\mathrm{log}}}}\left(\rho ^{{{\mathrm{C}}}}\right) \times \left({{{\mathrm{EA}}}}^{{{\mathrm{S}}}} \,-\, {{{\mathrm{CB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\right)}}{{{{{\mathrm{EA}}}}^{{{\mathrm{S}}}}/{{{\mathrm{CB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\, - \,{{{\mathrm{VB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}/{{{\mathrm{EA}}}}^{{{\mathrm{S}}}}}} + 0.61} \end{array}$$
(2)
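For readers who wish to evaluate Equation 2 numerically, the following Python sketch transcribes it term by term. Several points are assumptions rather than statements from the text: the logarithm is taken to be the natural logarithm, the inputs are assumed to be in the same units as the primary features used to train the model, and the function and argument names are illustrative. The example feature values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def m23(et_c, et_s, ea_s, rho_c, atom_num_c, cb_disp_c, vb_disp_c):
    """Evaluate Equation 2 (model M2,3).

    et_c, et_s   : triplet-state formation energy of the crystal / single molecule
    ea_s         : single-molecule electron affinity
    rho_c        : crystal density
    atom_num_c   : number of atoms in the unit cell
    cb_disp_c    : conduction band dispersion of the crystal
    vb_disp_c    : valence band dispersion of the crystal
    Natural logarithms are an assumption of this sketch.
    """
    # First term: -0.35 * [(E_T^C + EA^S) * (E_T^S * rho^C)] / [log(N) / N^(1/3)]
    term1_num = (et_c + ea_s) * (et_s * rho_c)
    term1_den = math.log(atom_num_c) / atom_num_c ** (1.0 / 3.0)

    # Second term: 4.25 * [log(rho^C) * (EA^S - CB_disp)] / [EA^S/CB_disp - VB_disp/EA^S]
    term2_num = math.log(rho_c) * (ea_s - cb_disp_c)
    term2_den = ea_s / cb_disp_c - vb_disp_c / ea_s

    return -0.35 * term1_num / term1_den + 4.25 * term2_num / term2_den + 0.61


# Hypothetical inputs: with rho_c = 1 the second term vanishes because log(1) = 0.
example = m23(et_c=1.0, et_s=1.0, ea_s=1.0, rho_c=1.0,
              atom_num_c=8, cb_disp_c=0.5, vb_disp_c=0.5)
```

Both denominators can vanish for particular feature values (for example, a unit cell with a single atom), so a production implementation would need to guard against division by zero.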

The only single-molecule features included in \(M_{2,3}\) are the electron affinity, \({{{\mathrm{EA}}}}^{{{\mathrm{S}}}}\), and the triplet-state formation energy, \(E_{{{\mathrm{T}}}}^{{{\mathrm{S}}}}\). The remaining features are crystal features, including the crystal density, \(\rho ^{{{\mathrm{C}}}}\), the number of atoms in the unit cell, \({{{\mathrm{AtomNum}}}}^{{{\mathrm{C}}}}\), the conduction band and valence band dispersion, \({{{\mathrm{CB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\) and \({{{\mathrm{VB}}}}_{{{{\mathrm{disp}}}}}^{{{\mathrm{C}}}}\), and the crystal triplet-state formation energy, \(E_{{{\mathrm{T}}}}^{{{\mathrm{C}}}}\). \(M_{2,3}\) achieves almost 100% classification accuracy for the training set. In addition, \(M_{2,3}\) yields 100% on all three metrics of sensitivity, specificity, and accuracy for the test set. Based on its performance, \(M_{2,3}\) is selected for the second stage of screening with a selection threshold of −0.62 − 0.15 = −0.77 eV, where 0.15 eV is the training set RMSE. We note that some materials admitted by the threshold of −0.77 eV could turn out to be promising for SF if renormalization of the exciton energies due to phonons is considered in post-processing75. At the second level of screening, all 24 SF candidates in the PAH101 set and four non-promising materials pass, filtering out five of the nine non-promising candidates that passed the first stage. Owing to the high computational cost of GW + BSE calculations, every non-promising material filtered out may save \(10^{5}\)–\(10^{6}\) CPU hours.
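The two-stage logic can be summarized in a short Python sketch. It is a schematic illustration, not the implementation used in this work: the material names, the stored predictions, and the function names are hypothetical, and the two models are represented by placeholder callables. Only the two thresholds are taken from the text.

```python
# Selection thresholds from the text: SF threshold minus the RMSE of each model.
STAGE1_THRESHOLD_EV = -0.62 - 0.22   # M1,2, approximately -0.84 eV
STAGE2_THRESHOLD_EV = -0.62 - 0.15   # M2,3, approximately -0.77 eV


def hierarchical_screen(materials, cheap_model, costly_model):
    """Apply the cheap model first and the costly model only to survivors.

    Returns the names of materials passing both stages and the number of
    costly-model evaluations that were actually needed.
    """
    survivors = []
    costly_evaluations = 0
    for name, features in materials.items():
        # Stage 1: discard materials rejected by the low-cost model.
        if cheap_model(features) < STAGE1_THRESHOLD_EV:
            continue
        # Stage 2: evaluate the more expensive model only for stage-1 survivors.
        costly_evaluations += 1
        if costly_model(features) >= STAGE2_THRESHOLD_EV:
            survivors.append(name)
    return survivors, costly_evaluations


# Hypothetical materials with precomputed model predictions (eV).
toy_materials = {
    "A": {"m12": -0.30, "m23": -0.35},   # passes both stages
    "B": {"m12": -0.80, "m23": -0.90},   # passes stage 1, rejected at stage 2
    "C": {"m12": -1.20, "m23": -1.10},   # rejected at stage 1
}

passed, n_costly = hierarchical_screen(
    toy_materials,
    cheap_model=lambda f: f["m12"],      # placeholder for M1,2
    costly_model=lambda f: f["m23"],     # placeholder for M2,3
)
```

Only the materials returned in `passed` would proceed to GW + BSE calculations, and the costly model is never evaluated for materials rejected at the first stage.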

The variance between the predictions of different models for a given material may be used as a measure of uncertainty. Figure 5 shows the range of predictions produced by the two models selected for the hierarchical screening workflow, \(M_{1,2}\) and \(M_{2,3}\), for all the materials in the PAH101 set, arranged in order of increasing SF driving force from left to right. For almost 90% of the PAH101 set, the predictions of the two models are within 0.2 eV of each other. Most of the materials for which the predictions of the two models significantly diverge are outside of the promising region for SF. As shown in Fig. 5, the three materials with high prediction uncertainty in the non-SF candidate region are molecules with singly-bonded benzene rings and a graphene nanoflake. Both classes are rare in the PAH101 set, leading to a high uncertainty between different models due to insufficient training data. In the SF candidate region, no significant uncertainty is observed. The improved model performance in the region of interest for SF may be attributed to the preferential selection of materials from this region for the LCV validation set. One material, 7,14-Di-n-butyldibenzo[de,mn]naphthacene (CSD reference code KAGGIK), has a relatively high prediction error. KAGGIK is a zethrene derivative with two long alkyl side groups, making it chemically distinct from most of the PAH101 set. Most of the materials with high prediction variance are the same outliers for which the models with lower complexity have high prediction errors in Fig. 3. Within a hierarchical screening workflow, materials for which the predictions of different models significantly diverge may be selected for GW+BSE calculations, even if they are not promising candidates for SF, for the purpose of model refinement.
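A minimal Python sketch of this uncertainty measure is given below. The material labels and predicted values are hypothetical, the function names are illustrative, and the tolerance of 0.4 eV is an illustrative choice rather than a prescribed cutoff.

```python
def prediction_range(predictions_ev):
    """Spread between the highest and lowest model prediction for one material."""
    return max(predictions_ev) - min(predictions_ev)


def flag_uncertain(predictions_by_material, tolerance_ev=0.4):
    """Return the materials whose model predictions diverge by more than the tolerance.

    Flagged materials could be prioritized for reference calculations and
    model retraining, regardless of their predicted driving force.
    """
    return sorted(
        name
        for name, predictions in predictions_by_material.items()
        if prediction_range(predictions) > tolerance_ev
    )


# Hypothetical predictions (eV) of two models, (M1,2, M2,3), for three materials.
toy_predictions = {
    "X": (-0.50, -0.55),   # models agree closely
    "Y": (-1.80, -1.20),   # models diverge by 0.6 eV
    "Z": (-1.00, -0.85),   # moderate disagreement, below the tolerance
}

flagged = flag_uncertain(toy_predictions)
```

With only two models the range reduces to the absolute difference between their predictions; the same function applies unchanged if more models are included.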

Fig. 5: Uncertainty analysis.

The range of predictions produced by models M1,2 and M2,3 for the PAH101 set. The materials are arranged in order of increasing GW+BSE SF driving force from left to right. The red dots indicate the GW + BSE@PBE SF driving force, and the blue error bars represent the prediction range of the two SISSO models. The region of promising SF candidates is highlighted in yellow and magnified in the inset. Molecular structures of non-SF materials with a prediction range higher than 0.4 eV and the SF material with the highest prediction error are also shown.

Promising SF candidates

Further analysis is performed, using GW+BSE, for the materials that are consistently classified as promising by the selected SISSO-generated models. For most of the promising SF candidates in the PAH101 set, including pentacene, tetracene, rubrene, quaterrylene, phenylated acenes, pyrene-fused acenes, and zethrene derivatives, detailed analyses have been published elsewhere16,29,42,43,44. Three additional promising SF candidates discovered among the materials studied here are BCPP, TBPT, and DPNP. Their crystal structures, reported in refs. 99,100,101, are visualized in Fig. 6. These compounds belong to chemical families of PAHs not previously explored in the context of SF. BCPP and DPNP are non-alternant PAHs containing five-membered rings fused with six-membered rings. TBPT is somewhat reminiscent of a rylene. In Fig. 7 BCPP, TBPT, and DPNP are compared to the known SF materials tetracene, rubrene, diphenyltetracene (DPT), and diphenylpentacene (DPP) with respect to a two-dimensional descriptor for SF performance16,29,43,44. The primary descriptor is the SF driving force, plotted on the x-axis. A high driving force indicates that a material is likely to undergo SF at a high rate. However, an overly high driving force would lead to energy losses in solar energy conversion. Therefore, a driving force between tetracene and pentacene is considered optimal.

Fig. 6: Crystal structures of SF candidates.

The crystal structures of BCPP, TBPT, and DPNP. The carbon and hydrogen atoms are colored in brown and white, respectively.

Fig. 7: 2D descriptor of SF performance.

BCPP, TBPT, and DPNP compared to known SF materials, colored in red, with respect to a two-dimensional descriptor calculated with GW + BSE. The thermodynamic driving force for SF (\(E_{{{\mathrm{S}}}} - 2E_{{{\mathrm{T}}}}\)) is displayed on the x-axis and the singlet exciton charge transfer character (%CT) is displayed on the y-axis.

The secondary descriptor, displayed on the y-axis, is the degree of charge transfer character (%CT) of the singlet exciton wave function. This descriptor is motivated by the growing body of experimental evidence for the involvement of an intermediate charge transfer state in the SF process4,102,103,104,105. A singlet exciton with a high degree of charge transfer character, i.e., with the hole and the electron probability distributions centered on different molecules, is thought to be favorable for SF4,21,106,107,108. The SF driving force of BCPP is comparable to tetracene but its %CT is significantly lower. Considering the relatively slow fission rate in crystalline tetracene109,110,111, slow SF could be observed in the BCPP crystal. DPNP has a comparable SF driving force to that of DPT and a much higher %CT of almost 90%. TBPT has a slightly lower SF driving force than pentacene and a comparable %CT. Based on this, DPNP and TBPT may undergo faster SF than tetracene with a smaller energy loss than pentacene.

In summary, to accelerate the computational discovery of potential materials for intermolecular singlet fission in the solid state, we have used machine learning to generate models that are fast to evaluate and accurately predict the thermodynamic driving force, which is the primary criterion for singlet fission to occur. To this end, a dataset of GW + BSE calculations of the SF driving force of 101 polycyclic aromatic hydrocarbons (PAH101) was compiled. The SISSO machine-learning algorithm was used to generate models with a varying degree of complexity by combining physically motivated primary features. Subsequently, the most predictive models were selected by linear regression with cross-validation.

Several SISSO-generated models demonstrated good prediction performance with a training set RMSE below 0.2 eV. The accuracy of the SISSO-generated models far exceeded that of human-generated baseline models based on DFT estimates of the single molecule and crystal SF driving force. The few outliers, most of which were outside the region of interest for SF, were somewhat chemically different from most chromophores in the PAH101 set. Based on considerations of cost, accuracy, and classification performance, we have proposed a hierarchical screening workflow comprising two SISSO-generated models with increasing cost and accuracy. Thresholds were set based on model RMSE to allow a small number of false positives while ensuring that no viable SF candidates were missed. All 24 promising SF candidates in the PAH101 set successfully passed through the workflow with only four false positives. In a materials screening scenario, GW + BSE calculations would be performed only for the materials that pass all stages of the SISSO-based screening. In addition, we have proposed using the variance in the predictions of different SISSO-generated models for a given material as a measure of uncertainty. A large variance in the SISSO model predictions for a certain material may indicate that it should be selected for GW + BSE calculations, even if it is not a promising SF candidate, for the purpose of model retraining and refinement.

Finally, three potentially promising SF materials that have not been reported previously were discovered in the PAH101 set: BCPP, TBPT, and DPNP. For these materials, further analysis was performed using GW + BSE. They were compared to known SF materials with respect to a two-dimensional descriptor based on the thermodynamic driving force and the singlet exciton charge transfer character. BCPP was found to have a thermodynamic driving force comparable to tetracene but a significantly lower CT character, indicating that it may undergo slow singlet fission. TBPT and DPNP were found to have a thermodynamic driving force between tetracene and pentacene and a high degree of singlet exciton CT character. This indicates that they may undergo faster SF than tetracene with a smaller energy loss (higher energy efficiency) than in pentacene. BCPP, TBPT, and DPNP belong to chemical families that have not been studied in the context of SF to date. This may help steer experimental efforts in new directions.

Thus, we have successfully used the SISSO machine-learning algorithm to find predictive models for excited-state properties of molecular crystals, whose computational cost is sufficiently low to enable large-scale screening in search of SF materials. In the future, we will use the SISSO-generated models to screen materials datasets. We note that the present models are not expected to perform well for materials that are significantly chemically different from PAHs because that would be an extrapolation. However, there are many additional PAHs in the CSD, and PAH structures continue to be solved and added at an increasing rate with the advent of 3D electron diffraction (e.g., ref. 45). As additional data are acquired, the SISSO-generated models may be retrained and refined for more chemically diverse systems. A similar approach may be used for other materials discovery efforts where properties of interest are expensive to compute or measure, making training data scarce.

Methods

Primary feature calculation

Crystal features were evaluated for a locally-optimized geometry with the unit cell lattice vectors fixed at their experimental values. Single-molecule features were evaluated for molecules extracted from these locally-optimized crystal structures. The primary features were calculated using the FHI-aims package112,113 with the PBE functional, tight numerical settings, and tier-2 basis sets112. Details of the k-point grid settings for each crystal are provided in the Supplementary Information.

SF driving force calculation

The SF driving force of crystals was calculated after full unit cell relaxation. The Quantum ESPRESSO114 package was used to generate the mean-field eigenvalues and eigenfunctions using the PBE exchange-correlation functional with Troullier–Martins norm-conserving pseudopotentials115. The wave functions were generated using a kinetic energy cutoff of 50 Ry. The BerkeleyGW package116 was used to conduct many-body perturbation theory (MBPT) calculations within the GW approximation and to solve the Bethe–Salpeter equation (BSE). About 550 unoccupied bands were included in the calculation of the GW dielectric function and self-energy operator. The static remainder correction was applied to accelerate the convergence with respect to the number of unoccupied states117. Twenty-four valence bands and 24 conduction bands were included in the calculation of the BSE kernel. The Tamm–Dancoff approximation (TDA) was applied when solving the BSE116. The coarse and fine k-point grid settings for each crystal are provided in the Supplementary Discussions.