Fig. 2: Progress of AL suggestion and experimental acquisition and surrogate model interpretability.
From: Active learning accelerates electrolyte solvent screening for anode-free lithium metal batteries

a 2D t-SNE plots showing the locations of suggested molecules (denoted by different marker shapes) in the virtual chemical search space across batches. The virtual chemical space in the background is colored by functional group class. b Performance of electrolytes acquired in each batch. Electrolytes are tested in Cu||LFP cells, and the discharge capacity at the 20th cycle is used to compare performance. Capacity values are averaged over two replicate cells. The distribution and mean of performance in each batch are depicted by the vertical violin plots and square markers, respectively. c Fraction of exclusive ether molecules in the top suggestions in each batch. d Fraction of exclusive ether molecules among the molecules acquired from each batch. ‘Batch 0’ refers to the in-house dataset. e Fraction of common molecules in the top suggestions from each pair of consecutive batches. f t-SNE plot showing the locations of labeled electrolyte solvent molecules in the complete search space. Differently colored points indicate molecules acquired from the pool of top suggested molecules in each batch. For clarity, structures are shown only for the electrolyte solvent molecules with the highest discharge capacity in each of batches 2 to 6 and for all top-performing molecules in batch 7. The green, orange, and red tick marks denote the presence of the three important molecular substructures predicted to be most relevant by SHAP. The 7 best electrolyte solvent molecules identified by the AL framework appear in the two bottom rows below the t-SNE plot and are marked by purple stars shown after the discharge capacities in parentheses. g SHAP summary plot for \({C}_{{norm}}^{20}\) prediction from the rational quadratic (RQ) surrogate model obtained from batch 7. h Molecular substructures that contribute the most to the top molecular fingerprints identified by SHAP for \({C}_{{norm}}^{20}\) prediction.
i Fraction of all labeled molecules containing the three molecular substructures \({f}_{{solv}}^{6}\), \({f}_{{solv}}^{4}\), and \({f}_{{solv}}^{5}\) (left y-axis; see Supplementary Note 10) and average \({C}_{{norm}}^{20}\) (ground truth; \(\langle {C}_{{norm}}^{20}\rangle\)) for electrolyte solvents containing these substructures (right y-axis). \({C}_{{norm}}^{20}\) = normalized discharge capacity at the 20th cycle.
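
The two quantities plotted in panel i can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the helper name `substructure_summary`, the string keys such as `"f_solv_6"`, and all numerical values are hypothetical and chosen only to show the arithmetic (the fraction of labeled molecules containing a substructure, and the mean \({C}_{{norm}}^{20}\) over those molecules).

```python
from statistics import mean


def substructure_summary(molecules, substructure):
    """Return (fraction of molecules containing `substructure`,
    mean normalized 20th-cycle capacity among those molecules).

    The mean is None when no molecule contains the substructure.
    """
    if not molecules:
        raise ValueError("need at least one labeled molecule")
    hits = [m for m in molecules if substructure in m["substructures"]]
    fraction = len(hits) / len(molecules)
    avg_capacity = mean(m["c_norm_20"] for m in hits) if hits else None
    return fraction, avg_capacity


# Hypothetical toy records; these are NOT data from the paper.
labeled = [
    {"name": "mol_A", "substructures": {"f_solv_6", "f_solv_4"}, "c_norm_20": 0.75},
    {"name": "mol_B", "substructures": {"f_solv_6"}, "c_norm_20": 0.25},
    {"name": "mol_C", "substructures": set(), "c_norm_20": 0.5},
    {"name": "mol_D", "substructures": {"f_solv_5"}, "c_norm_20": 0.5},
]

for key in ("f_solv_6", "f_solv_4", "f_solv_5"):
    fraction, avg_capacity = substructure_summary(labeled, key)
    print(key, fraction, avg_capacity)
```

In this sketch the fraction is taken over all labeled molecules, matching the wording of the caption; how substructure membership is actually determined is described in Supplementary Note 10 and is not reproduced here.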