Introduction

The discovery of novel materials drives industrial innovation1,2,3, although the pace of discovery tends to be slow due to the infrequency of “Eureka!” moments4,5. These moments are typically tangential to the original target of the experimental work: “accidental discoveries”. Here we demonstrate the acceleration of intentional materials discovery—targeting material properties of interest while generalizing the search to a large materials space with machine learning (ML) methods combined with experiment in a feedback loop. We demonstrate a closed-loop joint ML-experimental discovery process targeting unreported superconducting materials, which have industrial applications ranging from quantum computing to sensors to power delivery6,7,8,9. By closing the loop, i.e., by experimentally testing the ML-generated superconductivity predictions and feeding the results back to refine the ML model, we demonstrate that success rates for superconductor discovery can be more than doubled10. In four closed-loop cycles, we discovered an unreported superconductor in the Zr-In-Ni system, re-discovered five superconductors unknown to the training datasets, and identified two additional phase diagrams of interest for superconducting materials. Our work demonstrates the critical role experimental feedback plays in ML-driven discovery, and provides definitive evidence that such technologies can accelerate discovery even in the absence of knowledge of the underlying physics.

Statistical approaches have long aimed to better understand and predict superconductivity11, most recently through the use of black-box ML methods12,13,14,15,16,17,18. Although resulting in numerous predictions, these studies have not yielded previously unreported families of superconductors, likely not only because of difficulties in extrapolating beyond known families, but also because the predicted materials have chemical attributes that make them unlikely to be superconducting—whether it is highly localized chemical bonding, e.g., those containing polyatomic anions, or an extreme metastability that precludes synthesizability. Further, existing works have treated materials and databases of material properties as fixed snapshots rather than evolving systems, which limits the ability of ML models to learn over sparse data.

Here we report on combining ML techniques with materials science and physics expertise to “close the loop” of materials discovery (Fig. 1). We demonstrate how to make ML models generalize across diverse materials spaces in order to identify superconductors that are dissimilar to those in the training corpus. By alternating between ML property prediction and experimental verification, we are able to systematically improve the fidelity of ML property prediction in regimes sparsely represented by existing materials databases. Crucially, this adds both negative data (materials incorrectly predicted to be superconductors) and positive data (materials correctly predicted) to the ML training set, enabling the ML model’s overall representation of the space of materials to be iteratively refined. The result is an ML model for predicting superconductivity that more than doubles the rate of successful predictions10, demonstrating the acceleration of materials discovery by combining human and machine insight.

Fig. 1: The closed loop discovery process.
figure 1

Starting from curated experimental data of known superconductors (1), compositional information is first transformed into a representation suitable for learning using the RooSt27 framework (2). After initial training of the ML model (3), we provide new compositions not known to the ML model from other sources, and obtain predictions of superconducting behavior (4). The synthesizability of these predictions is assessed using a combination of computational thermodynamic data and expert insight (5). Materials downselection (6) occurs with human input, based on multiple criteria, to maximize the impact experimental work has on model improvement. Chosen materials are then synthesized, and their structural and physical properties measured (7). Results are then fed back into the learning process, in addition to generating discoveries. Further details on the closed-loop process are in “Closed-loop discovery process” in Methods. RooSt images used with permission, CC-BY-4.0 license27.

Our process uses active learning19 to iteratively select data points to be added to a training set. In particular, we select materials that are both predicted to be high transition temperature (Tc) superconductors and are sufficiently distinct from known superconductors. We also leverage human domain expertise to further refine selections. When the predictive model incorrectly predicts non-superconductors as superconductors, this valuable negative data helps refine the model’s prediction surface.

A key attribute of our work is that the training data used in the ML models is not static, but evolves as the closed-loop process proceeds. An ML model employed in a closed-loop framework, actively sampling previously unexplored regions of materials space and continually acquiring new data, has no fixed notion of convergence: the target distribution changes with every loop. Thus, instead of a traditional convergence metric (e.g., looking for a flattening of loss versus number of training epochs for a convolutional neural network), we leverage goal-based metrics—stopping when the model successfully predicts superconductors not in the training set, or when the human in the loop assesses that model outputs are chemically plausible yet sufficiently distinct from prior predictions. This helps avoid model overfitting by terminating the process earlier than a traditional metric would, while maximizing the usefulness of the new experimental data for further refining the model.

Utilizing this iterative “closed-loop” approach, we rediscovered five known superconductors outside of the ML model’s training set (Table 1). These materials come from a wide variety of families: iron pnictides, doped 2D ternary transition-metal nitride halides, and intermetallics (Table 2). We then report the discovery of a previously unreported superconductor in the Zr-In-Ni phase diagram, and identify two other phase diagrams of interest (Zr-In-Cu and Zr-Fe-Sn).

Table 1 Superconductors rediscovered by machine learning.
Table 2 Distribution of Tc values in SuperCon.

Results and discussion

Model generation

For the initial prediction step of the closed-loop approach, we trained an ML model to predict the superconducting transition temperature, Tc, of candidate materials. Our primary source of training data, SuperCon20, contains compositions of known superconductors. Only the materials’ compositions were used to train the ML model for predicting Tc since SuperCon did not contain additional structural information. Materials Project (MP)21 and Open Quantum Materials Database (OQMD)22, some of the largest public sets of computational materials data, supplied candidate compositions to be screened for superconductivity. These two databases do not contain any Tc data. These three datasets are visualized in Fig. 2 using a joint representation. Crucially, the amount of data for which we have superconducting information is much smaller than our other sources of data and is not uniformly sampled across the joint space.

Fig. 2: Training data sparseness and finding non-derivative superconductors.
figure 2

Histograms of the concentration of materials from a Uniform Manifold Approximation and Projection (UMAP)60 embedding of OQMD (without superconductivity information), MP (without superconductivity information), and SuperCon (superconductivity information), based on Magpie30 descriptors for the datasets. The embedding is learned from the concatenation of Magpie descriptors obtained from all three datasets; the same axis limits are used across each subplot. These maps show the sparseness of data about superconductivity compared to that of all known and predicted compounds in these open databases. Tc is not part of the Magpie descriptors and, therefore, did not influence the representation. The five black symbols indicate rediscovered superconductors (Table 1), and the red symbol our superconductor, near “ZrNiIn4”. The inset on the right highlights the local region in which “ZrNiIn4” is found, which is sparse and far from the known and rediscovered superconductors.

It is well-known23 that when ML methods make predictions on data outside of their training data distribution, accuracy often suffers; this is often called the out-of-distribution generalization problem. In cheminformatics24, it is common to assess whether a dataset is within the distribution of a training dataset by seeing how far, in some representative metric space, its points are from the training dataset: as the difference between the distribution of new data and the training data increases, the likelihood that a model will accurately predict their properties decreases. To improve assessment of generalization, out-of-distribution data may be simulated by creating validation sets that split based on non-random criteria like Murcko scaffold25 or cluster identity, the latter being the leave-one-cluster-out cross-validation (LOCO-CV) strategy26.

In “Model Validation” in Methods, we apply LOCO-CV in a simulated superconductor-identification problem. We show that, although a strong ML model is capable of fitting the training set well and generalizing to held-out in-distribution test data, it fails to make accurate predictions of superconducting status on out-of-distribution data. Because existing superconductor datasets are not sufficient to enable accurate identification of unreported superconductors, this motivates the need for multiple iterations of model training, candidate selection, candidate synthesis, and model retraining.

We rely on a recent ML model for chemical property prediction, Representation learning from Stoichiometry (RooSt)27 (see “Computational Methods and Uncertainty” in Methods and the SI), to predict a material’s superconductivity using only its stoichiometry (i.e., ignoring the material’s crystal structure). Although not as powerful as approaches incorporating structural information16,28,29, this choice enables much broader screening, because candidate compositions can be evaluated without knowledge of their crystal structures.

Superconductivity-specific considerations

After training an ensemble of RooSt models using the SuperCon database, we apply them to our set of potential superconductors (i.e., MP and OQMD). We filter for materials likely to be high-Tc superconductors, and then selected materials are synthesized and characterized, enabling the ML model to be retrained in further loop iterations.

A risk of searching for superconductors from a static list of candidates is that, while a material in MP or OQMD may not have exactly the same composition as a known superconductor, it may be extremely close in stoichiometry, such as MgB2 vs. Mg33B67. Thus, every time we produce a new list of candidates, we identify each candidate’s minimal Euclidean distance, in Magpie-space30, to any point in our training data, and we remove candidates too close to SuperCon.
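The near-duplicate screen described above can be sketched in a few lines. Here plain NumPy arrays stand in for Magpie descriptor vectors, and the distance cutoff `min_dist` is an illustrative value, not one reported in the text:

```python
import numpy as np

def filter_near_duplicates(candidates, training, min_dist=1.0):
    """Drop candidate feature vectors whose minimal Euclidean distance
    to any training-set vector falls below min_dist.

    candidates: (n, d) array of candidate descriptors
    training:   (m, d) array of training-set (SuperCon) descriptors
    """
    # Pairwise distances via broadcasting: (n, m), then per-candidate minimum
    diff = candidates[:, None, :] - training[None, :, :]
    d_min = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    return candidates[d_min >= min_dist], d_min

# Toy example: 2-D stand-ins for Magpie descriptor vectors
train = np.array([[0.0, 0.0], [1.0, 1.0]])
cands = np.array([[0.1, 0.0],   # near-duplicate of a training point
                  [5.0, 5.0]])  # genuinely distant candidate
kept, dists = filter_near_duplicates(cands, train)
```

For large candidate pools a k-d tree (e.g., `scipy.spatial.cKDTree`) would avoid materializing the full pairwise distance array.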

It is not practical to experimentally verify all ML predictions. The costs associated with fabricating and characterizing a new material are high; hence we are only able to experimentally analyze a small subset of the ML predictions.

The MP and OQMD databases both contain calculated stability information not used by the ML model. Of 190 predicted superconductors in a given prediction round, only 39 compounds were calculated to be stable (energy above the convex hull, Ehull = 0.00 eV/atom), but 83 were nearly stable (Ehull < 0.05 eV/atom). Stable materials and those with prior experimental reports were prioritized to increase the likelihood that targeted compounds could be successfully synthesized. Prioritizing these materials ensured that failures to observe superconductivity were indicative of the behavior of the targeted compound rather than a failure to synthesize that compound.
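As a minimal illustration of this stability triage (the formulas and hull energies below are made up; only the 0.00 and 0.05 eV/atom thresholds come from the text):

```python
# Hypothetical candidate records with MP/OQMD-style energy-above-hull values
candidates = [
    {"formula": "A3B",  "e_hull": 0.00},   # stable
    {"formula": "AB4",  "e_hull": 0.03},   # nearly stable
    {"formula": "A2B5", "e_hull": 0.21},   # likely hard to synthesize
]

# Apply the two thresholds used in the text ("nearly stable" includes stable)
stable        = [c for c in candidates if c["e_hull"] == 0.0]
nearly_stable = [c for c in candidates if c["e_hull"] < 0.05]
```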

Insulating materials like β-ZrNCl and the cuprates superconduct with high Tcs only once they are doped into a metallic state31. One long-running challenge for machine-learning approaches to predicting high-Tc superconductivity is that large-bandgap insulators incapable of superconductivity tend to be given overweighted classification scores, likely due to the high Tcs of the cuprates16. Therefore, metals and easily doped materials were favored for testing. Similarly, for some predicted metals, we investigated nearby compounds with similar structures that were known in the literature but were not found in MP or OQMD (e.g., Zr3Fe4Sn4 and Hf3Fe4Sn432,33) and isostructural compounds with promising band structures (e.g., ZrNi2In).

Since the Tcs of compounds are very sensitive to alloy disorder and lattice parameter, we explored several compositions near each prediction34. We also considered the ease and safety of synthesizing the target materials (e.g., by excluding extremely high-pressure syntheses). Powder X-ray diffraction (XRD) was used to ensure that the target material was successfully synthesized and temperature-dependent AC magnetic susceptibility was used to screen for superconductivity. Superconductors are perfectly diamagnetic below their Tc with minimal applied field.

Material candidate experimental verification

To illustrate the sensitivity of experimentally measured Tcs to processing conditions, we made and tested samples with A3B stoichiometry (Fig. 3a), including many known superconductors from the A15 family35. Similar compositional sensitivity is common in other systems beyond A15 compounds. For example, as x varies between 0 and 0.35, La2−xSrxCuO4 can vary from not superconducting to having a Tc up to 36 K15. Our experiments show that high-throughput synthesis and characterization techniques can reliably and quickly screen systems for superconductivity; optimization of many superconducting phases, however, requires much lower-throughput techniques for preparing phase-pure and fully superconducting samples.

Fig. 3: Experimental data for feedback and discovery.
figure 3

a Evaluation of our high-throughput synthesis of compounds with A3B stoichiometry (including A15 compounds) demonstrates the effects of processing on the measured Tc and our ability to positively identify superconductors quickly. For the superconductor in the Zr-In-Ni phase diagram, samples of various compositions were tested (b and c). The size of the datapoints in (c) reflects the fraction of the superconducting phase present, as estimated from the magnitude of the transition in magnetization between 5 and 10 K (orange region). This transition was distinct from the indium-related transition (green). The compositions of samples with the strongest superconducting signals cluster near the composition “ZrNiIn4”. The metastability of the superconducting phase precluded its isolation as a single phase.

Using this closed-loop method and high-throughput synthesis, we re-discovered five known superconductors that were not represented in the ML training dataset; a list of these is found in Table 1. Alongside these successful predictions, the ML model also returned compositions that experts could readily identify as unlikely superconductor candidates. Therefore, it was important to compare the successful prediction rates of the combined human-expert-machine approach and the machine-only approach. If one considers all predictions (including those not identified as promising by the human in the loop), the rate of discovery is 5/190 (2.6%), comparable to expert-driven success rates of 3%10. When materials that experts quickly identified as unrealistic superconductor candidates were excluded (the human-machine combined approach), the successful prediction rate rose to 5/65 (7.7%), more than double that of previous expert-driven approaches10. This is particularly remarkable given the chemical diversity of the predicted candidates.

We were then able to use this ML model to discover unreported superconductors. Specifically, we find a superconducting phase in the Zr-In-Ni system, with a Tc of ~9 K (Fig. 3b, c and Extended Data) and approximate composition ZrNiIn4. No known elemental, binary, or ternary phases in the Zr-In-Ni system would explain a superconducting transition temperature this high, and the elements and binaries have been extensively investigated12,35,36,37. Unfortunately, the phase responsible for superconductivity is extremely metastable, and we have not yet found a synthesis route to obtain it in single-phase form (see SI).

Conclusions

We have presented the first ever “closed-loop” ML-based directed discovery of a superconductor with experimental verification (within the Zr-Ni-In system), identified two additional systems of interest (Zr-Cu-In and Zr-Fe-Sn), and rediscovered five others not represented in our ML training set.

Past revolutionary discoveries tended to happen by serendipity: finding something in material families outside of what was known at the time. Our approach, relying only on stoichiometry and a measure of “distance” from what is currently known, is more likely than ML-guided approaches confined to a single family of materials to find unreported materials of interest, and it provides a sense of where unexplored but promising materials lie.

This approach improves performance with experience, in that with every closing of the loop, the ML model undergoes feedback and refinement, enabling efficient exploration of materials space. These improvements ultimately will reduce the cost of materials development and discovery. The success of this approach has been demonstrated by discoveries and rediscoveries coming from vastly different families, illustrating the potential of this tool for the discovery of materials with targeted properties. This methodology can be expanded to target more than one desired property, and applied to domains beyond superconductors as long as a mechanism for new data acquisition based on ML-based predictions can be leveraged.

Further, we engaged in only a small number of total prediction/experimental-measurement iterations; to maximize the superconducting transition temperatures of superconductors discovered over further iterations, we can use acquisition functions developed for Bayesian optimization38,39. Our approach retains a human in the loop for synthesizing and characterizing materials, but further automation is possible, involving, e.g., ML systems selecting experiments to be conducted, or robot-powered self-driving laboratories40,41,42. Thus we demonstrate a viable path for these methods to accelerate materials discovery.

Methods

Data

Our initial data source containing the superconducting transition temperature, Tc, of many known compounds is the SuperCon database20, published by the Japanese National Institute for Materials Science. More details and analyses about SuperCon are available in the SI.

In this work, we use the version of SuperCon released by Stanev et al.12, available online. This contains 16,414 material compositions and associated critical temperature measurements. However, some of these compositions are invalid (e.g., Y2C2Br0.5!1.5) and were removed prior to analysis. Our final training dataset has 16,304 valid compositions. In the Extended Data and the SI, we give additional detail about our training dataset. Supplementary Fig. 1 shows the distribution of Tc values in our training data—note that the distribution is weighted toward low-Tc compositions.

We use MP21 and OQMD22 as the set of candidates to screen with ML for superconducting potential. MP and OQMD are some of the largest public sets of computational materials data. Their records contain full crystallographic information for material structures, along with some associated electronic and mechanical properties (but not, importantly, Tc). We scraped MP for material records present in it as of October 2020 using the MPRester class from the pymatgen43 package, obtaining 89,341 unique compositions. We later downloaded the entire OQMD v1.4 database, obtaining 252,978 unique compositions. The Extended Data contains a table of MP and OQMD material IDs used in this study.

Computational methods and uncertainty

RooSt27 is a graph neural network44 that relates material composition to properties by applying a message-passing scheme45 to a weighted graph representation of the composition’s stoichiometry, producing a real-valued embedding vector. To make a prediction, this embedding is then passed through a feedforward network.

In this work, we make use of the publicly available implementation of RooSt, which is written in PyTorch46. Furthermore, we use the default hyperparameters recommended by the RooSt authors, including basing the initial species representation vectors on the matscholar embedding47. Since we seek materials likely to be high-Tc superconductors, and we expect RooSt’s classification model to generalize poorly on out-of-distribution data, we filter for materials predicted to be in the highest Tc tertile (Tc ≥ 20 K) with a classification score of at least 0.66 (see SI).
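The tertile-plus-score filter can be sketched as follows; the class ordering and the toy probabilities are assumptions for illustration, with only the 0.66 threshold and the Tc ≥ 20 K tertile taken from the text:

```python
import numpy as np

def select_candidates(class_probs, high_tc_index=2, threshold=0.66):
    """Keep candidates whose predicted probability of the highest-Tc
    class (assumed to be Tc >= 20 K) meets the score threshold."""
    scores = class_probs[:, high_tc_index]
    return np.flatnonzero(scores >= threshold)

# Toy softmax outputs over the three Tc tertiles (<2 K, 2-20 K, >=20 K)
probs = np.array([
    [0.10, 0.20, 0.70],  # confident high-Tc prediction -> keep
    [0.40, 0.40, 0.20],  # low score -> discard
    [0.20, 0.14, 0.66],  # exactly at the threshold -> keep
])
selected = select_candidates(probs)
```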

RooSt models incorporate two sources of uncertainty in their Tc predictions: we account for aleatoric uncertainty (randomness inherent in the input data) by letting a model estimate a mean and standard deviation for each label’s logit48, and we incorporate epistemic uncertainty (uncertainty in the model itself) by averaging over an ensemble of independently trained RooSt models49.
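A minimal numerical sketch of how these two uncertainty sources combine for a single prediction, assuming each ensemble member outputs a mean logit and an aleatoric standard deviation (all numbers below are invented):

```python
import numpy as np

# Hypothetical per-model outputs for one candidate: each ensemble member
# predicts a mean logit and a standard deviation (aleatoric) for a class.
ensemble_means  = np.array([1.8, 2.1, 1.9, 2.2])   # mean logits, one per model
ensemble_sigmas = np.array([0.3, 0.4, 0.2, 0.3])   # predicted aleatoric std devs

mean_logit = ensemble_means.mean()          # ensemble-averaged prediction
aleatoric  = (ensemble_sigmas ** 2).mean()  # data noise, averaged over models
epistemic  = ensemble_means.var()           # model disagreement
total_var  = aleatoric + epistemic          # combined predictive variance
```

This mean/variance decomposition is a common way to combine the two uncertainty types in deep ensembles; the exact aggregation used with RooSt may differ in detail.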

Problem formulation

We formulate our prediction problem as an uncertainty-aware classification task. As shown in the Supplementary Information, the distribution of Tc values in SuperCon is skewed, with a large number of materials having Tcs close to 0 K. Although we could have used a regression approach and had models estimate Tc directly, the skewed and heavy-tailed Tc distribution instead prompted us to discretize Tc into three categories, based roughly on tertiles: materials with a measured Tc less than 2 K, materials with a Tc between 2 K and 20 K, and materials with a Tc of at least 20 K. This is similar to earlier work by Stanev et al.12, who use a two-stage prediction approach in which they first classify whether a material has a Tc greater than 10 K. Depending on the specifics of the target property, our closed-loop discovery process can be used with other ML prediction formulations as well.
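The discretization itself reduces to a simple binning rule; the handling of the exact boundaries shown here is an assumption consistent with the Tc ≥ 20 K tertile used elsewhere in the text:

```python
def tc_category(tc_kelvin):
    """Discretize a measured Tc into the three tertile-based classes
    used for classification (boundary values taken from the text)."""
    if tc_kelvin < 2.0:
        return 0   # low:  Tc < 2 K
    elif tc_kelvin < 20.0:
        return 1   # mid:  2 K <= Tc < 20 K
    else:
        return 2   # high: Tc >= 20 K

labels = [tc_category(t) for t in (0.5, 9.0, 36.0)]
```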

In this work, we characterize the similarity between material compositions using both the RooSt latent embedding (for predicting material properties) and the Euclidean distance between material compositions’ Magpie30 representations (for determining whether superconductor candidates are sufficiently different from known superconductors to be considered discoveries). The choice of metric is not critical, as it is imposed simply to help broaden the range of materials space explored. Other works have considered alternative mechanisms for material similarity, such as representations based on element fractions50 or the earth mover’s distance51. Our discovery process does not rely on a specific similarity measure and can adopt other measures as desired.

Model validation

SuperCon provides the data for a validation experiment for our model—can RooSt successfully predict the Tc tertile of unknown materials? We evaluate this question in two settings: the first under a standard uniform cross-validation (Uniform-CV) split of SuperCon, and the second with the LOCO-CV strategy26. In the latter approach, we apply K-means clustering to the Magpie30 representation of SuperCon and then train K RooSt models, iteratively holding out each cluster as a test set. Since the clustering will put materials that are similar to each other in the same cluster, LOCO-CV is a better proxy for assessing how well our model will perform when used to identify superconductor candidates in MP.
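A sketch of generating the LOCO-CV splits, assuming scikit-learn's KMeans and a Magpie-like feature matrix (the toy data below uses three artificially well-separated groups so the clusters are unambiguous):

```python
import numpy as np
from sklearn.cluster import KMeans

def loco_cv_splits(features, k=3, seed=0):
    """Yield (train_idx, test_idx) pairs for leave-one-cluster-out CV:
    cluster the features with K-means, then hold out each cluster in
    turn as the test set."""
    labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(features)
    for cluster in range(k):
        test = np.flatnonzero(labels == cluster)
        train = np.flatnonzero(labels != cluster)
        yield train, test

# Toy feature matrix with three well-separated groups of four points each
X = np.vstack([np.zeros((4, 2)), np.ones((4, 2)) * 5, np.ones((4, 2)) * 10])
splits = list(loco_cv_splits(X))
```

Each of the K models is then trained on the `train` indices and evaluated on the held-out cluster's `test` indices.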

In this study, we set K = 3 for the clustering and summarize cluster characteristics in Table 3 and Fig. 4. Note that even this simple clustering procedure has produced inter-cluster heterogeneity—e.g., Cluster 0 is significantly smaller than the other clusters, and Cluster 1 contains the bulk of the Tc ≥ 20 K superconductors.

Table 3 LOCO-CV clustering.
Fig. 4: Statistics of Tc.
figure 4

Statistics of Tc across clusters used in the LOCO-CV study, obtained from Stanev et al.12’s version of SuperCon.

In Figs. 5 and 6, we show the results of our study. In the Uniform-CV setting, our model does well—it shows little evidence of overfitting and performs well for all three Tc categories. In LOCO-CV, however, performance degrades significantly and is also much more variable, depending on which cluster is used as the test set. Our result here echoes that of Stanev et al.12, who show that models trained only on iron-based superconductors fail to accurately predict properties of cuprates, and vice versa.

Fig. 5: Training vs. test accuracy.
figure 5

Training and test set accuracies for uniform cross-validation (Uniform-CV) vs. LOCO-CV, averaged over each fold and cluster. Bars show 95% confidence intervals based on the standard error of the mean. The model severely overfits in the LOCO-CV case, and its test set accuracy is much more cluster-dependent and variable.

Fig. 6: Test set precision and recall.
figure 6

Test set precision and recall analysis for each Tc category for the uniform vs. LOCO-CV study, averaged over each fold and cluster. Bars show 95% confidence intervals based on the standard error of the mean. The model’s metrics are much more variable and cluster-dependent for the LOCO-CV model.

These results indicate that we should not expect an ML model trained only on SuperCon to consistently identify superconductors in out-of-distribution data, and, as points in SuperCon are more similar to each other than points in MP and OQMD (Fig. 2), the LOCO-CV results here are optimistic compared to our actual problem of interest. This motivates our need for multiple iterations of model training, candidate selection, candidate synthesis, and model retraining.

Closed-loop discovery process

The initial loop iteration used Stanev et al.’s version of SuperCon12 as training data (“Data”). After the Tc-prediction model was trained, candidates were selected from MP21 based on predicted scores (“Computational Methods and Uncertainty”). The second loop iteration used SuperCon, as well as additional measurements from the first loop, as training data, and it again used MP as the set of possible candidates. The third and fourth loops again used prior iterations’ measurements as supplementary training data, but they also added OQMD22 to MP to obtain the set of possible candidates. The number of materials synthesized and characterized per loop iteration varied across loops, based on domain-expert intuition and feasibility of synthesis. This process is summarized in Table 4.
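The bookkeeping of the loop described above can be sketched schematically; everything here (the mock ground truth, the shortlist size, and the function shape) is illustrative stand-in code rather than the actual pipeline:

```python
# Mock "experiments": whether each hypothetical candidate superconducts
GROUND_TRUTH = {"A": False, "B": True, "C": False, "D": True}

def closed_loop(train_data, pool, n_per_iter=2):
    """Each iteration measures a shortlist, feeds BOTH positive and
    negative results back into the training data, and shrinks the pool."""
    discoveries = []
    while pool:
        shortlist, pool = pool[:n_per_iter], pool[n_per_iter:]  # downselection stand-in
        results = [(c, GROUND_TRUTH[c]) for c in shortlist]     # synthesis + screening
        train_data = train_data + results                       # feedback step
        discoveries += [c for c, is_sc in results if is_sc]
    return discoveries, train_data

found, data = closed_loop(train_data=[], pool=["A", "B", "C", "D"])
```

In the real process, the downselection step is where model retraining, stability filtering, and expert input enter between iterations.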

Table 4 A summary of the closed-loop iterations.

Experiment

To synthesize compounds in a medium-throughput manner, arc melting and solid state techniques were used. The standard sample size was 500–700 mg. A list of precursors used in this project is found in Supplementary Table 1 in the SI and details of the synthetic procedures are found in the SI. Additional heat treatments were performed on an as-needed basis when isolating superconducting phases.

Powder XRD patterns were collected at room temperature on the as-melted samples using a Bruker D8 Focus powder diffractometer with Cu-Kα radiation (λKα1 = 1.540596 Å, λKα2 = 1.544493 Å), Soller slits, and a LynxEye detector to verify the presence of the target phase. As an initial screen, we measured from 2θ = 5–60° with a step size of 0.018563° over 4 min. When gathering XRD patterns of samples in preparation for Rietveld refinement, 4-h measurements were performed from 2θ = 5–120° with a step size of 0.01715°.

AC-susceptibility measurements were conducted using either a Quantum Design Magnetic Property Measurement System (MPMS) (HDC = 10 Oe, HAC = 1–3 Oe, 900 Hz) or a Quantum Design Physical Property Measurement System (HDC = 10 Oe, HAC = 3 Oe, 1 kHz), measuring T ≥ 2 K. Since prior density functional theory (DFT) calculations52 suggested that CaAg2Ge2 would superconduct near T = 1.5 K, we used the 3He option with the MPMS to measure from 0.4 K to 1.7 K for that sample, in addition to our standard measurement above 2 K.