Introduction

Room-temperature sodium-sulfur (Na-S) batteries, which draw on the abundant crustal reserves of Na and S, offer a considerable theoretical specific energy (1274 Wh kg−1, calculated based on the mass of anodes and cathodes) and are promising for sustainable large-scale energy storage1,2. However, Na-S batteries suffer from low reversible capacity and rapid capacity decay, caused primarily by the shuttle effect and the incomplete transformation of intermediate sodium polysulfides (NaPS) in the sulfur reduction reaction3. Loading sulfur molecules in a porous carbon matrix with catalysts is regarded as an effective way to mitigate these problems4,5. In particular, a proper catalyst is crucial for realizing the full capacity of Na-S batteries. Various catalysts (such as metals, alloys and metal compounds) have been explored to improve the performance of S cathodes6,7,8,9,10. Among them, single-atom (SA) catalysts are especially promising owing to their maximized atomic utilization, superior catalytic activity and adjustable electronic structure11,12. Although single-atom catalysts have been extensively investigated in fields such as the oxygen reduction reaction (ORR) and the hydrogen evolution reaction (HER), their potential in Na-S batteries has received inadequate attention, which makes it difficult to establish a universal principle for designing highly efficient single-atom catalysts. Moreover, directly transferring single-atom catalysts from other fields to Na-S batteries is often indiscriminate and costly, because catalysts behave differently in different reaction systems. Therefore, it is both highly challenging and potentially rewarding to screen and design optimal single-atom catalysts and to elucidate the catalytic mechanism that enhances sulfur conversion kinetics, in order to develop Na-S devices with high energy density.

Recently, deep learning has made significant advances in natural science research, e.g., identifying suitable ionic liquids for lithium metal batteries, fabricating high-entropy alloys and predicting efficient oxygen reduction electrodes13,14,15. Natural language processing (NLP), a specialized branch of deep learning, can extract valuable knowledge from the literature and encode it in high-dimensional vectors through model training, which enables regression analysis and analogical reasoning to obtain target information for scientific research16. An NLP model that reasons over potential correlations across research fields is therefore expected to provide insights into designing highly efficient single-atom catalysts for Na-S batteries. This approach can accelerate screening by several orders of magnitude relative to traditional methods and narrow down the catalyst candidates prior to experimental verification.

Despite the profound capabilities of deep learning as an unsupervised learning technique, the training of neural networks often lacks interpretability, resembling a ‘black box’ in both training and prediction17. To promote the scientific validity and rationality of the NLP outcomes and to enhance the precision of screening potential SA catalyst candidates, it is crucial to establish an appropriate catalytic descriptor. The ideal descriptor should correlate strongly with the adsorption capability of the active sites and reflect the kinetics of NaPS conversion in the Na-S system18. Constructing such a descriptor would therefore allow a more targeted refinement of the SA candidates, significantly reducing experimental cost and time.

This work presents a recipe that combines the NLP technique with a catalytic descriptor to speed up the search for preferable single-atom catalysts for Na-S batteries with high energy density. Our workflow (Fig. 1) consists mainly of the following components: (i) converting the related literature into embeddings using a suitable NLP model, (ii) acquiring guidance from NLP statistical analysis to design potential SA candidates for the Na-S system, (iii) constructing an appropriate descriptor for precisely screening efficient single-atom catalysts, and (iv) experimentally verifying representative single-atom catalysts in Na-S batteries and dissecting the catalytic mechanism. A more detailed workflow is displayed in Fig. S1. Through these concerted efforts, the resulting high-loading Na-S pouch cells achieve a specific energy of 333.2 Wh kg−1 based on the total weight of the electrodes. In-situ X-ray absorption fine structure reveals a dynamic catalysis mechanism with a wave-like charge variation process, providing insights into the electrocatalytic behavior within Na-S systems. Hence, this work offers an effective way to bridge the NLP technique, the catalytic descriptor and experimental research, both to accelerate the screening of effective catalysts and to deepen the understanding of sulfur catalytic conversion reactions, especially for future Na-S battery systems.

Fig. 1: Approach overview.

The pipeline of the project comprises (i) formation of embeddings by a natural language processing (NLP) model, (ii) expert analysis based on ranking and statistics, (iii) construction of a binary descriptor based on density functional theory and d-band theory, and (iv) experimental tests to assess the behavior in practical sodium-sulfur (Na-S) batteries and dissect the catalytic mechanism.

Results

NLP-guided screening of efficient single-atom catalysts for the Na-S batteries

The workflow of the NLP approach in this study is shown in Fig. 2a. It involves abstract crawling through keyword retrieval, text embedding generation with various language models (both bidirectional and unidirectional), and analysis of the top-k nearest-neighbor papers. Constructing a textual dataset is the first step in the NLP task. We searched and downloaded numerous abstracts from the literature on single-atom catalysts or metal-sulfur batteries using the ScienceDirect Application Programming Interface (API). The textual dataset is then transformed into high-dimensional embeddings by NLP models for further analysis, so selecting an appropriate NLP model is crucial. Ablation studies were performed to validate the selection of the baseline model, which was evaluated using accuracy, precision, F1 score, Area Under the Receiver Operating Characteristic Curve (AUROC) and Area Under the Precision-Recall Curve (AUPR). The evaluation was conducted by randomly holding out part of the dataset and predicting label types from the extracted embeddings; the results are displayed in Fig. 2b. Although several models performed satisfactorily (e.g., Llama 3.1-70B-Instruct and MatSciBERT fine-tuned with a Multilayer Perceptron (MLP)), GPT-4o demonstrated the best performance in our ablation studies and was consequently selected as the baseline model.

Fig. 2: Natural language processing (NLP) method and analysis.

a The workflow of the NLP approach in this study. b Radar plot of the ablation studies for the baseline models, measured by accuracy, precision, F1 score, Area Under the Receiver Operating Characteristic Curve (AUROC) and Area Under the Precision-Recall Curve (AUPR). c, d Statistics of the TOP-30 nearest neighbors (TOP-30-nn): c metal centers and d matrix selection. e t-SNE results and f the magnified region where the Na-S SA articles appear within the single-atom catalysts field. The coordination environment of the single-atom catalysts is marked in the magnified plot.

To study the latent connections between the single-atom catalysts applied in Na-S batteries (Na-S SA) and those in other fields, representative articles on the Na-S SA were selected as the prediction set and used to find similar reports in the high-dimensional latent space (Fig. 2c–f)19. To facilitate visualization and analysis, t-distributed stochastic neighbor embedding (t-SNE) was used to project the high-dimensional embeddings onto a two-dimensional plane (Fig. 2e). Each dot in the figure represents the projected embedding of one article, and its color indicates the label. The distinct regions occupied by different categories of articles show that the classification task was accomplished successfully. To provide clear guidance on the selection of preferable SAs while avoiding excessive irrelevant information, the TOP-30 nearest-neighbor (TOP-30-nn) papers are analyzed in detail.
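The nearest-neighbor retrieval described here can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes the article embeddings are already available as rows of a NumPy array, uses a single query vector, and ranks papers by cosine similarity. The function name `top_k_neighbors` and the random toy embeddings are hypothetical.

```python
import numpy as np

def top_k_neighbors(query, corpus, k=30):
    """Return indices and cosine similarities of the k corpus rows closest to query."""
    corpus = np.asarray(corpus, dtype=float)
    query = np.asarray(query, dtype=float)
    # Normalise rows so that a dot product equals cosine similarity.
    corpus_unit = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    sims = corpus_unit @ query_unit
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-sims, kind="stable")[:k]
    return order, sims[order]

# Toy data only: 200 random "abstract embeddings" of dimension 16.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 16))
# A query that is a slightly perturbed copy of article 5.
query = corpus[5] + 0.01 * rng.normal(size=16)

idx, sims = top_k_neighbors(query, corpus, k=30)
```

In practice the query could be the embedding of one representative Na-S SA article (or an average over several), and the returned indices would point to the papers whose metal centers, matrices and coordination environments are then tallied.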

From Fig. 2c, significant variations were noted in the occurrence numbers of different metal centers. Magnetic metal centers, especially Fe and Co, tend to appear more frequently in the TOP-30-nn articles compared with other metal centers. The Mn center also demonstrates a higher frequency of occurrence compared to others. Surprisingly, noble metal centers, which usually reveal high catalytic efficiency in various applications, are only sporadically observable. To determine whether these observed patterns are indeed non-random and potentially indicative of underlying trends, three sets of independent random sampling experiments were performed, and the results displayed a quasi-random distribution as shown in Fig. S2. Therefore, the magnetic centers with the highest occurrence frequency are considered as suitable active metal centers for sulfur reduction reaction catalysis. This proposition will be further validated in subsequent experimental work. Furthermore, within the matrix distribution of the TOP-30-nn abstracts, the carbon matrix demonstrates absolute dominance, highlighting the preference of selecting carbon matrices for SAs due to their satisfactory electrical conductivity and sufficient space for sulfur loading (Fig. 2d).
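The comparison between the TOP-30-nn statistics and random sampling can be illustrated with a short sketch. It assumes each paper has already been tagged with a single metal-center label; the helper names and all label lists below are made-up placeholders, not data from the study.

```python
import random
from collections import Counter

def center_counts(labels):
    """Count how often each metal centre appears in a list of labels."""
    return Counter(labels)

def random_baselines(all_labels, sample_size, n_trials=3, seed=0):
    """Draw n_trials random samples (without replacement) and count centres in each."""
    rng = random.Random(seed)
    return [Counter(rng.sample(all_labels, sample_size)) for _ in range(n_trials)]

# Hypothetical labels (placeholders only): a uniform corpus and a skewed top-30 list.
corpus_labels = ["Fe"] * 40 + ["Co"] * 40 + ["Mn"] * 40 + ["Pt"] * 40 + ["Cu"] * 40
top_labels = ["Fe"] * 12 + ["Co"] * 10 + ["Mn"] * 5 + ["Pt"] * 1 + ["Cu"] * 2

observed = center_counts(top_labels)
baselines = random_baselines(corpus_labels, sample_size=len(top_labels))
```

If the counts in `observed` are clearly more skewed than those in each random baseline, the enrichment of particular centers among the nearest neighbors is unlikely to be a sampling artifact.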

Besides the metal centers and matrix selection, the coordination environment of the single-atom catalysts also plays a vital role in catalytic ability. To identify the coordinating elements, we analyzed the magnified t-SNE plot of the Na-S SA category, focusing on the most correlated reports (Fig. 2f). The metal-nitrogen-carbon (M-N-C) structure is well known to have been studied extensively in the SA field and shows the highest coverage20,21. However, the most relevant SA centers are consistently coordinated with heteroatoms (such as S and O) rather than adopting the plain M-N-C structure, because introducing heteroatoms has proven to be an efficient way to modulate the electronic structure of the SA centers and further enhance catalytic activity22. The SA structures presented in the most relevant articles are therefore adopted as candidates for the subsequent experiments.

To investigate the diversity in model selection and the adaptability to different application domains, two other advanced language models that performed well in the ablation studies, MatSciBERT fine-tuned with an MLP and Llama3.1-70B-Instruct23, were employed to carry out analogous tasks in Na-S systems and other reaction fields. The pipelines for the catalyst screening tasks using different models are displayed in Fig. S3 and S4. GPT-4o and Llama3.1-70B-Instruct were used directly as both embedding extractors and generators, leveraging retrieval-augmented generation (RAG) to retrieve relevant documents from an external database, while MatSciBERT was fine-tuned with an MLP to boost the validity of the model24,25. As with GPT-4o, the TOP-30-nn papers with the closest cosine distances to the foundational literature in the Na-S battery field, as predicted by MatSciBERT with MLP and Llama3.1-70B-Instruct, were thoroughly analyzed. Mirroring the GPT-4o results, the Fe and Co centers appear more frequently than the others in both models (Fig. S5). Moreover, ORR is the most frequently mentioned topic in the TOP-30-nn abstracts from both MatSciBERT with MLP and Llama3.1-70B-Instruct, indicating the similarity between ORR and the sulfur reduction reaction: both are multi-electron transfer reactions, and oxygen and sulfur are congeners with comparable chemistry (Fig. S6). The previously predicted optimal coordination environments, especially M-N/S and M-N/O, are also frequently observed among the nearest neighbors identified by MatSciBERT with MLP, as highlighted in the detailed t-SNE plots (Fig. S7).
The concordance between the predictions of the various models not only corroborates the reliability of the screening methodology but also reaffirms the adequacy of the previously employed model for our research objectives. Additionally, we extended our catalyst screening approach to other reaction fields such as OER and benzene oxidation reactions (Fig. S8–S10). These results could also serve as guidance for designing single-atom catalysts in other catalytic fields.

Overall, the NLP models, combined with targeted expert analysis, reveal the key elements for designing preferable single-atom catalysts for Na-S batteries with high energy density. The extended work, which involves substituting other models and screening additional chemical reactions, demonstrates the broad scalability of the proposed NLP methodology. Following the guidance of the deep-learning models, heteroatom-doped magnetic SA centers with an adjustable localized electronic structure are identified as qualified SA candidates for the Na-S system.

Constructing the binary descriptor for the efficient single-atom catalysts

Drawing on the predictions of the NLP technique, we could determine the potential magnetic SA candidates from the summarized insights. To further filter these SA candidates and validate the rationality of the NLP outcomes, establishing an appropriate catalytic descriptor becomes crucial. In the sulfur reduction reaction of Na-S chemistry, the solid-state conversion from Na2S2 to Na2S is a critical step that contributes up to 50% of the theoretical capacity and is often the rate-determining step26,27. Lowering the barrier of this conversion is therefore essential to facilitate the transformation. The Gibbs free energy change (ΔG) from Na2S2 to Na2S on different single-atom catalysts is taken as a key parameter of the descriptor and can be calculated by density functional theory (DFT). As mentioned earlier, the sulfur reduction reaction is a multi-step conversion process. If the initial steps are hindered by weak adsorption between the polysulfides and the active sites, the sulfur reduction reaction may not progress smoothly even if the ΔG from Na2S2 to Na2S is low. The d-band center theory of chemisorption and catalysis, pioneered by Hammer and Nørskov, has been widely used, especially for metals with abundant d-band electrons, and the d-band center is considered to correlate substantially with the adsorption capability of active sites28,29,30,31. Although early reports on the d-band center focused on systems with a continuous d-band, such as metals or alloys, recent studies have demonstrated that the d-band center description is equally applicable to single atomic sites, a conclusion supported by experimental evidence and machine learning-based regression analysis32,33,34,35,36,37.

Generally, metal active sites with abundant d-band electrons around the Fermi energy show more stable adsorption, which is essential for restricting polysulfide shuttling and catalyzing the conversion of sulfur species38,39,40. In short, the Gibbs free energy of the capacity-determining step indicates the kinetics and completion rate of the sulfur reduction reaction, while the d-band center is an electronic property of the SA primarily associated with its adsorption ability (Fig. 3a). We therefore establish a binary descriptor that combines the Gibbs free energy with the d-band center to filter preferable single-atom catalysts, ensuring strong adsorption and uninterrupted catalysis throughout the whole conversion process to realize high energy density in Na-S batteries.
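For readers who want to reproduce the electronic half of the descriptor, the d-band center is conventionally taken as the first moment of the d-projected density of states, ε_d = ∫E ρ_d(E) dE / ∫ρ_d(E) dE, with energies referenced to the Fermi level. The sketch below evaluates this for one spin channel. It is an illustration under stated assumptions: the integration window, the sign handling for spin-down data and the Gaussian toy PDOS are not taken from the study, and `d_band_center` is a hypothetical helper name.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def d_band_center(energies, pdos, e_min=None, e_max=None):
    """First moment of a d-projected DOS (energies relative to the Fermi level)."""
    energies = np.asarray(energies, dtype=float)
    # Assumption: spin-down PDOS may be stored with a negative sign, so use magnitudes.
    rho = np.abs(np.asarray(pdos, dtype=float))
    # Optional energy window; by default the whole grid is used.
    mask = np.ones(energies.shape, dtype=bool)
    if e_min is not None:
        mask &= energies >= e_min
    if e_max is not None:
        mask &= energies <= e_max
    e, rho = energies[mask], rho[mask]
    return _trapezoid(e * rho, e) / _trapezoid(rho, e)

# Toy spin-up PDOS: a Gaussian centred at -1.5 eV (illustrative, not DFT output).
energies = np.linspace(-10.0, 5.0, 3001)
toy_pdos_up = np.exp(-0.5 * ((energies + 1.5) / 0.5) ** 2)
center_up = d_band_center(energies, toy_pdos_up)
```

Calling the function separately on the spin-up and spin-down projections gives the two spin-resolved centers; a center closer to the Fermi level is generally associated with stronger adsorption.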

Fig. 3: Construction of specialized descriptor for sulfur reduction reaction in sodium-sulfur (Na-S) batteries.

a Schematic of the descriptor with binary parameters (d-band center and ΔGNa2S2→Na2S). b The ΔG of the sodium sulfides on different SA surfaces. c The density of states (DOS) patterns of SA Fe-N/S, SA Fe-N, SA Co-N/S, SA Co-N and SA Ni-N/S. d Feature importance of ΔGNa2S2→Na2S and other properties from the random forest regressor. e Scatter plot of the single-atom candidates in the as-built descriptor space. The catalysts within the colored ellipses are considered competitive catalysts for Na-S batteries.

Guided by the deep-learning predictions, Fe, Co and Ni, which appear frequently in the TOP-30-nn, are combined with the heteroatoms found in the most similar articles to yield M-N/S, M-N/O and M-N/F coordination structures, which are selected as the preferred SA candidates. Additionally, some traditional coordination structures are also selected to investigate their catalytic behavior in the sulfur reduction reaction. SA Co-N/S, SA Ni-N/S and traditional SA Co-N are chosen as examples. The ΔG values of sodium sulfides on the catalyst surfaces are shown in Fig. 3b. The catalysts exhibit different ΔG values in the sulfur reduction reaction, depending on the metal center and on whether the coordination structure is tuned with the S heteroatom. Specifically, for the step from Na2S2 to Na2S, SA Co-N/S is expected to perform best on account of its lowest energy barrier. The density of states (DOS) of the SA Co-N/S, Ni-N/S, Co-N, Fe-N/S and Fe-N-C structures is displayed in Fig. 3c. Fe and Co in the M-N/S structures have higher spin-up d-band centers than their M-N-C counterparts, indicating stronger adsorption. However, the spin-down d-band center is not appreciably influenced by tuning the coordination environment. It can therefore be concluded that varying the electronic structure of the SA through the coordinating atoms affects the spin-up d-band center (d-band center up) rather than the spin-down one (d-band center down). The Random Forest Regressor further substantiates the significance of the d-band center up as a critical parameter for the electronic properties of the SA in this system41: it outperforms other parameters in feature importance owing to its significant role in the projected DOS integrals and its alignment with d-band theory (Fig. 3d and Fig. S11).
The original ΔG values, the surface adsorption energies of sodium sulfides on the catalysts and the d-band properties of the selected SAs are summarized in Table S1. The optimized model files are provided in Data S1.

Pearson correlation analysis was conducted to further evaluate the binary descriptor, as shown in Fig. S12. The ΔG from Na2S2 to Na2S shows no distinct relationship with the spin-up d-band center, as indicated by the low coefficient of around 0.25 (marked by the red circle in the graph). This suggests that the d-band center up and the ΔG from Na2S2 to Na2S can be regarded as independent variables. Moreover, a strong correlation between the d-band center and the adsorption energy of most sodium polysulfides is confirmed, consistent with its expected association with the adsorption ability of the SA. The scatter plot of all selected SAs in the descriptor space is displayed in Fig. 3e, while the DOS patterns and top views of the adsorption geometries for the SA models are illustrated in Fig. S13–S16. SAs with a higher d-band center and a lower ΔG are regarded as superior catalysts for Na-S batteries. By this criterion, Co-N2O2, Fe-N4-F, Co-N4-F and Co-N2S2 are the optimal single-atom catalysts.
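The two operations in this paragraph, checking that the descriptor axes are only weakly correlated and then selecting candidates with a high d-band center and a low ΔG, can be sketched as below. The thresholds, the helper names `pearson_r` and `screen`, and the candidate values are hypothetical placeholders chosen for illustration; they are not the values in Table S1.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def screen(candidates, d_band_min, dg_max):
    """Keep candidates with d-band centre >= d_band_min and ΔG <= dg_max."""
    return [
        name
        for name, (d_band, dg) in candidates.items()
        if d_band >= d_band_min and dg <= dg_max
    ]

# Placeholder candidates: name -> (spin-up d-band centre in eV, ΔG in eV).
toy_candidates = {
    "cat_A": (-0.8, 0.3),
    "cat_B": (-1.9, 0.2),
    "cat_C": (-0.7, 1.1),
    "cat_D": (-1.0, 0.4),
}

d_values = [v[0] for v in toy_candidates.values()]
dg_values = [v[1] for v in toy_candidates.values()]
r = pearson_r(d_values, dg_values)
selected = screen(toy_candidates, d_band_min=-1.2, dg_max=0.5)
```

A coefficient near zero supports treating the two axes as independent, in which case the rectangular cut in `screen` is a simple stand-in for the elliptical region drawn in the scatter plot.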

As stated above, we constructed the binary descriptor to evaluate the catalytic effect of single-atom catalysts on the sulfur reduction reaction. The descriptor can further be used to filter the SA candidates and verify the rationality of the design guidance obtained from deep learning. The Fe and Co centers with specific coordination structures containing heteroatoms (O, F and S) exhibit an obvious upshift in d-band center and lower energy barriers from Na2S2 to Na2S, making them promising candidates for the sulfur reduction reaction. Nevertheless, not all of these structures can be precisely characterized with the available techniques. For example, synchrotron X-ray absorption spectroscopy cannot distinguish metal-oxygen from metal-nitrogen paths because of their similar scattering amplitudes42, and it also struggles to resolve the coordination environment of higher shells because of noise. This implies that some structures cannot be synthesized controllably. Considering both the potentially superior performance and the feasibility of synthesis and characterization, we chose SA Co-N/S as the preferred experimental catalyst. SA Ni-N/S and SA Co-N were also synthesized to experimentally validate the simulation results.

Structural characterization of SA Co-N/S

The SA Co-N/S, SA Ni-N/S and SA Co-N anchored on a carbon fiber matrix were prepared by electrospinning, followed by high-temperature annealing and etching. X-ray diffraction (XRD) patterns of the carbon matrices loaded with different single-atom catalysts show only broad peaks at 25° and 45° (Fig. S17), corresponding to the (002) and (101) planes of amorphous carbon, respectively. No characteristic peaks of crystalline metallic cobalt or cobalt compounds were detected. Scanning electron microscopy (SEM) images in Fig. S18 show 3D interpenetrated carbon fibers with smooth surfaces. Transmission electron microscopy (TEM) images reveal hollow channels in the carbonized nanofibers (Fig. 4a), and high-resolution TEM images and selected area electron diffraction (SAED) in Fig. S19 further confirm the absence of any detectable crystalline metallic particles. Energy dispersive spectra (EDS) in Fig. S20–S22 demonstrate that the metal and non-metal elements are uniformly distributed. The existence of SA Co was confirmed by aberration-corrected high-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM), in which the bright dots scattered in the carbon matrix (marked by red circles) represent isolated, atomically dispersed Co atoms (Fig. 4b, c). For clarity, a 2D atom-overlapping Gaussian-function image was acquired, highlighting the bright-dot signals in different colors (Fig. S23). The HAADF-STEM images of SA Ni-N/S and SA Co-N in Fig. S24 and S25 exhibit similar atomic metal dispersion. The metal mass content is about 1.8 wt%, as measured by inductively coupled plasma mass spectrometry (ICP-MS) and summarized in Table S2.

Fig. 4: Materials synthesis and characteristics.

a Transmission electron microscopy (TEM) image of SA Co-N/S. b, c Aberration-corrected high-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM) images of SA Co-N/S. d X-ray absorption near-edge structure (XANES) spectra at the Co K-edge of SA Co-N/S, Co foil, CoO, Co3O4, CoPc and CoS, and e their corresponding first-derivative features. f Fourier-transformed k2-weighted χ(k) function of the extended X-ray absorption fine structure (EXAFS) spectra of SA Co-N/S, Co foil, CoPc and CoS, and g their corresponding wavelet transform contour plots. h EXAFS fitting results of SA Co-N/S.

The chemical information of the as-obtained materials was evaluated by X-ray photoelectron spectroscopy (XPS), as shown in Fig. S26, confirming the direct bonding between the metal centers and nitrogen/sulfur atoms. The X-ray absorption near-edge structure (XANES) spectrum of SA Co-N/S was compared with those of the Co foil, CoO, Co3O4, cobalt(II) phthalocyanine (CoPc) and CoS references, as depicted in Fig. 4d. The first-derivative XANES in Fig. 4e indicates that the highest peak of SA Co-N/S lies close to that of Co3O4, suggesting a similar oxidation state of Co43. To obtain a more accurate picture of the coordination environment of SA Co-N/S, Fourier-transformed (FT) K-edge extended X-ray absorption fine structure (EXAFS) curves were examined (Fig. 4f); the k2-weighted k-space signals are provided in Fig. S27. For SA Co-N/S, two main peaks emerge around 1.25 and 1.78 Å, indicating two sets of paths in the first shell of the Co centers. The R-space spectrum of SA Co-N/S shows no obvious signal at 2.16 Å, the dominant feature of the Co-Co path in Co foil, indicating the isolated dispersion of Co atoms in the sample. Wavelet transform (WT) contour plots further help to identify the path patterns (Fig. 4g and Fig. S28). SA Co-N/S shows two regions of maximum intensity, the lower one resembling the Co-N path in the CoPc standard and the other resembling the Co-S path in the CoS sample. Moreover, the Co L-edge spectra in Fig. S29 reveal a noticeable shift of the Co L3 edge to lower energy in SA Co-N/S compared with SA Co-N, consistent with the behavior observed in CoS and CoPc and indicative of stable Co-S bonding in SA Co-N/S.

EXAFS fitting was conducted to determine the local structure parameters of the Co and Ni species. The active center of SA Co-N/S exhibits a Co-N2S2 coordination structure with two sets of paths, Co-N and Co-S (Fig. 4h). The absence of Co-Co paths further affirms the atomic dispersion of Co, in contrast to standard samples such as Co3O4 and CoS, for which the Co-Co path is an indispensable fitting parameter (Fig. S30 and Fig. S31). Additionally, the fitting results for Co foil, Ni foil, SA Co-N and SA Ni-N/S are presented in Fig. S32 and Fig. S33, revealing Co-N4 and Ni-N2S2 coordination structures for SA Co-N and SA Ni-N/S, respectively. The detailed fitting parameters are summarized in Table S3 and S4.

Electrochemical performance of the Na-S batteries with SA Co-N/S

To examine the electrocatalytic activity of the as-obtained single-atom catalysts, we combined the carbon fiber hosts loaded with the different SAs with sulfur via melt-impregnation to serve as cathodes in Na-S batteries. The sulfur loadings of SA Co-N/S@S, SA Ni-N/S@S and SA Co-N@S were measured by an elemental analyzer and are listed in Table S5. According to the Brunauer–Emmett–Teller (BET) results in Fig. S34, SA Co-N/S has a large specific surface area of 1086.4 m2 g−1 and abundant micropores (0.5–1.5 nm), but the specific surface area decreases markedly to 7.53 m2 g−1 for SA Co-N/S@S owing to S molecules filling the micropores44.

The galvanostatic discharge-charge curves at 0.2 A g−1 are displayed in Fig. 5a. Compared with SA Ni-N/S@S and SA Co-N@S (voltage hysteresis: ~560 mV), SA Co-N/S@S shows a reduced polarization of 474 mV, indicating a more facile reaction pathway owing to its effective catalytic activity. The capacity of the cathode derives primarily from sulfur, and the contribution from SA Co-N/S is almost negligible, as illustrated in Fig. S35. Consequently, the sulfur conversion can be plotted against voltage on the basis of the specific mass of sulfur, as shown in Fig. S36. Notably, the sulfur conversion of SA Co-N/S@S exceeds 93% at 0.2 A g−1, significantly outperforming the other two samples (85% for SA Ni-N/S@S and 74% for SA Co-N@S). Cyclic voltammetry (CV) was subsequently performed between 0.8 and 3.0 V (versus Na/Na+) to examine the charge/discharge characteristics of the electrodes (Fig. S37). The noticeable voltage hysteresis in the initial cycle is attributable to the formation of the solid electrolyte interphase (SEI) layer, as further supported by electrochemical impedance spectra (EIS), which show a significant decrease in impedance once a stable SEI layer has formed after the initial cycle (Fig. S38). In subsequent cycles, the reversible redox behavior shows the characteristic reduction of elemental sulfur to long-chain polysulfides (such as Na2S8 and Na2S4) and subsequently to short-chain polysulfides (e.g., Na2S2 and Na2S). These transformations are evidenced by the cathodic peaks at 1.63 and 1.18 V for SA Co-N/S@S, while the oxidation peak around 2 V is ascribed to the desodiation of polysulfides45. Consistent with the galvanostatic discharge-charge curves, SA Co-N/S@S presents the smallest voltage hysteresis among the three cathodes in the CV plots, indicative of improved sulfur transformation kinetics.
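The sulfur conversion figures can be related to specific capacity through Faraday's law, Q = nF / (3.6 M) in mAh g−1. The sketch below is a worked check under one assumption: that sulfur conversion is defined as delivered specific capacity divided by the theoretical capacity of sulfur (1675 mAh g−1, as quoted in this article). The function names are illustrative.

```python
FARADAY = 96485.332  # Faraday constant, C mol^-1

def theoretical_capacity_mAh_per_g(n_electrons, molar_mass_g_per_mol):
    """Theoretical specific capacity from Faraday's law (1 mAh = 3.6 C)."""
    return n_electrons * FARADAY / (3.6 * molar_mass_g_per_mol)

def sulfur_utilization(capacity_mAh_per_g, theoretical_mAh_per_g=1675.0):
    """Fraction of the theoretical sulfur capacity that is delivered (assumed definition)."""
    return capacity_mAh_per_g / theoretical_mAh_per_g

# Full reduction of S to Na2S transfers two electrons per sulfur atom.
q_sulfur = theoretical_capacity_mAh_per_g(2, 32.06)   # about 1672 mAh g^-1
q_sulfur_rounded = theoretical_capacity_mAh_per_g(2, 32.0)  # about 1675 mAh g^-1

# Initial capacity reported for SA Co-N/S@S at 0.2 A g^-1 in this article.
utilization = sulfur_utilization(1573.0)  # about 0.939
```

With the molar mass of sulfur rounded to 32 g mol−1 the formula reproduces the quoted 1675 mAh g−1, and the reported initial capacity of 1573 mAh g−1 then corresponds to roughly 94% utilization, in line with the conversion exceeding 93% stated above.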

Fig. 5: Electrochemical performance of carbon@sulfur cathode loaded with multiple kinds of single-atom catalysts under 30 °C.

a The galvanostatic discharge-charge curves of SA Co-N/S, SA Co-N and SA Ni-N/S. b Cycling performance at 0.2 A g−1 with 50% real sulfur loading. c Rate performance at varying current rates (0.2, 0.5, 1, 2, 3, 5 A g−1) with 50% real sulfur loading. d Long-term cycling performance of SA Co-N/S at 1 A g−1 with 67% real sulfur loading. e Cycling performance of the SA Co-N/S@S pouch cell with a high mass loading of 4.5 mg cm−2 and a negative/positive (N/P) ratio of 3.4:1. All pouch cells have 50% real sulfur loading. f Rate performance of the sodium-sulfur (Na-S) pouch cell based on the SA Co-N/S@S cathode (0.1, 0.2, 0.3, 0.4, 0.5, 1.0 A g−1) and g cycling performance of a pouch cell (230 mAh) with a high areal mass loading of 6.67 mg cm−2 and a low N/P ratio of 2.13.

Precipitation experiments under constant-voltage discharge were then carried out to elucidate the catalytic effects on the transformation from polysulfides to Na2S (Fig. S39). SA Co-N/S not only displays a higher Na2S precipitation capacity of 132.7 mAh g−1 (versus 55.1 mAh g−1 for SA Ni-N/S and 40.8 mAh g−1 for SA Co-N), but also exhibits the fastest reaction kinetics, as evidenced by the shortest peak emergence time of 188 s (versus 334 s for SA Ni-N/S and 468 s for SA Co-N). The EIS results in Fig. S40 demonstrate that SA Co-N/S@S has the lowest impedance and the highest Na+ diffusion coefficient, owing to the well-dispersed sulfur molecules assisted by the robust catalyst and the beneficial doping of nitrogen and sulfur in the carbon matrix. Polysulfide shuttling was also assessed by measuring the shuttle current of Na-S cells with the various cathodes, as displayed in Fig. S41. SA Co-N/S mitigates the shuttle effect most effectively, as evidenced by the lowest shuttle current throughout the entire test. Photographs of the Na anodes after the shuttle current measurements provide direct evidence that negligible sulfides appear at the anode of the SA Co-N/S@S-based cell, in contrast to the noticeable yellow deposits covering the Na anode of the SA Co-N@S-based cell (Fig. S42). In-situ ultraviolet-visible (UV-Vis) spectroscopy was used to monitor possible dissolution of polysulfides from SA Co-N/S@S upon cycling. As shown in Fig. S43, no characteristic polysulfide peaks are visible in the spectra throughout the entire discharging and charging processes, demonstrating that SA Co-N/S stably blocks the release of sodium polysulfides in the initial cycle.
The SEM-EDS analysis of Na anodes paired with SA Co-N/S@S cathodes reveals the minimal sulfide accumulation on the anodic surface following both the 1st and 100th cycles (Fig. S44–S46), further demonstrating the effective suppression of the shuttle effect.

The introduction of SA Co-N/S could augment the conversion of sulfur to sodium sulfide, resulting in an initial specific capacity of as high as 1573 mAh g−1 under 0.2 A g−1, very close to the theoretical value of S (1675 mAh g−1) (Fig. 5b). After 100 cycles, the specific discharge capacity is around 1200 mAh g−1. In contrast, SA Ni-N/S@S and SA Co-N@S exhibit lower specific capacities and rapid capacity fading. The rate capabilities of SA Co-N/S@S shown in Fig. 5c and Fig. S47 also demonstrate the considerable reversible specific capacities of 1571, 1456, 1344, 1068, 776 and 467 mAh g−1 at 0.2, 0.5, 1.0, 2.0, 3.0 and 5.0 A g−1, respectively. With the current density reverting to 0.2 A g−1, the capacity recovers to 1519 mAh g−1, evidencing the strong adaption under extreme charging/discharging conditions. The areal capacity under different testing conditions is displayed in Fig. S48. Regarding the long-term cycling performance, it is observed that capacity degradation occurs in the initial cycles on account of the formation of cathode electrolyte interface (CEI) layer, which is a common phenomenon in sulfur cathodes45,46. Despite this, the SA Co-N/S@S cathode exhibits a high reversible capacity, maintaining 921 mAh g−1 after 450 cycles at 2 A g−1 (Fig. S49). The specific capacity of these cells significantly surpassed that reported in literature using the similar charge-discharge protocols over the same number of cycles (Table S6). The SA Co-N/S@S also exhibits superior electrochemical performance with high real mass loading of sulfur (the sulfur mass loading is calculated based on the whole electrode, rather than the carbon/sulfur composite). The SA Co-N/S@S with a real mass loading of 67% attained an impressive specific capacity of 807 mAh gcathode−1 for the 100th cycle at 0.2 A g−1 (Fig. S50), and maintained 596 mAh gcathode−1 at 300th cycle at 2 A g−1 after 10 initial activation cycles under 0.2 A g−1(Fig. 5d and Fig. S51). 
The improvement in capacity and cycling stability is attributed to the reliable absorption behavior and catalytic conversion ability for the capacity-determining step of SA Co-N/S. A comparative analysis of Na-S battery performances from this work and recent studies further corroborates the advancements in energy storage performance presented here, highlighting the significance of this research in the Na-S battery field (Table S7). More sustainable metal center Fe predicted by NLP-based method with N/F coordinating environment (noted as SA Fe-N/F) was also synthesized under the same electrospinning techniques and evaluated electro-catalyzing performance with the real sulfur contents of 50% and 67%, as shown in Fig. S52-55 and Table S8. The SA Fe-N/F@S with a 50% real sulfur loading demonstrates satisfactory performance of 1309.2 mAh gsulfur−1 after 100 cycles at 0.2 A g−1. Furthermore, with a 67% real sulfur loading, the SA Fe-N/F@S displays 646.7 mAh gcathode−1 at the same current density after 100 cycles, demonstrating the catalytic effectiveness even with the high sulfur loading (Fig. S56).
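As a quick check on the capacities quoted above for SA Co-N/S@S at 0.2 A g−1, the sulfur utilization and 100-cycle retention follow directly from the reported numbers. The sketch below uses only values stated in the text. The function names are illustrative, and the 100-cycle capacity is taken as the approximate value of 1200 mAh g−1.

```python
# Values quoted in the text for SA Co-N/S@S at 0.2 A g-1 (mAh per gram of sulfur).
THEORETICAL_S = 1675.0   # theoretical capacity of sulfur
INITIAL = 1573.0         # initial discharge capacity
AFTER_100 = 1200.0       # approximate capacity after 100 cycles ("around 1200")


def utilization(capacity, theoretical=THEORETICAL_S):
    """Fraction of the theoretical sulfur capacity that is actually delivered."""
    return capacity / theoretical


def retention(later, initial):
    """Capacity retained relative to the initial cycle."""
    return later / initial


print(f"Sulfur utilization in the first cycle: {utilization(INITIAL):.1%}")
print(f"Capacity retention after 100 cycles:   {retention(AFTER_100, INITIAL):.1%}")
```

The same two helpers can be applied to the other capacities reported in this section, provided the values compared share the same mass basis (sulfur or whole cathode).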

Post-mortem SEM, TEM and XPS analyses of the electrodes after different numbers of cycles were carried out to reveal the structural and chemical variation, as well as the failure mechanism of the cells (Fig. S57–S61). All samples for post-mortem analyses were transferred using a sample holder to prevent exposure to air. Side reactions occurring on both the anodic and cathodic surfaces lead to the consumption of active materials and to voltage hysteresis, which can be considered one of the primary causes of electrode failure. Additionally, the disconnection of active sulfur from the carbon conductor, resulting from volumetric expansion during cycling, significantly accelerates capacity decay. This disconnection interrupts the conductive pathway, causing a loss of electrochemical activity and diminished battery performance over time.

Furthermore, to validate the superior catalytic activity of SA Co-N/S, the electrochemical performance of N/S co-doped carbon matrix without metallic component (referred to as N/S doped CNF@S) and other SA centers with N/S coordinating structure (e.g., SA V-N/S@S and SA Ti-N/S@S) was also evaluated. The characterizations and electrochemical performance of these materials are presented in Fig. S62–S65. Overall, the electrochemical performance of SA Co-N/S@S significantly outperforms that of SA Ti(V) catalysts with N/S coordinating structure and N/S doped CNF@S, exhibiting higher capacities, superior rate performance, and longer cycling life.

Constructing cathodes with high areal loading to assemble pouch cells is critical for assessing the practical viability of the Na-S batteries. Initially, coin cells with a high mass loading of 4.2 mg cm−2 were assembled using a limited electrolyte volume of 40 μL. As exhibited in Fig. S66, after activation at a low current density of 0.1 A g−1 for 5 cycles, the cell operated at 0.5 A g−1 retains a high areal capacity of 2.54 mAh cm−2 after 70 cycles with a Coulombic efficiency (CE) close to 100%. Pouch cells were fabricated using the free-standing carbon@sulfur framework with SA Co-N/S as the cathode and a thin sodium metal foil as the anode. As displayed in Fig. 5f, the pouch cell with a cathode areal loading mass of ~3.0 mg cm−2 can operate at various current densities, attaining stable cycling capacities of 1159, 978.7, 915, 855, 803 and 570 mAh g−1 under 0.1, 0.2, 0.3, 0.4, 0.5 and 1.0 A g−1, respectively.

Furthermore, a pouch cell with high areal mass loading (4.5 mg cm-2), low negative/positive (N/P) ratio (3.4:1), and lean electrolyte (8 μL mg −1) was tested under 2 mA (Fig. 5e), which exhibits an initial gravimetric specific energy of 230.9 Wh kg−1 (based on the total weight of the cathode and anode) and retains 173.9 Wh kg−1 after 10 cycles. More impressively, the sulfur loading could be further increased to 6.67 mg cm-2 and the pouch cell with stricter N/P ratio (2.1:1) and less electrolyte usage (3.2 μL mg −1) attains a practical capacity of 173 mAh in the first cycle (Fig. S67), equivalent to energy densities of 333.2 and 118.3 Wh kg−1 based on the anode/cathode and the entire cell components, respectively. The typical galvanostatic discharge-charge curves are displayed in Fig. 5g, and more detailed data about the pouch cell are collected in Table S9. Our Na-S pouch cell with SA Co-N/S catalysts represents a substantial advance in terms of areal capacity and the gravimetric energy density compared with the recently reported Na-S batteries47,48,49,50, as well as other emerging battery systems such as sodium metal/ion batteries51,52,53,54,55,56,57,58, and even anode-less/free sodium metal batteries59,60 (Fig. S68 and Table S10). Moreover, the pouch cell demonstrates an advancement in total capacity among the sodium-metal anode-based batteries (Fig. S69), highlighting the tremendous potential for practical application of the Na-S batteries in the presence of the effective catalysts. More strategies can be employed to further enhance the energy density and safety of Na-S batteries, including increasing the mass loading of sulfur, reducing the N/P ratio, developing novel electrolyte and optimizing the Na anodes.

Dynamic evolution of SA Co-N/S during sulfur reduction reaction in the Na-S batteries

The electronic evolution of SA Co sites in catalyzing the sulfur reduction reaction was investigated by in-situ XAS measurement. The Co L-edge spectra of SA Co-N/S and SA Co-N/S@S (Fig. 6a) show a change in the L3/L2 intensity ratio from 3.03 to 3.19 after loading sulfur, implying a chemical interaction with sulfur molecules and a decreased electron occupancy in the d orbitals of the Co sites. However, the introduction of sulfur has minimal influence on the electronic structure of the C and N elements (Fig. S70). Therefore, Co sites are the most favorable active sites for the sulfur reduction reaction, mainly due to their strong electronic interactions with sulfur molecules. Meanwhile, crystal orbital Hamilton population (COHP) analysis, an effective simulation method, was used to evaluate the stability of chemical bonding: the larger the integral area under the projected COHP (pCOHP), the stronger the anchoring effect between S and Co atoms61. Compared with SA Ni-N/S, SA Co-N/S demonstrates reliable absorption of sulfur molecules, which is a prerequisite for catalyzing the sulfur reduction reaction (Fig. 6b and Fig. S71).

Fig. 6: In-situ X-ray absorption spectroscopy (XAS) measurement and theoretical simulation of sulfur reduction reaction process.

a Soft XAS spectra of the Co L-edge for SA Co-N/S with/without sulfur loading. b Crystal orbital Hamilton population (COHP) patterns of Co-S bonding interactions between the S molecule and the Co atom in the sulfur-absorbed SA Co-N/S network. c Schematic of the coin cell for in-situ XAS studies to observe dynamic variations of active sites during the sulfur reduction reaction. The synchrotron radiation X-ray was directed onto the window hole on the cathode side of the coin cell, which was covered by Kapton, and the fluorescence signal was collected simultaneously by a fluorescence detector when the cell was discharged to specific stages. d In-situ Co K-edge X-ray absorption near-edge structure (XANES) spectra of SA Co-N/S under different discharging stages and (e) the associated evolution of the rising-edge energy. f The frontier orbitals of Na2S4, Na2S2 and Na2S molecules. g The Hirshfeld charge of Co sites of SA Co-N/S absorbed with various sodium sulfides representing distinct reaction stages.

The main focus of our investigation is to understand the mechanism for the SA Co-N/S catalyzing sulfur reduction reaction, so we utilized the in-situ XAS examination to identify the electronic structure and coordinating structure of SA Co-N/S under different electric potentials corresponding to different transformation stages of sodium sulfides (the testing status in galvanostatic discharge process indicated by stars in Fig. S72), as schematically exhibited in Fig. 6c. From the in-situ XANES spectra of the Co K-edge upon discharging at 0.1 C (Fig. 6d), the transition feature at the pre-edge (corresponding to the 1s-3d electron transition) around 7709.5 eV is almost invariable, manifesting a stable geometry of SA Co-N/S during discharging62. However, there is a slight energy shift in the rising edge region, suggesting the variation in electron density for the Co sites, as exhibited in the magnified XANES spectra in Fig. S73 and the corresponding energy evolution at the same normalized intensity in Fig. 6e. The energy fluctuates with successive patterns of ‘descent-ascent-descent-ascent’, thus is divided into four stages for further analysis.

To understand the cause of the complex variation at different stages, Hirshfeld charge, a DFT measurement for partial charge distribution on an atom within a molecular system63, was employed to simulate the charge number for the Co centers with different absorption statuses in sulfur reduction reaction (Fig. 6g). From SA Co-N/S in the pristine status to the absorbed sulfur molecules, the charge number of Co sites decreases. Although Co sites and sulfur have been confirmed to have chemical interactions after loading sulfur in Fig. 6a, we propose that the application of current can strengthen the absorption state between them, leading to down shift of absorption edge in XANES spectra from open circuit potential (OCP) to 1.95 V in stage I. Notably, stage I contributes little capacity, as illustrated in the galvanostatic discharge curve. In stage II, which corresponds to the first major reduction phase in CV plot (Fig. S37), the shift of the edge lines towards high-energy region suggests an increase in the valence state of Co sites, reflecting the continuous charge exchange with polysulfides from Na2S8 to Na2S4 along with the sulfur reduction reaction progress.

The second energy downshift occurs from 1.45 to 1.15 V in stage III, a significant capacity-contributing step as indicated by the galvanostatic charge-discharge process. It is congruent with the charge increment from absorbing polysulfides to short-chain sulfides (i.e., from Na2S4 to Na2S2), as exhibited in Fig. 6e. To explain this abnormal charge transfer in the sulfur reduction reaction, the molecular orbitals of Na2S4, Na2S2 and Na2S are provided in Fig. S74–S76, and their corresponding frontier orbitals are displayed in Fig. 6f. From Na2S4 to Na2S2, the lowest unoccupied molecular orbital (LUMO) shifts slightly downward (0.1 eV), while the highest occupied molecular orbital (HOMO) varies by a much larger degree of 1.2 eV. This pronounced upward shift of the HOMO strengthens the Lewis basicity of Na2S2, making it more inclined to donate electrons to an electron acceptor. Therefore, the charge of Co sites decreases in stage III (Fig. 6g), reflecting the downward shift of energy (Fig. 6e) due to the acceptance of more electrons from Na2S2 molecules. Compared with Na2S2, the Na2S molecule has both lower LUMO and HOMO levels, implying a higher electronegativity. Consequently, in stage IV, on discharging to 0.8 V, the short-chain sulfides fully convert to Na2S, causing the edge energy of Co sites to rise to 7716.46 eV, which is slightly higher than that of the pristine state. More detailed analysis of the charge transfer for different sulfides is provided in Fig. S77.

Another significant factor that restores the Co sites to nearly their initial state, rather than producing the obvious upward shift from the initial to final states simulated in Fig. 6g, is the detachment of Na2S from the active sites and its subsequent crystallization. This phenomenon is supported by the in-situ EXAFS spectra in R space acquired simultaneously at different potentials (Fig. S78). The apparent signals over 1.7 Å, which are assigned to the Co-S path, also display a wave-like variation of ‘ascent-descent-ascent-descent’, and the similar signal intensity observed at OCP and 0.8 V corroborates the hypothesis that Na2S detaches after full discharge, thereby realizing a reversible catalytic process and preventing the passivation that would result from excessive absorption.

The Hirshfeld charge of SA Ni-N/S and SA Co-N absorbed with different sodium sulfides was also simulated (Fig. S79). For SA Ni-N/S, the abnormal charge transfer from Na2S4 to Na2S2 does not occur in stage III primarily because its Fermi level (−2.34 eV) is higher than that of SA Co-N/S (−2.49 eV). This higher Fermi level results in a larger energy gap relative to the HOMO level of Na2S2, leading to a weaker interaction with Na2S2 and making it more difficult to catalyze the sulfur reduction reaction. The lowest absorption energy between SA Ni-N/S and Na2S2 gives further support for this argument (Fig. S80). For SA Co-N, despite a variation trend similar to that of SA Co-N/S at stage III, the Hirshfeld charge remains constant before and after loading sulfur molecules, implying poor interactions with sulfur molecules, as evidenced by the low pCOHP (Fig. S71) and low absorption energy with sulfur (Fig. S81). Overall, the in-situ XANES tests and Hirshfeld charge simulations reveal the charge evolution of these representative catalysts in the sulfur reduction reaction. We also believe that, after more thorough investigation, the charge evolution process of different catalysts in the reaction can serve as a criterion for assessing catalytic performance.

Discussion

In this work, we have developed an approach that combines deep-learning techniques with descriptor filtering to accelerate the selection of preferable single-atom catalysts for Na-S batteries with high energy density. In particular, our study compares the embeddings of SA in the Na-S batteries with those in other widely reported fields by using well-trained NLP models. By ranking and analyzing these embeddings, valuable information on the preferable elements, analogous fields, and design strategies for the SA in the Na-S field was extracted. Advanced LLMs were also employed for the screening tasks in both the Na-S field and other reaction systems, and likewise yielded useful information.

Then, a binary descriptor was established based on the DFT simulations and d-band theory to verify the guidance acquired from the deep-learning results and to further filter the competitive catalyst categories. Upon careful screening, SA Co-N/S, with coordination atoms of nitrogen and sulfur, was determined to be a competent catalyst for the sulfur reduction reaction process in the Na-S system. The sulfur cathode loaded with SA Co-N/S demonstrated a high specific capacity (1570 mAh g−1 at 0.2 A g−1), satisfactory rate performance, and long cycling stability. Moreover, we also achieved a high specific energy of 333.2 Wh kg−1 (calculated based on the mass of the electrodes) for the Na-S pouch cell. The in-situ XAS measurement and electronic simulations provided deep insight into the charge transfer of SA Co-N/S in the sulfur reduction reaction process. Our work confirmed the feasibility and effectiveness of designing preferable catalysts with text-mining techniques driven by deep-learning algorithms and a reaction-system descriptor. This investigation also deepens our understanding of the catalytic mechanism in the Na-S batteries. We believe that it will accelerate the development of high-performance catalysts in other emerging catalytic reactions and battery systems for next-generation sustainable energy storage devices.

Methods

Data collection and preprocessing

In scientific research, a vast amount of information is available in various formats, such as books, journals, and electronic versions. The initial step in building a corpus for machine learning models is to transform this diverse array of texts into a single digital format that can be easily utilized. In our study, we employed the ELSEVIER Scopus API to collect data. Non-English and review articles were removed from the dataset. Each abstract was manually labeled with its article topic for the fine-tuning process during model training. In total, 13,544 abstracts were used to train the ML model.

We employed our expert knowledge of the research area to select the relevant categories that aligned with our research objectives. Correspondingly, we assigned each paper to a specific category label based on its domain, thus enabling us to create a labeled corpus. In the subsequent fine-tuning phase, these category labels were used as the supervision signal for training the MLP model.

Fine-tuning embeddings

We utilized the MatSciBERT model19, a materials-aware language model trained on a large corpus of peer-reviewed materials science publications, to extract meaningful information from chemical catalyst-related research papers. MatSciBERT is capable of understanding materials science-specific notation and jargon, which makes it a well-suited tool for information extraction in this field.

To input an abstract into MatSciBERT, we began with encoding it into tokens using the model’s tokenizer. The tokenizer is responsible for splitting the text into tokens and encoding them into numerical format that the model can understand. Once the tokens are generated, they are then fed into the model’s encoder, which is responsible for producing a contextualized embedding that captures the essence of the text. Specifically, we set the parameter max_length=128. We then input the tokenized data into MatSciBERT, which produces embeddings for each word in the original abstract. To construct abstract-level embeddings, we stack the output of the last four layers of BERT and sum them element-wise. Formally, the formula is as follows:

$${\mbox{Abstract\_level\_embeddings}}=\sum \left({{{{\bf{H}}}}}_{{{{\bf{L}}}}-{{{\bf{3}}}}},{{{{\bf{H}}}}}_{{{{\bf{L}}}}-{{{\bf{2}}}}},{{{{\bf{H}}}}}_{{{{\bf{L}}}}-{{{\bf{1}}}}},{{{{\bf{H}}}}}_{{{{\bf{L}}}}}\right)$$
(1)
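The layer summation in Eq. (1) can be sketched with plain NumPy arrays standing in for the encoder outputs. This is a minimal illustration, not the authors' code: the layer count (13) and hidden size (768) are assumptions chosen to resemble a BERT-base encoder, the random arrays replace real MatSciBERT hidden states, and the mean over tokens used to obtain a single vector per abstract is an assumption, since the text does not state which pooling was applied.

```python
import numpy as np


def sum_last_four_layers(hidden_states):
    """Element-wise sum of the last four encoder layers, as in Eq. (1).

    hidden_states: sequence of arrays, one per layer, each of shape
    (num_tokens, hidden_size), ordered from the first to the last layer.
    """
    last_four = np.stack(hidden_states[-4:], axis=0)  # (4, tokens, hidden)
    return last_four.sum(axis=0)                      # (tokens, hidden)


def pool_tokens(token_embeddings):
    """ASSUMPTION: average over tokens to get one vector per abstract.

    The pooling step is not specified in the text; mean pooling is used
    here only to make the sketch complete.
    """
    return token_embeddings.mean(axis=0)


# Toy stand-in for the encoder output (hypothetical sizes):
# 13 layers, 128 tokens (max_length=128), hidden size 768.
rng = np.random.default_rng(0)
layers = [rng.normal(size=(128, 768)) for _ in range(13)]

summed = sum_last_four_layers(layers)
abstract_vector = pool_tokens(summed)
print(summed.shape, abstract_vector.shape)
```

Summing rather than concatenating the four layers keeps the embedding dimension equal to the encoder's hidden size, which keeps the input to the downstream classifier compact.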

While MatSciBERT provides a powerful tool for generating contextualized embeddings of materials science texts, the embeddings of different categories may still be clustered together on the embedding space, making it difficult to distinguish between them. In order to address this issue, we leverage the category labels that were assigned during data crawling and utilize them to perform fine-tuning of the embeddings output by MatSciBERT. This fine-tuning process allows the model to more accurately distinguish between the different categories, leading to improved performance in the subsequent analysis task.

We employed a 4-layer multilayer perceptron (MLP) model to predict the abstract categories of chemical catalyst-related research papers. The computational process of layer \(l\) in MLP is described below:

$${{{{\bf{H}}}}}^{{{{\bf{l}}}}}={{{\rm{\phi }}}}\left(W{{{{\bf{H}}}}}^{{{{\bf{l}}}}-{{{\bf{1}}}}}+b\right)$$
(2)

where \(W\) is the weight matrix of layer \(l\), \(b\) is a bias vector, and \({{{\rm{\phi }}}}\) is an activation function such as ReLU.

The input to the MLP model is the abstract-level embedding generated by MatSciBERT, while the output is the multicategory prediction of abstract categories. We set the hidden layer dimension to 128. To ensure that our model is robust and generalizes well to unseen data, we split the crawled dataset using stratified sampling into a training set and a test set, with a ratio of 7:3 for each category of labels. We trained our MLP model using the Adam optimizer and set the learning rate to 0.01. Once the MLP training was completed, we chose the output of the third hidden layer as fine-tuned embeddings.
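The forward pass of Eq. (2), and the extraction of the third hidden layer as the fine-tuned embedding, can be sketched as follows. This is an untrained illustration under stated assumptions: the 768-dimensional input, the five output categories and the random weights are hypothetical, and "4-layer" is read here as three hidden layers of 128 units followed by one output layer. Training with Adam at a learning rate of 0.01 and the stratified 7:3 split are not reproduced.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def init_mlp(sizes, seed=0):
    """Create one (W, b) pair for each pair of consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.05, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Apply H^l = phi(W H^(l-1) + b) to every hidden layer (Eq. 2).

    Returns the class logits and the list of hidden-layer activations.
    """
    hidden = []
    h = x
    for W, b in params[:-1]:
        h = relu(W @ h + b)
        hidden.append(h)
    W_out, b_out = params[-1]
    logits = W_out @ h + b_out  # output layer, no activation applied here
    return logits, hidden


# ASSUMPTIONS: 768-d input embedding, three hidden layers of 128 units,
# and 5 abstract categories (the number of categories is hypothetical).
params = init_mlp([768, 128, 128, 128, 5])
x = np.random.default_rng(1).normal(size=768)

logits, hidden = forward(params, x)
fine_tuned_embedding = hidden[2]  # output of the third hidden layer
print(logits.shape, fine_tuned_embedding.shape)
```

Taking an intermediate activation rather than the logits keeps a 128-dimensional representation that has been shaped by the category labels but is not yet collapsed to one score per class.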

Usage of embeddings

To gain deeper insights into the fine-tuned embeddings, we conducted various analyses to visualize and explore their underlying patterns. To visualize the embeddings in a more interpretable manner, we employed the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm to reduce the high-dimensional embeddings to two dimensions.

t-SNE is a commonly used dimensionality reduction algorithm that maps high-dimensional data to a low-dimensional space for visual presentation while preserving the similarity relationships in the original data as much as possible. The core formulas of the algorithm are as follows:

$${p}_{j| i}=\frac{\exp \left(-{\left\Vert {x}_{i}-{x}_{j}\right\Vert }^{2}/2{\sigma }_{i}^{2}\right)}{{\sum}_{k\ne i}\exp \left(-{\left\Vert {x}_{i}-{x}_{k}\right\Vert }^{2}/2{\sigma }_{i}^{2}\right)},{q}_{{ij}}=\frac{{\left(1+{\left\Vert {y}_{i}-{y}_{j}\right\Vert }^{2}\right)}^{-1}}{{\sum}_{k\ne l}{\left(1+{\left\Vert {y}_{k}-{y}_{l}\right\Vert }^{2}\right)}^{-1}}$$
(3)

where \(p_{j|i}\) denotes the similarity between point \(i\) and point \(j\) in the original high-dimensional space, \(x_i\) and \(x_j\) denote their high-dimensional coordinates, \(\sigma_i\) is the bandwidth parameter of the Gaussian kernel centered on point \(i\), \(q_{ij}\) denotes the similarity between point \(i\) and point \(j\) after mapping, \(y_i\) and \(y_j\) denote the coordinates of point \(i\) and point \(j\) in the low-dimensional space, and \(k\) and \(l\) are summation indexes running over the other points. The goal of t-SNE is to find the optimal low-dimensional coordinates by minimizing the Kullback-Leibler (KL) divergence between \(p_{j|i}\) and \(q_{ij}\) so that similar points are closer together in the low-dimensional space. The resulting visualizations are shown in Fig. 2 and demonstrate a clear separation between different categories of abstracts, indicating the effectiveness of the fine-tuning process.
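The two similarity definitions in Eq. (3) and the KL objective can be written out directly in NumPy. This sketch only evaluates the quantities for fixed coordinates and does not perform the gradient-based optimization of a full t-SNE implementation. The toy data, the unit bandwidths and the function names are assumptions, and the symmetrization of \(p_{j|i}\) into a joint distribution is the standard t-SNE step, which the text does not spell out.

```python
import numpy as np


def conditional_p(X, sigma):
    """p_{j|i} from Eq. (3); row i holds p_{j|i} for all j, with p_{i|i} = 0.

    X: (n, d) high-dimensional points; sigma: (n,) per-point bandwidths.
    """
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    logits = -sq / (2.0 * sigma[:, None] ** 2)
    np.fill_diagonal(logits, -np.inf)            # exclude k = i from the sum
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)


def joint_q(Y):
    """q_{ij} from Eq. (3), using the Student-t kernel on low-dim points Y."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    inv = 1.0 / (1.0 + sq)
    np.fill_diagonal(inv, 0.0)                   # exclude k = l from the sum
    return inv / inv.sum()


def kl_divergence(P_cond, Q):
    """KL(P || Q) after the standard symmetrization p_ij = (p_j|i + p_i|j) / 2n."""
    n = P_cond.shape[0]
    P = (P_cond + P_cond.T) / (2.0 * n)
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))


# Toy data (hypothetical): 6 points in 10 dimensions mapped to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 10))
Y = rng.normal(size=(6, 2))

P_cond = conditional_p(X, sigma=np.ones(6))
Q = joint_q(Y)
kl = kl_divergence(P_cond, Q)
print(f"KL divergence for random low-dimensional coordinates: {kl:.3f}")
```

In a full implementation each \(\sigma_i\) is tuned to match a user-chosen perplexity and the coordinates in `Y` are updated by gradient descent on this KL value.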

Moreover, to quantify the similarity between the abstract embeddings, we calculated their cosine distance and ranked them accordingly. This allowed us to identify the Top-K similar abstracts, which we further analyzed to gain a more nuanced understanding of the relationships between different research topics. Overall, these analyses provided valuable insights into the fine-tuned embeddings, enabling us to better understand the semantic relationships between different abstracts and their underlying research topics.
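The ranking step described above can be illustrated with a small cosine-distance search. The four two-dimensional vectors and the query below are made-up toy values rather than real abstract embeddings, and the function name is illustrative.

```python
import numpy as np


def top_k_similar(query, embeddings, k=3):
    """Return the k (index, cosine distance) pairs closest to the query."""
    q = query / np.linalg.norm(query)
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    distances = 1.0 - E @ q               # cosine distance = 1 - cosine similarity
    order = np.argsort(distances)[:k]     # smallest distance first
    return [(int(i), float(distances[i])) for i in order]


# Toy embeddings (hypothetical values), one row per abstract.
embeddings = np.array([
    [1.0, 0.0],
    [0.8, 0.6],
    [0.0, 1.0],
    [-1.0, 0.0],
])
query = np.array([1.0, 0.1])

ranked = top_k_similar(query, embeddings, k=2)
for index, distance in ranked:
    print(f"abstract {index}: cosine distance {distance:.3f}")
```

Because cosine distance ignores vector length, the ranking depends only on the direction of each embedding.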

Usage of LLMs

Two prominent large language models (LLMs) were involved in this study: Llama and GPT. Llama3.1 was used through the Ollama platform, specifically the 70B-Instruct model variant, while the GPT models, including text-embedding-3-large and GPT-4o, were accessed via OpenAI’s API. The prompts used in this study are provided in the source code.

Computational details

The dataset for the descriptor relies on density functional theory (DFT) calculations performed with the Vienna Ab initio Simulation Package (VASP)64. The revised Perdew-Burke-Ernzerhof (RPBE) generalized gradient approximation (GGA) functional of Nørskov and co-workers was used with projector augmented wave (PAW) pseudopotentials65,66. The cutoff energy was well tested and set to 500 eV for all calculations. To correctly describe dispersion interactions, Grimme’s DFT-D3 correction was used. For isolated molecules, all calculations used Gamma-point-only sampling. For slab models, a k-mesh with a spacing of less than 0.04 (2π/Å) was used for relaxation and frequency calculations, and a k-mesh with a spacing of less than 0.02 (2π/Å) was used for static calculations.
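To make the k-mesh criterion concrete, the sketch below converts a target spacing in units of 2π/Å into a number of subdivisions along each reciprocal lattice vector. It is an illustration under assumptions, not the procedure used to generate the actual input files: the rounding rule (ceiling, so the realized spacing does not exceed the target), the single k-point along the vacuum direction, and the 12 × 12 × 25 Å cell are all hypothetical.

```python
import math

import numpy as np


def kmesh_from_spacing(lattice, spacing, vacuum_axis=None):
    """Subdivisions per reciprocal axis for a target k-spacing in 2*pi/Angstrom.

    lattice: 3x3 array whose rows are the real-space lattice vectors in Angstrom.
    ASSUMPTION: N_i = max(1, ceil(|b_i| / spacing)), with |b_i| expressed in
    units of 2*pi/Angstrom, and one k-point along the vacuum direction.
    """
    # Rows of inv(A).T are the reciprocal lattice vectors without the 2*pi factor.
    recip = np.linalg.inv(np.asarray(lattice, dtype=float)).T
    lengths = np.linalg.norm(recip, axis=1)
    mesh = [max(1, math.ceil(b / spacing)) for b in lengths]
    if vacuum_axis is not None:
        mesh[vacuum_axis] = 1
    return tuple(mesh)


# Hypothetical orthorhombic slab cell: 12 x 12 Angstrom in-plane, 25 Angstrom along z.
slab = np.diag([12.0, 12.0, 25.0])

relax_mesh = kmesh_from_spacing(slab, 0.04, vacuum_axis=2)
static_mesh = kmesh_from_spacing(slab, 0.02, vacuum_axis=2)
print("relaxation/frequency mesh:", relax_mesh)
print("static mesh:", static_mesh)
```

Halving the spacing roughly doubles the number of k-points along each periodic direction, which is why the denser mesh is reserved for the static calculations.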

Furthermore, a dipole correction was added during static calculations to offset the energy error caused by the periodic dipole moment (in all directions for isolated molecules, and along the vacuum-layer direction for slab models). VASPKIT67 and VMD68 were used for post-processing and visualization. Reduced density gradient69 (RDG) analysis was performed with Multiwfn70 using the promolecular approximation. Crystal orbital Hamilton populations61 (COHP) were calculated with the LOBSTER package71. The machine-learning techniques, including principal component analysis (PCA) and the Random Forest Regressor, were taken from the Scikit-learn package to carry out the dimensionality reduction and feature importance analysis tasks72.

Chemicals

The polyacrylonitrile (PAN) (Mw=150000), and polystyrene (PS) (Mw=280000) were purchased from Sigma. Cobalt nitrate (Co(NO3)2‧6H2O, > 99%) and nickel nitrate (Ni(NO3)2‧6H2O, > 99%) were supplied by Adamas. Trithiocyanuric acid (TTCA) and cyanoguanidine were obtained from Aladdin. Sulfur (>99.9%), potassium hydroxide, hydrochloric acid and N, N-dimethylformamide (DMF), sodium metal (> 99.5%) were from HUSHI.

Preparation of SA Co-N/S@S, SA Co-N@S and SA Ni-N/S@S

Firstly, 0.10 g Co(NO3)2‧6H2O (or Ni(NO3)2‧6H2O), 0.35 g TTCA, 0.30 g polystyrene (PS, Mw=280000) and 1.0 g polyacrylonitrile (PAN, Mw=150000) were added into 10 mL DMF under magnetic stirring at 40 °C until all of the precursors were completely dissolved. Secondly, the mixture was transferred into a plastic syringe with a metallic needle. The nanofiber materials were synthesized via an electrospinning process under a positive voltage of 18 kV and a pumping rate of 6 μL min−1. Thirdly, the obtained nanofiber films were pre-oxidized at 250 °C for 3 h in a muffle furnace and subsequently annealed at 800 °C for 2 h under argon atmosphere in a horizontal tube furnace. Afterwards, the annealed fibers were soaked in KOH aqueous solution for 10 h and then etched at 800 °C for 30 min in argon atmosphere. The resultant materials were immersed into 2 M HCl solution to remove Fe clusters. The obtained SA Co-N/S film was punched into small circular foils with a diameter of 10 mm, which can be directly used as the host for S. The SA Co-N was synthesized in an analogous way by replacing TTCA with an equal amount of cyanoguanidine. The SA Ti-N/S and SA V-N/S were also synthesized using similar methods, replacing Co(NO3)2‧6H2O with an equal mass of titanium acetylacetonate and vanadium acetylacetonate, respectively. The N/S doped CNF was obtained by the aforementioned method without adding metal salt precursors to the DMF.

The prepared carbon fiber hosts with different single-atom catalysts then underwent a sulfur infiltration process. 50 mg of S powder was dissolved in 3 mL of carbon disulfide (CS2) with stirring, and an equal mass of carbon films was immersed in the solution until the CS2 had completely evaporated. The mixture of S and the carbon films was sealed in an autoclave and heated at 155 °C for 12 h to obtain the SA Co-N/S@S, SA Co-N@S, SA Ni-N/S@S, SA Ti-N/S@S, SA V-N/S@S, and N/S doped CNF@S with about 50 wt% sulfur as cathodes for RT Na-S battery tests. The SA Co-N/S@S with a real sulfur content of 67% was obtained using a similar method by adjusting the mass ratio of SA Co-N/S to sulfur to 1:2.
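The two nominal sulfur contents quoted above follow from the feed ratios. The short sketch below reproduces that arithmetic. It assumes that all of the sulfur in the feed ends up in the free-standing film, which is why the text reports "about" 50 wt%, and the function name is illustrative.

```python
def sulfur_fraction(host_mass, sulfur_mass):
    """Nominal sulfur mass fraction of the free-standing cathode.

    ASSUMPTION: no sulfur is lost during CS2 evaporation or the 155 °C step.
    """
    return sulfur_mass / (host_mass + sulfur_mass)


# Equal masses of carbon film and sulfur (50 mg each).
print(f"host:sulfur = 1:1 -> {sulfur_fraction(50, 50):.0%} sulfur")

# Host-to-sulfur mass ratio of 1:2.
print(f"host:sulfur = 1:2 -> {sulfur_fraction(1, 2):.1%} sulfur")
```

Because the films are used without binder or conductive additive, this fraction is also the sulfur fraction of the whole electrode.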

Electrochemical assessment

The electrochemical performance was explored by assembling coin cells. The SA Co-N/S, SA Co-N and SA Ni-N/S films were directly utilized as the cathode, with 10 mm-diameter Na foil as the anode (prepared using a circular punch) and a 16 mm-diameter GF/A membrane as the separator, to build 2032-type cells in an argon-filled glove box (H2O < 0.01 ppm, O2 < 0.01 ppm) at 25 °C. Cycling tests for the Na-S batteries were conducted within a potential window of 0.8–3.0 V. All the cells were tested at a constant temperature of 30 °C after aging for 3 h. The average mass of the cathode films used in this work was controlled at around 0.45 mg. The cells with 4.2 mg cm−2 sulfur loading were obtained by stacking SA Co-N/S films. A solution of 2 M sodium bis(trifluoromethylsulfonyl)imide (NaTFSI) in propylene carbonate (PC)/fluoroethylene carbonate (FEC) (1:1 by volume, 60 μL) was utilized as the electrolyte for RT Na-S cells. The assembly of in-situ cells is identical to that of conventional cells, except for the addition of an extra 100 μL of electrolyte to ensure wettability of the electrode during in-situ testing. For the pouch cell, the SA Co-N/S film was cut into 30 × 30 mm2 pieces as the cathode. A piece of 30 × 40 mm2 Na foil was used as the anode. The pouch cell was assembled in the Ar-filled glove box by stacking SA Co-N/S, glass fiber, and Na foil and then adding 500 μL of electrolyte. Later, the vacuum encapsulation process was performed in the glove box. All the capacity values were calculated based on the mass of S. The galvanostatic charge-discharge tests were conducted on the Neware BTS-610 instrument. The cyclic voltammetry (CV) measurements and electrochemical impedance spectroscopy (EIS) tests were performed on the CHI 660D workstation. In the frequency range of 0.01–100 kHz, EIS measurements are typically conducted using a potentiostatic signal with an amplitude of 20 mV and 12 data points per frequency decade. Prior to the EIS experiment, a quasi-static potential is usually applied near the open-circuit potential (OCP).

XAS measurement

XANES spectra at the Co L-edge, N K-edge and C K-edge were measured at the soft X-ray magnetic circular dichroism (XMCD) end station of the National Synchrotron Radiation Laboratory (NSRL). XAFS spectra at the Co and Ni K-edges were measured at the Hard X-ray Micro-Analysis (HXMA) beamline of the Canadian Light Source (CLS). The in-situ XAS examinations were carried out at the BL14W and BL11B beamlines of the Shanghai Synchrotron Radiation Facility (SSRF). The Fe K-edge XAFS data were recorded in fluorescence mode. Fe foil, FeO, Fe2O3, FePc and FeS were used as references. The acquired EXAFS data were processed via the Athena module implemented in the IFEFFIT software73. The k2- and k3-weighted EXAFS spectra were obtained by subtracting the pre-edge and post-edge background from the overall absorption and then normalizing with respect to the edge jump step in 1st derivative line. Subsequently, the χ(k) data were Fourier transformed to R space using a Hanning window (dk = 1 Å−1) to separate the EXAFS contributions from different coordination shells. XAFS fitting was performed using the ARTEMIS software. The EXAFS equation is as follows:

$$\chi \left(k\right)={\sum}_{j}\frac{{N}_{j}{{S}_{0}}^{2}{F}_{j}(k)}{k{{R}_{j}}^{2}}\exp \left(-2{k}^{2}{{\sigma }_{j}}^{2}\right)\exp \left(\frac{-2{R}_{j}}{\lambda \left(k\right)}\right)\sin \left[2k{R}_{j}+{\Phi }_{j}\left(k\right)\right]$$
(4)

\(S_0^2\) is the amplitude reduction factor, \(F_j(k)\) is the effective curved-wave backscattering amplitude, \(N_j\) is the number of neighbors in the \(j\)th atomic shell, \(R_j\) is the distance between the X-ray absorbing central atom and the atoms in the \(j\)th atomic shell, \(\lambda\) is the mean free path in Å, \(\Phi_j(k)\) is the phase shift (including the phase shift for each shell and the total central atom phase shift), and \(\sigma_j\) is the Debye-Waller parameter of the \(j\)th atomic shell (variation of distances around the average \(R_j\)). The relevant parameters were calculated with the ab initio code FEFF 674.
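To show how the terms of Eq. (4) combine, the sketch below evaluates a single-shell contribution on a k grid. It is a numerical illustration only and not a substitute for FEFF/ARTEMIS fitting: the backscattering amplitude, phase shift and mean free path are treated as constants, whereas in practice they are k-dependent functions computed by FEFF, and all parameter values are hypothetical.

```python
import numpy as np


def exafs_single_shell(k, N, S02, R, sigma2, F=1.0, phi=0.0, lam=6.0):
    """Single-shell term of Eq. (4).

    k: wavenumber grid in 1/Angstrom; N: coordination number;
    S02: amplitude reduction factor; R: shell distance in Angstrom;
    sigma2: squared Debye-Waller parameter in Angstrom^2.
    ASSUMPTION: F_j(k), Phi_j(k) and lambda(k) are replaced by the
    constants F, phi and lam to keep the sketch self-contained.
    """
    amplitude = N * S02 * F / (k * R**2)
    damping = np.exp(-2.0 * k**2 * sigma2) * np.exp(-2.0 * R / lam)
    return amplitude * damping * np.sin(2.0 * k * R + phi)


# Hypothetical parameters for one shell.
k = np.linspace(2.0, 12.0, 201)
chi = exafs_single_shell(k, N=4, S02=0.85, R=1.9, sigma2=0.005)

# k2 weighting, as used before Fourier transforming to R space.
chi_k2 = k**2 * chi
print(chi.shape, float(np.abs(chi_k2).max()))
```

The oscillation frequency in k is set by \(2R_j\), which is why a Fourier transform of the weighted signal produces peaks that separate the coordination shells by distance.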

The wavelet transform (WT) of the EXAFS data was performed with the HAMA software75. The R-space range was 0–6 Å. The Morlet wavelet was used as the mother wavelet (Morlet κ = 10, Morlet σ = 1).