Introduction

Compared to conventional carbon steels, weathering steels (WS) exhibit excellent atmospheric corrosion resistance, enabling long-term service without the need for protective coatings1,2,3,4,5. This significantly reduces maintenance costs and enhances environmental sustainability, making WS widely used in bridge construction, building enclosures, and rail transit applications6. The superior corrosion resistance of WS originates primarily from the addition of trace alloying elements such as Cu, Cr, and Ni, which promote the formation of a dense and protective rust layer during corrosion, effectively inhibiting further degradation of the steel substrate7,8,9. However, as service environments become increasingly complex, higher requirements are placed on the comprehensive performance of existing WS, necessitating not only excellent corrosion resistance but also superior strength and toughness. Current WS systems often face challenges in achieving an optimal balance among these properties, limiting their application under harsh service conditions10,11. Therefore, enhancing the mechanical properties of WS while preserving its excellent corrosion resistance has become a key focus of current research.

To date, the main approaches to enhancing the mechanical properties of WS have included microalloying, microstructural control, and surface treatment12,13,14,15. Among these strategies, microalloying is one of the most widely adopted. By introducing strengthening elements, it regulates phase composition and evolution, thereby improving mechanical properties at the microstructural level. Currently, the main compositions of WS are categorized into two systems: high-Cr (0.5 ~ 3 wt.%) and high-Ni (0.5 ~ 3.5 wt.%) alloys7,16,17,18. The former significantly reduces the corrosion rate by forming a dense inner rust layer composed of Cr oxides, while the latter inhibits corrosion by promoting the nucleation of α-FeOOH within the rust layer19,20. Other microalloying elements also contribute to improving the mechanical properties of WS. It has been found that the addition of 0.03 wt.% Ce to a WS containing 0.5 wt.% Cr increases its tensile strength from 853 MPa to 988 MPa without significantly affecting elongation21. The simultaneous addition of Al and Mn can significantly enhance the corrosion resistance of lightweight WS, while raising the tensile strength from 523 MPa to 650 MPa and the elongation from 24.26% to 39.71%22. Furthermore, increasing the Cr content by only 0.56 wt.% can reduce the corrosion sensitivity of WS and improve its tensile strength from 470 MPa to 550 MPa23.

Although recent studies have made notable progress in enhancing the strength and corrosion resistance of WS, significant trade-offs between cost and performance still exist in alloy design. For example, when the Cr content exceeds 2.5 wt.%, both the depth and density of pitting corrosion increase sharply24,25. Although Ni can effectively enhance both the mechanical properties and corrosion resistance of WS, its high cost means that excessive addition would significantly increase the overall material cost26. In addition, certain elements that improve ductility, such as Nb and Mo, also present cost sensitivity issues, further exacerbating resource constraints in alloy design. Therefore, more precise control of alloy composition is required to simultaneously enhance performance and control costs, achieving multi-objective optimization. However, due to the complexity of alloy systems, lengthy preparation processes, and diverse performance targets, establishing the intrinsic relationships between composition and properties remains a major challenge. In practice, alloy design often relies on a combination of empirical knowledge and trial-and-error approaches, which frequently leads to significant improvements in one property (e.g., strength) at the expense of another (e.g., corrosion resistance)27, making it difficult to achieve coordinated optimization across multiple properties. Consequently, there is an urgent need for an intelligent design strategy that balances performance and cost while enabling rapid decision-making within complex alloy systems, thereby advancing the development of WS toward high strength, high ductility, high corrosion resistance, and low cost.

The application of machine learning in materials design has flourished, significantly accelerating the discovery and development of new materials28. As a key approach to overcoming the limitations of traditional design methods, data-driven machine learning techniques offer new strategies for addressing the complex challenges of alloy composition design. A recent study has shown that machine learning combined with feature selection can simultaneously enhance the strength, toughness, and stress corrosion resistance of aluminum alloys, reaching values of 760 MPa, 13.3%, and 34.9 MPa·m^(1/2), respectively29. In other studies, a set of critical features was identified by applying correlation filtering, recursive elimination, and thorough dataset screening, leading to the design of Cu alloys with both high mechanical strength and electrical conductivity30,31. Furthermore, a multi-objective feature optimization strategy integrating elemental descriptor mining has been developed for the design of high-entropy alloys with optimized strength and ductility32.

However, machine learning models still face several limitations, particularly under conditions involving high-dimensional inputs, small training datasets, and complex feature interactions. Traditional models often rely heavily on feature engineering and expert knowledge, making it difficult to fully capture the nonlinear relationships between alloy composition and performance. To overcome these limitations, deep learning, with its powerful feature extraction and predictive capabilities, has increasingly been introduced into the field of materials design33,34,35,36. A multimodal deep learning framework was proposed that enables accurate prediction of mechanical properties by inputting material composition and processing text information37. The framework employs the Transformer architecture, one of the most representative models in the field of deep learning, which demonstrates powerful modeling capabilities in handling multi-source, heterogeneous, and high-dimensional feature information. At the core of the Transformer is the attention mechanism38, which can automatically identify key variables in the input without manual specification and dynamically allocate weights to focus on features that have a greater impact on target properties, thus achieving unified modeling and feature selection. In addition, it effectively captures complex interactions among elements, improves the accuracy of property predictions, and provides interpretable outputs based on feature importance39. This mechanism offers a transparent pathway for identifying the key drivers behind material performance and holds great promise for enabling more efficient and intelligent design and optimization of high-performance alloys.
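The weight-allocation idea described above can be illustrated with a minimal sketch of single-head scaled dot-product attention. This is a generic textbook form, not the architecture used in this study; the array sizes, the random embeddings, and the choice to average the attention matrix over queries to obtain one score per feature are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for single-head attention."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Hypothetical sizes: 5 input features, each embedded in 8 dimensions.
rng = np.random.default_rng(0)
n_features, d_model = 5, 8
X = rng.normal(size=(n_features, d_model))

out, attn = scaled_dot_product_attention(X, X, X)

# Assumption: averaging over queries yields one importance score per feature.
feature_importance = attn.mean(axis=0)
```

Because every row of `attn` sums to one, the averaged scores in `feature_importance` also sum to one and can be read as relative weights.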

Table 1 Key physicochemical characteristic factors affecting properties of WS

In this study, an alloy composition design method based on the attention mechanism is proposed to achieve the simultaneous improvement of corrosion resistance, tensile strength, and ductility in a newly developed WS, while maintaining material costs within a controllable range. By leveraging the attention mechanism to identify key physicochemical factors and integrating a utility function for efficient screening and iterative optimization of candidate compositions, the predicted results were experimentally validated, and the underlying mechanisms were analyzed. This work not only offers a generalizable low-cost, multi-objective alloy design strategy but also highlights the potential and practical value of the attention mechanism in high-dimensional materials performance modeling.

Results

Dataset of WS

Figure 1 presents the distribution of the composition-property dataset collected in this study. Analysis of the distribution of 14 microalloying elements across the three performance datasets, namely corrosion resistance, strength, and ductility, revealed that the sample coverage rates for Si, P, Mn, and C reached 92%, 88%, 85%, and 78%, respectively, indicating their widespread presence as essential elements for corrosion resistance. In contrast, traditional corrosion-resistant elements such as Cr, Ni, and Cu exhibited more moderate sample coverage, ranging between 54% and 67%.

Fig. 1: Elemental distribution in the composition-property datasets.

a Corrosion rate, b strength and c ductility. 1 represents element type distribution, 2 represents box plots of element content distribution, 3 represents axis plots showing the performance distribution.

Distinct differences in elemental distribution characteristics were observed between the strength and ductility datasets. The analysis of composition fluctuation showed that the Cr content spanned from 0 to 9.2 wt.% in the corrosion resistance dataset and from 0 to 17 wt.% in the strength dataset, while Mn content in the ductility dataset ranged broadly from 0 to 18 wt.%. The wider range of Mn content in the elongation dataset reflects the inclusion of alloys with intentionally varied Mn concentrations to assess its influence on ductility. All data were screened for chemical consistency, and the variation is within the expected compositional design space.

The distribution patterns of performance indicators further validated the representativeness of the dataset. The corrosion rate exhibited a right-skewed distribution, with 66% of the samples concentrated within the range of 0.8 ~ 3.2 g/(m²·h). In the strength dataset, 58% of the samples exhibited an ultimate tensile strength (UTS) in the range of 450 ~ 750 MPa, covering strength requirements from structural components to load-bearing parts. The elongation data displayed a wide, uniform distribution from 0.5% to 42%, without significant clustering, reflecting a strong process dependence in plasticity regulation.

These statistical results provide critical support for subsequent alloy design and performance prediction. The study reveals that material properties exhibit significant differences in their responses to elemental synergistic interactions, with elements showing greater compositional dispersion, such as Cr, Ni, and Mn, demonstrating higher potential for performance tuning. Based on the concentrated distribution characteristics of the main performance indicators, the optimization boundary conditions established in this study require strict control of the corrosion rate below 0.8 g/(m²·h) and a minimum UTS threshold of 750 MPa, while simultaneously achieving improvements in elongation.

Selection of key physicochemical characteristic factors

After clarifying the elemental distribution patterns and the statistical characteristics of performance indicators across the three performance datasets, further importance evaluation of various physicochemical features was conducted.

First, among the 66 physicochemical features, the 15 features with the highest attention weights were selected. In Fig. 2, the blue section represents the initially screened feature pool, while the gray section corresponds to irrelevant features with attention weights below the threshold. The physicochemical features that most strongly influence corrosion performance were identified as E43, A18, E1, and A27, among others. The features with the greatest influence on tensile strength included C48, C40, A29, and C21, among others. For elongation, the most influential features included E31, E46, C40, and R64. These initially selected physicochemical features were then subjected to further redundancy elimination.
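The initial screening step amounts to ranking features by attention weight and keeping the top 15. A minimal sketch follows; the feature labels and the randomly generated weights are placeholders, not values from this study, and the helper name `top_k_features` is hypothetical.

```python
import numpy as np

def top_k_features(names, weights, k=15):
    """Return the k (name, weight) pairs with the largest weights, in descending order."""
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(weights)[::-1][:k]  # indices sorted by decreasing weight
    return [(names[i], float(weights[i])) for i in order]

# Placeholder data: 66 features with random weights normalized to sum to 1.
rng = np.random.default_rng(42)
names = [f"F{i + 1}" for i in range(66)]
weights = rng.random(66)
weights /= weights.sum()

selected = top_k_features(names, weights, k=15)
```

Features outside `selected` correspond to the gray, below-threshold portion of the feature pool in this toy setting.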

Fig. 2: Attention-derived weights of the 66 physicochemical features.

Higher bars indicate greater importance of the corresponding feature in predicting each target property.

To further simplify the features, the Pearson correlation coefficients between the initially screened 15 physicochemical characteristic factors and the three target properties were calculated, as shown in Fig. 3(a1-c1). Overall, different physicochemical factors exhibited distinctly varied responses to the three performance metrics. Modulus-related factors showed a positive correlation with strength, while lattice structure and electronic property parameters more strongly influenced ductility and corrosion behavior.

Fig. 3: Selection of key physicochemical characteristic factors.

a Corrosion rate, b strength and c ductility. 1 Represents Pearson correlation heatmaps; 2 represents feature selection process; 3 represents screening results of key factors; 4 represents prediction effects.

Based on the correlation analysis results, the feature set was further reduced by retaining only the variables highly sensitive to target properties, thereby minimizing redundant inputs and enhancing the stability and interpretability of subsequent model training. Physicochemical factors with Pearson correlation coefficients exceeding 0.95 were excluded, namely C12 and E42, E35, C39 and D4, and C40, C64, E15, and E13, as shown in Fig. 3(a2-c2). After eliminating redundant features, the remaining key physicochemical factors were further screened to identify those most decisive for each target property. Various combinations of the retained alloy factors were constructed and used for model prediction. The prediction results using the selected key physicochemical characteristic factors are shown in Fig. 3(a3-c3). For corrosion rate prediction, the variables A18, E1, and E3 yielded the highest coefficient of determination (R² = 0.87). For UTS prediction, C48, E43, and R25 achieved the highest R² value of 0.86. For elongation prediction, D59, C48, and R6 yielded the highest R² value of 0.86. Ultimately, three key physicochemical factors were retained for each target property.

The distributions of training and testing datasets are illustrated in Fig. 3 (a4-c4), where a clear linear relationship and low dispersion between predicted and actual values further validate the effectiveness of the key factor selection process.
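One common way to implement a correlation-based redundancy filter of the kind described above is a greedy pass over the features. The sketch below is an assumption about how such a step could look, not the procedure used in this study: it interprets the 0.95 threshold as applying to pairwise correlations between features, and it keeps the feature with the higher importance score from each highly correlated pair. The function name, the toy data, and the importance values are hypothetical.

```python
import numpy as np

def drop_redundant(X, names, importance, threshold=0.95):
    """Greedy filter: visit features from most to least important and keep one
    only if |Pearson r| with every already-kept feature is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature matrix
    order = np.argsort(importance)[::-1]
    kept = []
    for i in order:
        if all(corr[i, j] <= threshold for j in kept):
            kept.append(i)
    return [names[i] for i in sorted(kept)]

# Toy data: feature "fc" is nearly a copy of "fa" and should be dropped.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.01 * rng.normal(size=200)
X = np.column_stack([a, b, c])

kept = drop_redundant(X, ["fa", "fb", "fc"], importance=[0.5, 0.3, 0.2])
```

Visiting features in order of importance means that, within a correlated group, the feature the model weighted most heavily is the one that survives.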

To further interpret the specific effects of key physicochemical characteristics on corresponding performance metrics and to achieve accurate prediction of target alloy properties, an additional analysis was conducted based on the previously performed Pearson correlation results. The weight distribution of the selected key physicochemical factors within the attention mechanism model was analyzed to quantify their relative contributions during the nonlinear modeling process.

For corrosion rate prediction, the most influential physicochemical factors were atomic mass (A18), interfacial energy (E1), and standard electrode potential (S3). Among these, atomic mass (weight: 0.384) and interfacial energy (weight: 0.348) exhibited positive correlations with corrosion rate, whereas standard electrode potential (weight: 0.267) showed a negative correlation. Heavier elements such as Sn and Sb tend to form loose rust layers and to distribute unevenly in the alloy, which can induce grain boundary segregation and local corrosion40. In contrast, higher interfacial energy facilitates the formation of dense rust layers, thereby improving rust layer stability41. Elements with lower standard electrode potentials, such as Mn, Ti, and V, are preferentially oxidized, resulting in loose, poorly adherent corrosion products that hinder the formation of protective films and accelerate matrix corrosion1,42,43.

For mechanical properties, the key physicochemical factors influencing tensile strength were identified as modulus compression (C48), enthalpy of fusion (E43), and boiling point temperature (R25). Among them, modulus compression (weight: 0.397) and enthalpy of fusion (weight: 0.311) showed positive correlations with UTS, while boiling point temperature (weight: 0.292) exhibited a negative correlation. Elements with stronger atomic bonding and higher thermal stability contribute to enhanced lattice rigidity and improved load transfer efficiency, thereby increasing material strength44. In contrast, boiling point temperature demonstrated a negative correlation with UTS. Although elements with higher boiling points possess better thermal stability, they tend to promote the formation of coarse second phases or brittle compounds45, which hinder plastic deformation compatibility and increase the risk of brittle fracture46. Additionally, such elements generally exhibit poor diffusion capabilities, leading to microstructural heterogeneity that ultimately reduces the overall material strength47.

For ductility, the primary influencing physicochemical factors were identified as the number of slip systems (D59, weight: 0.390), modulus compression (C48, weight: 0.323), and Poisson’s ratio (R6, weight: 0.287), all of which showed positive correlations with elongation. A greater number of slip systems facilitates plastic deformation under multiaxial stress, thereby enhancing ductility44. Therefore, incorporating elements that stabilize the FCC structure can increase the number of active slip systems and promote coordinated multi-slip deformation. Elements with higher modulus compression contribute to improved crystal structure stability and stronger atomic bonding rigidity, which help suppress microcrack formation and delay fracture, while also enhancing stress transfer and local deformation compatibility. Poisson’s ratio reflects the coordination between transverse and axial deformation; a higher value can effectively delay necking and crack propagation, thus improving overall ductility48. In summary, the addition of elements with high modulus compression, high Poisson’s ratio, and the ability to promote slip activity is an effective strategy for improving elongation.

Attention mechanism driven alloy composition design

After identifying the key physicochemical descriptors that contribute most significantly to each target property, it is essential to trace these features back to their elemental origins to support practical alloy design. To this end, the individual contributions of elements to the selected key features were visualized, as shown in Fig. 4. From Fig. 4a1-a3, it can be seen that Sb, Mo, and C are the primary contributors to the corrosion resistance of the steel, while C and Nb notably influence the UTS performance (Fig. 4b1-b3). In addition, C, Mo, and Nb also play significant roles in determining the ductility of the steel (Fig. 4c1-c3). These results provide a basis for subsequent compositional tuning and multi-objective optimization.

Fig. 4: Assessment of the significance of various elements.

a Corrosion rate, b strength and c ductility. 1-3 represent different key physicochemical factors.

After clarifying the differences in elemental performance across key physicochemical dimensions, it is necessary to further explore whether synergistic or antagonistic relationships exist among elements during the multi-objective optimization process. To this end, an elemental response matrix was constructed to evaluate a large number of binary element combinations, with performance scoring based on the comprehensive material evaluation index (CMEI) defined in Eq. 8. For each element pair, multiple composition variants were generated, and their corresponding mechanical properties, corrosion resistance, and economic index were predicted using the trained attention-based model. The predicted values were then substituted into Eq. 8 to compute a single CMEI score for each composition.

The average CMEI scores across all tested compositions for a given element pair were visualized in Fig. 5 to assess the degree of elemental compatibility. Initially, the score distribution appeared relatively uniform with minimal contrast, suggesting that the model was still exploring potential interaction patterns among elements. As the number of evaluated combinations increased, the score distribution gradually became more concentrated, revealing consistently high-scoring element pairs such as Mn-Cr and Cr-Ni49. This indicates that the model progressively converged toward synergistic elemental combinations that contributed most significantly to overall property improvement. Conversely, combinations such as Al-Nb and Cu-Mo exhibited persistently low CMEI scores, suggesting possible antagonistic effects. The transition of the heatmap from a dispersed to a focused pattern demonstrates the model’s ability, facilitated by the attention mechanism, to detect underlying elemental interactions and guide composition-level optimization in a data-driven manner.
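The construction of a pairwise response matrix can be sketched as follows. Since the CMEI definition (Eq. 8) and the trained model are not reproduced in this section, the sketch takes the scoring function as an argument; `toy_variants`, `toy_score`, the element labels X1 to X3, and the benefit values are hypothetical stand-ins chosen only to make the example runnable.

```python
import itertools
import numpy as np

def pair_score_matrix(elements, variants_for_pair, score_fn):
    """Average score_fn over all composition variants of each element pair."""
    n = len(elements)
    M = np.full((n, n), np.nan)  # diagonal stays NaN (no self-pairs)
    for i, j in itertools.combinations(range(n), 2):
        comps = variants_for_pair(elements[i], elements[j])
        M[i, j] = M[j, i] = float(np.mean([score_fn(c) for c in comps]))
    return M

# Hypothetical stand-ins for the variant generator and the CMEI-based score.
LEVELS = (0.2, 0.5, 1.0)  # assumed addition levels, wt.%

def toy_variants(e1, e2):
    return [{e1: x, e2: y} for x in LEVELS for y in LEVELS]

TOY_BENEFIT = {"X1": 1.0, "X2": 0.5, "X3": 0.1}  # placeholder per-element scores

def toy_score(comp):
    return sum(TOY_BENEFIT[e] * w for e, w in comp.items())

elements = ["X1", "X2", "X3"]
M = pair_score_matrix(elements, toy_variants, toy_score)
```

In an actual workflow, `score_fn` would wrap the model predictions and the utility function, and `M` would be the quantity plotted as a heatmap at each iteration.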

Fig. 5: Element combination screening matrix heatmap during the iterative process.

As the iteration progresses from top-left to bottom-right, the heatmaps highlight element pairs with increasing suitability scores, indicating converging selection preferences.

To explore the trade-offs and interactions among strength, ductility, corrosion resistance, and cost, a multi-objective optimization process based on the CMEI utility function was performed. Tens of thousands of candidate compositions were generated within the design space, and their corresponding performance metrics were predicted using the trained model. The predicted results were used to construct a multidimensional performance space for visualization and trend analysis.
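A screening step of this kind can be sketched by combining the boundary conditions stated earlier (corrosion rate below 0.8 g/(m²·h), UTS of at least 750 MPa) with a scalar utility. The utility below is an assumed equal-weight sum of min-max normalized objectives, used only as a stand-in because the CMEI expression is not given in this section; the candidate values are invented for illustration.

```python
import numpy as np

def screen_candidates(uts, elong, corr_rate, cost,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      min_uts=750.0, max_corr=0.8):
    """Rank feasible candidates by an assumed weighted-sum utility.

    Returns (indices of feasible candidates, best first; utility of all candidates).
    """
    uts, elong, corr_rate, cost = (np.asarray(v, dtype=float)
                                   for v in (uts, elong, corr_rate, cost))

    def norm(x):
        # Min-max scale to [0, 1]; constant arrays map to zeros.
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    w_s, w_e, w_c, w_p = weights
    # Higher UTS and elongation are better; lower corrosion rate and cost are better.
    utility = (w_s * norm(uts) + w_e * norm(elong)
               + w_c * (1.0 - norm(corr_rate)) + w_p * (1.0 - norm(cost)))
    feasible = np.flatnonzero((uts >= min_uts) & (corr_rate < max_corr))
    ranked = feasible[np.argsort(utility[feasible])[::-1]]
    return ranked, utility

# Four invented candidates (UTS in MPa, elongation in %, rate in g/(m²·h), relative cost).
ranked, utility = screen_candidates(
    uts=[700, 800, 850, 900],
    elong=[25, 20, 18, 10],
    corr_rate=[0.5, 0.6, 0.7, 1.2],
    cost=[1.0, 1.1, 1.5, 2.0],
)
```

Here candidate 0 fails the strength threshold and candidate 3 fails the corrosion threshold, so only candidates 1 and 2 are ranked. Applying the hard constraints before ranking keeps a high utility score from masking a violated requirement.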

After identifying the high-compatibility elemental combinations, a performance driven utility function was established to construct a comprehensive performance evaluation and screening system for candidate alloy compositions. To intuitively illustrate the optimization trends of the iterative results within the multi-performance target space, radar charts of performance indicators were plotted, as shown in Fig. 6. The figure presents the performance distributions of representative alloy samples obtained through multiple iterations across four dimensions: tensile strength, ductility, corrosion rate, and economic cost. The lighter colored contours correspond to the performance prediction results from intermediate iterations, representing the optimization path in which the attention mechanism continuously learned within the performance space, dynamically adjusted compositions, and progressively converged. Compared with the initial design (deep blue area), the final optimized results (red area) show a significant expansion in overall area, indicating that the designed alloys achieved a substantial reduction in corrosion rate while maintaining excellent mechanical properties (UTS and ductility) and improving overall performance without significantly increasing costs. These results validate the rationality of the constructed utility function and further demonstrate the model’s practical capability in multi-objective trade-off and integrated performance optimization.

Fig. 6: Radar chart of the iterative performance evolution for WS.

The radar chart compares the multi-objective performance of WS before and after optimization, showing notable improvements across all metrics.

After completing the performance scoring and screening of candidate alloy compositions, a further analysis was conducted from the perspective of multidimensional performance space to reveal the relative positioning and distribution characteristics of different alloy combinations.

To achieve this, a three-dimensional scatter plot was constructed, and two-dimensional projections were used to analyze the correlation trends among different performance metrics. Figure 7b indicates that as the corrosion rate increases, the strength distribution range becomes more concentrated, with most samples exceeding 800 MPa, suggesting that certain high-strength combinations may compromise corrosion resistance. In Fig. 7c, the relationship between elongation and corrosion rate appears relatively scattered; however, in the low corrosion rate region (<1.5 g/(m²·h)), elongation exhibits a broader distribution, implying considerable potential for simultaneously achieving high plasticity and high corrosion resistance. Figure 7d shows a clear negative correlation trend between strength and ductility, consistent with the classical strength-ductility trade-off principle50, with the limitation on elongation being particularly pronounced in the high-strength region (>1000 MPa). This observation indirectly verifies the effectiveness and reliability of the attention mechanism in modeling multi-performance relationships. Regarding economic considerations, high-strength combinations tend to be more costly, while high-ductility combinations are relatively less expensive. No significant correlation was observed between corrosion resistance and cost, indicating that improvements in corrosion performance do not necessarily rely on the incorporation of expensive elements.

Fig. 7: Multi-objective optimization results.

a 3D scatter plot; b 2D projection of UTS vs. corrosion rate; c 2D projection of elongation vs. corrosion rate; d 2D projection of UTS vs. elongation. The orange star denotes the optimal alloy design obtained from multi-objective optimization.

Experimental validation and microstructural analysis

To verify the prediction accuracy and practical applicability of the model constructed based on the attention mechanism, performance tests were conducted on the newly designed WS, and the results were compared with the model predictions. The performance testing curves and experimental results are shown in Fig. 8, while a detailed comparison between the target compositions, model-predicted properties, and experimental results is provided in Table 2. In terms of composition, the experimental alloy closely matched the model-recommended values, indicating that the newly fabricated WS effectively reproduced the designed composition. This ensures that the subsequent performance tests are representative and provide strong validation of the model.

Fig. 8: Performance of the newly developed WS.

a Engineering stress-strain curve and tensile test results; b corrosion rate.

Table 2 Predicted and measured composition, predicted and measured properties of the newly developed WS

For the mechanical properties, the tensile curves shown in Fig. 8a indicate that the designed steel exhibits high strength along with relatively good plastic deformation capability. The experimentally measured UTS was 808 MPa, slightly lower than the model-predicted value of 837 MPa, while the measured elongation was 20.3%, slightly higher than the predicted value of 20.1%. Both measured values fall within the reasonable error range of the model predictions, validating the effectiveness of the attention mechanism based model in predicting mechanical performance.
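The size of these deviations can be made explicit with a short calculation of the relative error of each measurement with respect to its prediction. The helper name `relative_error_pct` is hypothetical; the input values are those quoted above.

```python
def relative_error_pct(measured, predicted):
    """Signed deviation of the measurement from the prediction, in percent."""
    return 100.0 * (measured - predicted) / predicted

uts_err = relative_error_pct(808.0, 837.0)  # UTS in MPa
el_err = relative_error_pct(20.3, 20.1)     # elongation in %

print(f"UTS deviation: {uts_err:.2f}%")
print(f"Elongation deviation: {el_err:.2f}%")
```

The measured UTS lies about 3.5% below the prediction, and the measured elongation about 1.0% above it.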

Regarding corrosion resistance, the corrosion rates of the newly designed WS and the commercially available Q345NH steel were tested at different exposure durations (72 h, 144 h, 288 h, and 576 h). The results showed that the corrosion rates of both steels decreased progressively with increasing exposure time. However, the newly designed WS exhibited significantly lower corrosion rates compared to Q345NH steel, with a corrosion rate reduction by a factor of two at 576 h.

This trend is consistent with the model’s predictions for corrosion resistance, indicating that the designed steel possesses excellent long-term corrosion stability. Furthermore, analysis of the error bars reveals that the newly developed WS displayed smaller variability, suggesting more stable performance.

Overall, the experimentally measured mechanical and corrosion properties were highly consistent with the model predictions, further confirming the accuracy and engineering applicability of the attention mechanism-based steel design approach in practical steel development.

To further elucidate the microstructural origins underlying the superior comprehensive performance of the newly developed WS and to clarify the structural basis for its mechanical and corrosion properties, a systematic microstructural analysis was conducted, as shown in Fig. 9.

Fig. 9: Microstructural characterization of the newly developed WS.

a IPF map; b grain boundary distribution map; c KAM map; d grain boundary misorientation angle distribution; e grain size distribution; f etched metallographic structure.

The inverse pole figure (IPF) map shows that the grains are equiaxed and the color distribution is uniform, indicating that the grains have varied orientations without a strong texture and exhibit good isotropy. The grain boundary map reveals an interlaced distribution of high-angle and low-angle grain boundaries with regular boundary morphology, further confirming the stability of the microstructure and the effectiveness of the heat treatment process. The kernel average misorientation (KAM) map shows a dense distribution of green regions, suggesting the presence of local orientation differences within the grains, which likely arise from residual stresses or substructure formation after deformation. Such localized strains are beneficial for promoting stress transfer while maintaining ductility, thereby enhancing the overall mechanical performance of the material.

Figure 9d presents the grain boundary misorientation angle distribution, showing a typical bimodal distribution with peaks at low angles (<15°) and high angles (>45°). This characteristic indicates that significant recovery and recrystallization occurred during thermomechanical processing, which contributes to optimized grain boundary structures and improved stress transfer and plastic compatibility. Figure 9e displays the grain size distribution, with statistical analysis indicating an average grain size of 18.537 μm. The grain size follows a normal distribution, suggesting that the majority of grains fall within a medium-size range and the microstructure is relatively uniform. Such a uniform grain structure facilitates the simultaneous improvement of strength and ductility. Figure 9f shows the etched metallographic structure of the newly developed WS. Numerous martensite-austenite (M-A) islands are distributed within the ferritic matrix in irregular granular or blocky forms, exhibiting a “small island” morphology. This microstructure is typical of bainitic steels; it strengthens the matrix and inhibits crack propagation, thereby achieving a balance between strength and ductility.

Discussion

To comprehensively evaluate the overall performance of the newly developed WS, a comparative analysis was conducted against existing typical WS systems and relevant research results, thereby clarifying its relative standing and potential advantages within the current research landscape.

Corrosion resistance, strength, and ductility are the core indicators determining the suitability of WS for high-end applications. However, a pronounced trade-off relationship among these properties has long limited progress, making it difficult for traditional ‘trial-and-error’ methods to achieve balanced improvements across all performance metrics. To address this challenge, a comprehensive dataset incorporating corrosion resistance, strength, elongation, and alloying element characteristics was constructed based on extensive experimental data. Machine learning based feature selection methods were then employed to extract key physicochemical factors from numerous candidate variables, enabling effective dimensionality reduction of the high dimensional input space. Building upon this, economic cost was incorporated as an additional optimization target to achieve a comprehensive balance between performance and affordability. As a result, a newly developed WS composition was successfully designed that combines excellent corrosion resistance, high strength, and superior ductility at a relatively low cost, providing a novel strategy for the intelligent design of high-performance, cost-effective alloy materials.

Using the nine selected key physicochemical factors, along with grain size and testing conditions as input variables, multi-objective performance models were constructed to predict UTS, elongation, and corrosion rate. Based on this framework, combined with the results of the attention weight analysis, the contribution relationships between the key factors and each element were established, enabling the construction of performance-composition mapping relationships and clarifying the major strengthening mechanisms and synergistic elemental pathways. Subsequently, by introducing economic constraints and setting performance-cost synergy as the optimization target, multiple rounds of iterative screening based on the utility function were conducted, ultimately leading to the design of a new WS featuring high strength, high ductility, and excellent corrosion resistance.

Compared with conventional typical WSs such as Q345NH, the newly developed material demonstrates a nearly twofold reduction in corrosion rate while achieving a significant improvement in mechanical properties, highlighting a well-balanced performance profile and strong practical potential. As shown in Fig. 10, the performance of this material (marked by a pentagram) was benchmarked against various existing steels across key performance indicators. In Fig. 10a, the alloy achieves a UTS of 837 MPa while maintaining a low corrosion rate (approximately 0.54 g/(m²·h)), outperforming most carbon steels and WS. Figure 10b further incorporates elongation into the evaluation, showing that the material also maintains a favorable balance between ductility (approximately 20%) and corrosion resistance, positioning it at the edge of the performance advantage region among all compared materials.

Fig. 10: Comparison of the comprehensive properties of the newly developed WS with other steels.

a Corrosion rate vs. UTS; b corrosion rate vs. elongation.

Compared with traditional post hoc interpretability tools such as SHAP, the attention mechanism adopted in this study exhibited greater adaptability and integration efficiency within our deep learning-based alloy optimization workflow. SHAP requires computing the marginal contribution of each feature for individual samples after model training, resulting in high computational costs, especially in multi-objective settings. In contrast, the attention mechanism, embedded within the model architecture, dynamically adjusts feature weights during training and can simultaneously capture local and global feature responses. While both approaches offer complementary advantages, attention is particularly well-suited for high-dimensional, multi-objective prediction tasks where interpretability and efficiency must be jointly addressed during model construction.

In summary, the attention mechanism-based composition screening and multi-objective optimization strategy proposed in this study not only successfully achieved the intelligent design of high-performance, low-cost weathering steel but also provides an interpretable and efficient pathway for alloy system optimization under complex multi-performance requirements.

This study combines feature selection with attention-based multi-objective optimization to propose an efficient and interpretable machine learning strategy for weathering steel design. The developed approach elucidates the intrinsic factors and explicit relationships by which alloying elements influence corrosion rate, tensile strength, and elongation. Based on these insights, a new weathering steel was successfully designed, achieving simultaneous enhancement of multiple performance metrics. Experimental validation confirmed that the newly developed alloy exhibits excellent comprehensive properties, with a corrosion rate of 0.54 g/(m²·h) under simulated polluted marine conditions, a UTS of 808 MPa, and an elongation of 20.3%. All measured properties significantly surpass those of existing weathering steels, while the overall material cost is substantially reduced. Mechanistic analysis of the key physicochemical characteristic factors revealed that elements with low atomic mass, high surface energy, and high standard electrode potential contribute to reducing corrosion rates. Elements with high compressive modulus, high enthalpy of fusion, and low boiling point are beneficial for improving UTS, whereas elements with a greater number of slip systems, high compressive modulus, and high Poisson’s ratio promote enhanced elongation.

Methods

To address the challenges posed by the limited experimental dataset for WS, a multi-objective feature optimization strategy was proposed, integrating traditional material descriptors with deep learning methods, as illustrated in Fig. 11. This strategy leverages the attention mechanism to optimize corrosion resistance, strength, and ductility while simultaneously reducing the cost of WS. The complete workflow consists of four main steps: data collection, key physicochemical characteristic factors selection, multi-objective optimization of alloy composition, and experimental characterization.

Fig. 11: Schematic diagram of the multi-objective feature optimization strategy for WS design based on an attention mechanism.

The diagram illustrates a closed-loop workflow that leverages attention-based feature analysis to guide the design and validation of optimized alloy compositions.

Dataset construction

The corrosion performance dataset includes a total of 417 samples, with each sample containing information on alloy composition, grain size, corrosion environment, corrosion time, and weight loss corrosion rate. To ensure consistency and comparability, the weight loss rate after 576 hours of accelerated corrosion testing in a simulated polluted marine environment was uniformly selected as the corrosion performance indicator (unless otherwise specified, the unit is g/(m²·h)). This environment was chosen because polluted marine atmospheres, characterized by the simultaneous presence of corrosive species such as Cl⁻ and SO₂, represent one of the most severe service conditions for WS, posing critical challenges to the stability and protective ability of the rust layer51. The mechanical performance dataset contains 148 samples for tensile strength and 130 samples for elongation, with each sample including alloy composition, grain size, and corresponding mechanical properties. Details of the corrosion performance dataset can be found in the National Materials Corrosion and Protection Scientific Data Center (https://www.corrdata.org.cn/). The tensile strength and elongation data for WS were collected from publicly available literature sources provided in Supplementary Table 1. Elemental price data were obtained from Asian Metal (https://www.asianmetal.com/).

In addition, to simultaneously improve the corrosion resistance, tensile strength, and ductility of WS, it is essential to identify the intrinsic physical and chemical properties that govern these three performances. The dataset of elemental physical and chemical properties used in this study was compiled from the Materials Project52, Self-Diffusion and Impurity Diffusion in Pure Metals53, and related literature sources54,55,56. This dataset includes 66 physicochemical features, with detailed names and values provided in Supplementary Table 1.

Identification of key physicochemical factors

In this study, a multi-level feature selection framework was proposed to extract key performance-influencing factors from high-dimensional alloy data. First, a 132-dimensional feature space was systematically constructed based on the intrinsic correlations between alloy compositions and physicochemical properties. Each physicochemical property (such as electronegativity, atomic radius, and lattice energy) was represented using two descriptors. The first descriptor is the weighted average factor \({f}_{i}^{\text{m}}\), calculated as follows:

$$\begin{array}{c}{f}_{i}^{\text{m}}=\frac{\mathop{\sum }\nolimits_{j=1}^{15}\left({f}_{i}^{\text{j}}\times {\alpha }_{j}\right)}{\mathop{\sum }\nolimits_{j=1}^{15}{\alpha }_{j}}\end{array}$$
(1)

where \({f}_{i}^{\text{j}}\) is the value of the ith property for element j and \({\alpha }_{j}\) represents the atomic percentage of element j, reflecting the contribution weight of each element to the average value of the property. The second descriptor, the variance factor \({f}_{i}^{\text{v}}\), quantifies the influence of elemental concentration differences on the dispersion of the property distribution and is mathematically expressed as:

$$\begin{array}{c}{f}_{i}^{\text{v}}=\frac{\mathop{\sum }\nolimits_{j=1}^{15}\left[{\left({f}_{i}^{\text{j}}-{f}_{i}^{\text{m}}\right)}^{2}\times {\alpha }_{j}\right]}{\mathop{\sum }\nolimits_{j=1}^{15}{\alpha }_{j}}\end{array}$$
(2)

Each physicochemical descriptor corresponds to either the weighted average or variance (Eqs. 1, 2) of a specific elemental property. The contributions of individual elements to each descriptor are explicitly determined by their atomic fractions and intrinsic property values. A full mapping between descriptors and elemental properties is provided in Supplementary Table 1. The two-dimensional construction strategy accounts for both the synergistic effects among alloying elements and the compositional fluctuation effects, providing a foundational data matrix for subsequent analysis.
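As a minimal sketch of Eqs. 1 and 2, the function below computes both descriptors for a single elemental property. The function name, the three-element composition, and the use of Pauling electronegativity as the example property are illustrative assumptions and are not taken from the dataset used in this study.

```python
def composition_descriptors(property_values, atomic_percents):
    """Return (weighted mean, weighted variance) of one elemental property.

    property_values[j] is the property value of element j (f_i^j in Eq. 1);
    atomic_percents[j] is its atomic percentage (alpha_j).
    """
    if len(property_values) != len(atomic_percents):
        raise ValueError("one property value is required per element")
    total = sum(atomic_percents)
    if total <= 0:
        raise ValueError("atomic percentages must sum to a positive value")
    pairs = list(zip(property_values, atomic_percents))
    # Eq. 1: composition-weighted average of the property.
    mean = sum(f * a for f, a in pairs) / total
    # Eq. 2: composition-weighted variance about that average.
    variance = sum((f - mean) ** 2 * a for f, a in pairs) / total
    return mean, variance


# Hypothetical Fe-Cr-Ni composition, chosen only for illustration.
electronegativity = [1.83, 1.66, 1.91]  # Pauling values for Fe, Cr, Ni
atomic_percent = [98.0, 1.0, 1.0]
f_mean, f_var = composition_descriptors(electronegativity, atomic_percent)
```

Applying the same function to each of the 66 elemental properties yields the two descriptors per property that make up the 132-dimensional feature space.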

Subsequently, during the feature importance evaluation stage, an attention mechanism was introduced to construct performance prediction models. Through the self-attention layers, dynamic nonlinear mappings between physicochemical features and the three target properties (i.e., corrosion resistance, strength, and ductility) were captured. During model training, feature weight coefficients were normalized using an activation function to generate relative weight distributions within the range of 0 to 1. Based on a predefined significance threshold (experimentally determined as 0.02), features with weight values below the threshold were first eliminated. For each target property, the top 15 features ranked by attention weights were then selected to form an initial feature pool, effectively preserving the core influencing factors for each performance dimension.
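The two-stage screening described above (threshold, then top 15) can be sketched as follows. This is a minimal illustration that assumes the normalized attention weights are already available as a name-to-weight mapping; the function name and the feature names are hypothetical.

```python
def select_initial_pool(attention_weights, threshold=0.02, top_k=15):
    """Drop features whose weight is below `threshold`, then keep the `top_k` largest."""
    kept = {name: w for name, w in attention_weights.items() if w >= threshold}
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical normalized weights for one target property.
weights = {
    "mean_electronegativity": 0.12,
    "var_ionization_energy": 0.08,
    "mean_atomic_radius": 0.015,  # below the 0.02 threshold, removed
    "var_boiling_point": 0.05,
}
pool = select_initial_pool(weights)
```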

To address potential information redundancy within the initial feature pool, a redundancy elimination mechanism based on statistical correlation was established. The Pearson correlation coefficient between each pair of features was calculated, which is mathematically defined as:

$$\begin{array}{c}\rho \left({f}_{a},{f}_{b}\right)=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}\left(\frac{{f}_{a}^{\text{i}}-{f}_{a}^{\text{m}}}{{\sigma }_{a}}\right)\left(\frac{{f}_{b}^{\text{i}}-{f}_{b}^{\text{m}}}{{\sigma }_{b}}\right)\end{array}$$
(3)

where \({f}_{a}^{\text{i}}\) and \({f}_{b}^{\text{i}}\) are the values of features a and b for the ith sample, \({f}_{a}^{\text{m}}\) and \({f}_{b}^{\text{m}}\) are the corresponding means, \({\sigma }_{a}\) and \({\sigma }_{b}\) are the corresponding standard deviations, and n is the number of samples. When the absolute value of the correlation coefficient between a feature pair exceeds 0.95, the features are considered highly linearly correlated. In such cases, a feature weight competition strategy is applied: the feature with the higher attention weight is retained as the primary feature, while the redundant feature with the lower weight is eliminated. For example, if the correlation coefficient between the weighted average factor of electronegativity (weight 0.12) and the variance factor of ionization energy (weight 0.08) reaches 0.97, the latter will be removed from the feature set. This strategy effectively reduces the impact of data collinearity on model stability.
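The weight competition rule can be sketched as below. The greedy ordering (features are visited from highest to lowest attention weight and compared only with those already kept) is one possible implementation assumed here, and all function names and data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length samples (Eq. 3, population form)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    if sd_a == 0 or sd_b == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n * sd_a * sd_b)


def drop_redundant(columns, weights, limit=0.95):
    """Of each pair with |rho| > limit, keep the feature with the higher weight."""
    # Visiting features from highest to lowest weight lets the stronger one win.
    ordered = sorted(columns, key=lambda name: weights[name], reverse=True)
    kept = []
    for name in ordered:
        if all(abs(pearson(columns[name], columns[k])) <= limit for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns (one value per alloy sample).
columns = {
    "mean_electronegativity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var_ionization_energy": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with the first
    "mean_atomic_radius": [5.0, 1.0, 4.0, 2.0, 3.0],
}
weights = {
    "mean_electronegativity": 0.12,
    "var_ionization_energy": 0.08,
    "mean_atomic_radius": 0.05,
}
kept = drop_redundant(columns, weights)
```

Here the second feature is discarded because its correlation with the higher-weighted first feature exceeds 0.95, mirroring the electronegativity and ionization energy example in the text.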

During the optimal feature subset determination stage, an iterative cross-validation optimization framework was constructed. A five-fold cross-validation method was employed to evaluate the robustness of candidate feature combinations. In each iteration, a forward feature selection strategy was applied to progressively add feature variables, while simultaneously monitoring changes in the model’s coefficient of determination (\({R}^{2}\)). When the improvement in \({R}^{2}\) caused by the addition of a new feature fell below 0.5% (at a statistical significance level of p < 0.05), the feature contribution was considered to have reached a saturation threshold, and the feature addition process was terminated. The final refined feature subsets contained 8 to 12 key physicochemical features depending on the specific target property, achieving a dimensionality reduction rate of 82% ~ 94% while maintaining a prediction accuracy of \({R}^{2}\) > 0.88. This methodological framework, combining mechanism-driven and data-driven feature selection, significantly enhances the interpretability and engineering applicability of machine learning models.
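The saturation rule can be illustrated with a short greedy loop. This sketch makes three assumptions: the 0.5% criterion is read as an absolute R² gain of 0.005, the significance test mentioned in the text is omitted, and `cv_r2` is a hypothetical caller-supplied function that returns the mean five-fold cross-validated R² for a feature subset.

```python
def forward_select(candidates, cv_r2, min_gain=0.005):
    """Greedy forward selection that stops when the best R2 gain drops below min_gain."""
    selected, best_score = [], 0.0
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current subset.
        scored = [(cv_r2(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score - best_score < min_gain:
            break  # saturation: the best remaining feature adds too little
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Toy scoring function standing in for a real cross-validated model.
toy_gains = {"a": 0.60, "b": 0.25, "c": 0.003, "d": 0.001}


def toy_cv_r2(subset):
    return sum(toy_gains[f] for f in subset)


selected, score = forward_select(["a", "b", "c", "d"], toy_cv_r2)
```

In the toy run, features `a` and `b` are added, and the loop stops because the best remaining candidate raises R² by only 0.003.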

Multi-objective optimization of alloy compositions

To evaluate the influence of alloying elements under complex interactions, the attention mechanism was employed to interpret the models for corrosion resistance, tensile strength, and ductility. A comprehensive elemental importance evaluation method was established to meet the requirements of multi-objective performance optimization, thereby identifying candidate alloying elements. The schematic architecture of the attention mechanism is shown in Fig. 12.

Fig. 12: Attention schematic for feature importance evaluation and multi-objective performance optimization.

The attention mechanism computes feature weights through Query-Key-Value interactions, enabling interpretable and weighted aggregation for optimized performance prediction.

Given the limited size of the dataset, all experiments were conducted under a unified set of hyperparameters. The network adopted a Transformer architecture consisting of three Transformer layers, each with 3 attention heads38. During training, the Adam optimizer and mini-batch gradient descent were employed with a batch size of 16. The learning rate was set to 0.05. To mitigate overfitting, 30% dropout57,58 and L2 regularization59 with a regularization coefficient of 0.1 were applied. The model was trained for 10000 epochs to assess its convergence under short-term training conditions. To ensure reproducibility of the results, the global random seed was fixed at 42, controlling for randomness in modules such as PyTorch. Training was conducted on a deep learning server equipped with two NVIDIA RTX 3090 GPUs (24 GB each).

Prior to constructing the performance prediction models, standardization of the original data for alloying elements across all key physicochemical features was performed to ensure consistent dimensions and scales. Specifically, letting \({x}_{i,j}\) denote the original value of the ith element for the jth physicochemical factor, the standardized value \({z}_{i,j}\) is calculated according to the following formula:

$${z}_{i,j}=\frac{{x}_{i,j}-{\mu }_{j}}{{\sigma }_{j}}$$
(4)

where \({\mu }_{j}\) and \({\sigma }_{j}\) represent the mean and standard deviation of the jth physicochemical factor across all elements, respectively. This standardization method effectively eliminates scale differences among different features and enables the model to focus on the relative variation trends between features.
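Eq. 4 amounts to a column-wise z-score, sketched below with the standard library. The use of the population standard deviation is an assumption, since the text does not state which form was applied, and the table values are hypothetical.

```python
import statistics


def standardize_columns(table):
    """Z-score each column of `table` (rows = elements, columns = physicochemical factors)."""
    columns = list(zip(*table))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation is assumed; constant columns map to 0.0.
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in table
    ]


# Hypothetical values: three elements, two factors on very different scales.
table = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
z = standardize_columns(table)
```

After standardization both columns carry identical values, which shows how the scale difference between the two factors is removed.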

After standardization of the alloying elements, the influence of each element on the physicochemical factors was further characterized. A feedforward neural network was employed to compute the attention scores for each alloying factor, which were then normalized using an activation function to obtain the final normalized weights. Let the input feature matrix be \(x=[{x}_{1},{x}_{2},\ldots ,{x}_{n}]\), where \({x}_{i}\) represents the feature quantity of the ith alloying factor. The attention weights are calculated according to the following formula:

$${e}_{i}={V}^{T}\sigma \left(W{x}_{i}+b\right)$$
(5)

where W and V are trainable weight matrices, b is the bias term, and σ is the nonlinear activation function. Subsequently, the attention scores are normalized using the softmax function to obtain the final attention weights \({a}_{i}\):

$${a}_{i}=\frac{\exp \left({e}_{i}\right)}{{\sum }_{j=1}^{n}\exp \left({e}_{j}\right)}$$
(6)

Finally, the overall contribution of the alloying factors is represented by a weighted summation, as follows:

$$y=\mathop{\sum }\limits_{i=1}^{n}{a}_{i}{x}_{i}$$
(7)
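A minimal sketch of Eqs. 5 to 7 for scalar inputs is given below. It assumes tanh as the activation σ and treats W, b, and V as vectors of equal length (the hidden size); these choices, the function name, and the parameter values are illustrative, since the trained parameters are not reported.

```python
import math


def attention_pool(x, W, b, V):
    """Additive attention over scalar inputs x_i, following Eqs. 5-7."""
    # Eq. 5: e_i = V^T tanh(W x_i + b), with tanh as the assumed activation.
    scores = [
        sum(v * math.tanh(w * xi + bi) for w, bi, v in zip(W, b, V))
        for xi in x
    ]
    # Eq. 6: softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Eq. 7: attention-weighted sum of the inputs.
    y = sum(a * xi for a, xi in zip(weights, x))
    return weights, y


# Hypothetical inputs and parameters (hidden size 2).
x = [0.5, -1.0, 2.0]
W, b, V = [1.0, -0.5], [0.0, 0.1], [0.8, 0.3]
weights, y = attention_pool(x, W, b, V)
```

Because the weights are positive and sum to one, the output y always lies between the smallest and largest input; with V set to zero the weights become uniform and y reduces to the plain mean.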

Based on the attention weights, response matrices for different elements were constructed and visualized using heatmaps to explore the synergistic interactions among elements across multiple performance metrics. A two-dimensional combination matrix was generated by pairing elements in all possible combinations and calculating the corresponding average performance scores. These results were then presented in the form of heatmaps, revealing the synergistic trends among alloying elements and their adaptability to the overall performance objectives.

After clarifying the potential synergistic mechanisms among elements, a performance-driven inverse design strategy was adopted to iteratively predict alloy compositions through multiple rounds. The standardized physicochemical feature and element association matrix was used as the initial input, and the model was employed to predict the performance of each possible element combination individually.

To achieve multi-objective prediction targeting key properties, the concept of balancing strength and ductility, commonly expressed as the product of strength and ductility31, was referenced to evaluate the overall alloy performance. Accordingly, a multi-objective utility function, as shown in Eq. 8, was constructed and used as the optimization objective for both performance and cost.

$${\rm{CMEI}}=\frac{{I}_{{\rm{strength}}}\times {I}_{{\rm{plasticity}}}}{{I}_{{\rm{corrosion}}}\times {I}_{{\rm{economy}}}}$$
(8)

Here, \({I}_{{\rm{corrosion}}}\), \({I}_{{\rm{strength}}}\), and \({I}_{{\rm{plasticity}}}\) represent the normalized indicators corresponding to corrosion resistance (based on weight loss rate), tensile strength (based on reference tensile strength), and ductility (based on reference elongation), respectively. \({I}_{{\rm{economy}}}\) denotes the predicted cost index of the alloy composition, calculated from the weighted market prices of constituent elements. All indicators were rescaled to the range of [0, 1] using min-max normalization across the entire dataset.

The predicted property values were substituted into the CMEI utility function to obtain a composite score, which was subsequently used as the objective metric for iterative optimization. A higher CMEI value corresponds to a more favorable overall performance of the alloy in terms of strength, ductility, corrosion resistance, and economic feasibility.
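The scoring step can be sketched as follows for a small batch of candidates. The candidate values are hypothetical, and the small constant `eps` added to the denominator is an assumption introduced here to avoid division by zero when a min-max-normalized indicator equals 0; it is not part of Eq. 8.

```python
def min_max(values):
    """Rescale a list to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def cmei_scores(uts, elongation, corrosion_rate, cost, eps=1e-6):
    """Eq. 8 for a batch of candidates; `eps` is an added guard, not from the source."""
    i_s, i_p = min_max(uts), min_max(elongation)
    i_c, i_e = min_max(corrosion_rate), min_max(cost)
    return [
        (s * p) / ((c + eps) * (e + eps))
        for s, p, c, e in zip(i_s, i_p, i_c, i_e)
    ]


# Three hypothetical candidates: UTS (MPa), elongation (%), rate (g/(m2 h)), cost index.
scores = cmei_scores(
    uts=[550.0, 800.0, 650.0],
    elongation=[24.0, 20.0, 16.0],
    corrosion_rate=[1.00, 0.55, 0.80],
    cost=[300.0, 320.0, 400.0],
)
best = scores.index(max(scores))
```

The second candidate ranks highest because it combines the highest strength with the lowest corrosion rate at a near-minimum cost. A candidate that is worst in strength or elongation scores zero regardless of its other properties, which is a consequence of combining min-max normalization with a product form.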

In each iteration, the model adjusted the elemental concentrations based on the utility value calculated from the current composition, generating new candidate combinations for the next round of performance prediction and utility evaluation. The iterative process was terminated once the utility function converged, resulting in an optimized set of alloy compositions that balanced corrosion resistance and mechanical properties.
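The text does not specify how the concentrations were adjusted between rounds. Purely as an illustration of an iterate-until-convergence loop, the sketch below substitutes a simple coordinate-wise hill climb; the search rule, the step size, and the `utility` callback (standing in for model prediction followed by CMEI scoring) are all assumptions.

```python
def hill_climb(start, utility, step=0.05, tol=1e-6, max_rounds=1000):
    """Perturb one element at a time and keep any change that raises the utility."""
    current = dict(start)
    best = utility(current)
    for _ in range(max_rounds):
        improved = False
        for element in list(current):
            for delta in (step, -step):
                trial = dict(current)
                trial[element] = max(0.0, trial[element] + delta)  # no negative contents
                score = utility(trial)
                if score > best + tol:
                    current, best, improved = trial, score, True
        if not improved:
            break  # converged: no single-element change improves the utility
    return current, best


# Toy utility with a known optimum at 0.6 wt.% Cr and 0.3 wt.% Ni (hypothetical).
def toy_utility(comp):
    return -((comp["Cr"] - 0.6) ** 2 + (comp["Ni"] - 0.3) ** 2)


composition, value = hill_climb({"Cr": 0.0, "Ni": 0.0}, toy_utility)
```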

Experimental verification

To ensure that the resulting microstructure was consistent with that of the dataset, the alloy was prepared by the process illustrated in Fig. 13. Circular billet samples were first fabricated using a vacuum melting furnace. After heating to 1200 °C, the billets underwent homogenization treatment for 2 h, followed by furnace cooling to 1000 °C and holding for another 2 h. After surface descaling, the billets were hot-rolled, starting at approximately 1000 °C and finishing at 880 °C, through repeated rolling until a final thickness of 12 mm was achieved. The steel plates were then water-cooled to 420 ~ 440 °C and subsequently air-cooled to room temperature to obtain a microstructure primarily composed of bainite with an average grain size of approximately 20 μm60.

Fig. 13: Processing flowchart for WS preparation.

The processing route includes homogenization, precipitation treatment, hot rolling, and subsequent quenching and cooling steps, enabling microstructural refinement and property tailoring.

The corrosion performance was evaluated based on the weight loss rate of alloy samples after undergoing wet-dry cyclic corrosion testing. The test medium was a 0.01 M NaHSO3 solution. Each cycle lasted 60 minutes, consisting of 12 minutes of immersion in the solution followed by 48 minutes of drying in air. Both immersion and drying were conducted at 45 °C. The total test durations were set at 72 h, 144 h, 288 h, and 576 h, with the solution renewed every 24 h. For each test duration, three parallel alloy samples with dimensions of 50 mm × 25 mm × 2 mm were used.

According to previously established models correlating accelerated laboratory tests with natural atmospheric exposure61, the corrosion extent under these conditions is equivalent to approximately 696 days of natural exposure in a polluted marine environment in Qingdao, China.

Before testing, all samples were ground with sandpaper up to 800 grit, followed by ultrasonic cleaning. After drying for 24 h, the initial mass and surface area of each sample were measured. After the corrosion test, samples were retrieved from the solution and subjected to ultrasonic rust removal. The rust removal solution consisted of 500 mL H2O, 500 mL HCl, and 3.5 g hexamethylenetetramine. After descaling, the samples were again ultrasonically cleaned and weighed. The corrosion rate was calculated according to Eq. 9.

$$v=\frac{{m}_{0}-{m}_{t}}{S\cdot t}$$
(9)

Here, \({m}_{0}\) is the mass of the specimen before corrosion (g), \({m}_{t}\) is the mass of the specimen after corrosion with corrosion products removed (g), S is the exposed surface area of the specimen (m²), and t is the corrosion time (h).
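As a worked illustration of Eq. 9 for the 50 mm × 25 mm × 2 mm coupons, the sketch below assumes that all six faces are exposed and uses hypothetical masses; neither assumption is stated in the procedure above.

```python
def plate_surface_area_m2(length_mm, width_mm, thickness_mm):
    """Total surface area of a rectangular coupon in m^2 (all six faces assumed exposed)."""
    l, w, t = length_mm, width_mm, thickness_mm
    area_mm2 = 2 * (l * w + l * t + w * t)
    return area_mm2 * 1e-6  # mm^2 to m^2


def corrosion_rate(m0_g, mt_g, area_m2, hours):
    """Eq. 9: weight-loss corrosion rate in g/(m^2 h)."""
    if area_m2 <= 0 or hours <= 0:
        raise ValueError("area and time must be positive")
    return (m0_g - mt_g) / (area_m2 * hours)


area = plate_surface_area_m2(50, 25, 2)  # 2800 mm^2
# Hypothetical masses before and after the 576 h test, for illustration only.
rate = corrosion_rate(m0_g=19.600, mt_g=18.729, area_m2=area, hours=576)
```

With these numbers the coupon area is 0.0028 m², so a mass loss of 0.871 g over 576 h corresponds to a rate of about 0.54 g/(m²·h).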

Tensile tests were conducted following the GB/T 228.1-2010 standard, using dog-bone-shaped specimens prepared along the rolling direction with a gauge length of 20 mm. The tests were performed at room temperature on an MTS 810 universal testing machine with a deformation rate of 1.0 mm/min. For each experimental condition, three samples were tested, and the tensile strength and elongation were reported as the average of the three measurements.

Electron backscatter diffraction (EBSD) characterization was performed using a scanning electron microscope (SEM, JEOL-7001F) equipped with an Oxford HKL detector. The accelerating voltage was set to 20 kV, and the scanning step size was 90 nm. For metallographic preparation, the samples were etched with a 4% nitric acid alcohol solution at room temperature for 10 ~ 15 seconds, followed by immediate rinsing with alcohol and drying with compressed air.