Interpreting artificial neural network-based modeling of 4H-SiC MOSFETs using explainable AI

Hsiao, Yu-Sheng; Chang, Pei-Jie; Chen, Bang-Ren; Rai, Rushat; Singh, Shivendra Kumar; Chauhan, Yogesh Singh; Lee, Wen-Jay; Chen, Nan-Yow; Wu, Tian-Li

doi:10.1038/s41598-026-35179-0


Article
Open access
Published: 15 January 2026


Scientific Reports volume 16, Article number: 5297 (2026)


Abstract

Wide bandgap (WBG) semiconductor devices such as 4H-SiC MOSFETs are key enablers for next-generation power electronics due to their superior efficiency and high-temperature capability. However, their electrical performance is strongly affected by process variations, and conventional Technology Computer-Aided Design (TCAD) simulations are computationally demanding and difficult to scale. This work presents a novel explainable machine learning framework that integrates artificial neural networks (ANNs) with explainable artificial intelligence (XAI) to enable accurate and interpretable device modeling. The ANN is trained on comprehensive TCAD-generated datasets covering a broad range of structural and doping parameters, while SHapley Additive exPlanations (SHAP) are employed to quantify the influence of each design parameter on electrical characteristics. The proposed model achieves a Pearson correlation coefficient exceeding 0.99 for on-state current prediction, and SHAP analysis reveals physically consistent trends such as the inverse dependence of drain current on oxide thickness and channel length. This study is among the first to apply XAI for interpreting design–performance correlations in SiC MOSFETs, establishing a transparent and data-driven framework for device understanding and optimization. The methodology can be readily extended to other semiconductor technologies where both modeling accuracy and interpretability are essential.

Introduction

The emergence of wide bandgap (WBG) semiconductor devices, particularly 4H-SiC MOSFETs, has significantly advanced the performance of high-voltage and high-frequency power electronics^1,2. These devices offer substantial benefits in terms of thermal stability, breakdown voltage, and switching efficiency, making them key enablers for electric vehicles, renewable energy systems, and industrial motor drives³. However, their design optimization remains challenging due to the complex physical interactions involved and the high cost of epitaxial growth and fabrication.

Traditionally, Technology Computer-Aided Design (TCAD) simulations are employed to model the influence of process and structural parameters on device behavior. While accurate, TCAD is computationally intensive and time-consuming, particularly for large-scale design exploration. Recent advances in machine learning (ML), particularly artificial neural networks (ANNs)^4,5,6,7, have demonstrated potential for rapid device modeling by learning complex input–output relationships from data. However, a critical limitation of ML-based approaches lies in their lack of interpretability—the so-called “black box” problem—which hinders their acceptance in physics-driven device engineering.

To address this, explainable artificial intelligence (XAI) methods such as SHapley Additive exPlanations (SHAP) have emerged, offering tools to interpret ML models by quantifying feature contributions. While XAI has gained traction in image and time-series domains⁸, its application to power semiconductor device modeling remains largely unexplored.

Artificial neural networks (ANNs) have emerged as powerful tools for modeling complex, highly nonlinear, and multi-dimensional relationships between input parameters and target outputs. Their flexibility and scalability make them particularly well suited for scientific and engineering problems where analytical modeling is challenging or computationally expensive. In semiconductor device research, ANNs have been successfully applied to a wide range of tasks, including compact modeling of advanced transistors⁹, yield prediction in integrated circuit manufacturing¹⁰, and optimization of fabrication processes¹¹.

In this work, we demonstrate a novel approach for interpreting how design choices affect the electrical characteristics of 4H-SiC MOSFETs by integrating artificial neural networks (ANNs) with explainable artificial intelligence (XAI) techniques. Using a comprehensive TCAD-generated dataset, we employ an ANN to model the full ID-VG characteristics of 4H-SiC MOSFETs with different design parameters: oxide thickness (Tox), channel length (Lch), p-well concentration (Pconc), n-drift concentration (ndrift_conc), and N⁺ substrate concentration (nsub_conc). To understand the impact of each design parameter on the full ID-VG characteristics and on Ion (ID extracted at VG = 10 V), we apply SHAP (SHapley Additive exPlanations). The SHAP values correlate consistently with known physical effects of the design parameters, such as the dependence on channel length and oxide thickness. This supports XAI as a viable tool for analyzing how device design influences electrical characteristics, which is promising for ML-based optimization in semiconductor technologies.

Device schematic and typical characteristic

The schematic of the planar SiC MOSFET is shown in Fig. 1a. The ID-VG dataset for this study is generated with TCAD Sentaurus (Synopsys Co.) for varied designs of oxide thickness (Tox), channel length (Lch), p-well concentration (Pconc), n-drift concentration (ndrift_conc), and N⁺ substrate concentration (nsub_conc). In the TCAD framework, all device structures are constructed using Sentaurus Device Editor (SDE), where the dopant distributions are defined directly within the simulation domain. This approach ensures that the implanted regions follow spatially uniform or smoothly graded concentration profiles as prescribed in the deck, rather than relying on empirical fitting. Figure 1b presents the corresponding TCAD-simulated doping concentration distribution used in this work.

All simulations employ a calibrated set of physical models. The carrier mobility is modeled using both the High-Field Saturation model and the Lombardi Enormal surface mobility formulation, capturing the effects of surface scattering and high-field transport. Dopant activation is treated with the Incomplete Ionization model through the PhosphorusActiveConcentration Split option, allowing the active dopant concentration to reflect realistic activation levels instead of assuming full ionization. These physics models are consistently applied throughout the entire set of simulations, ensuring that the extracted electrical behaviors and parametric trends are grounded in physically realistic conditions.

Detailed splits of the design parameters are shown in Table 1. The process conditions are combined to generate a total of 3000 I_D–V_G curves, of which 2400 are used for training and 600 for verification of the developed ANN approach and the subsequent XAI analysis. Figure 2 shows an example of a typical I_D–V_G characteristic.
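The 2400/600 partition described above can be sketched as follows. The text does not say how curves were assigned to each set, so the random shuffle, the fixed seed, and the choice to split by whole curve (rather than by individual bias point) are assumptions made for illustration.

```python
import random

N_CURVES = 3000  # total TCAD-simulated I_D-V_G curves (from the text)
N_TRAIN = 2400   # curves used for training (from the text)


def split_curve_ids(n_curves, n_train, seed=0):
    """Shuffle curve indices and split them into training and verification sets.

    Assumption: the split is random and done per curve, so that all bias
    points of one simulated device land in the same set.
    """
    ids = list(range(n_curves))
    random.Random(seed).shuffle(ids)  # seed chosen only for reproducibility
    return ids[:n_train], ids[n_train:]


train_ids, val_ids = split_curve_ids(N_CURVES, N_TRAIN)
print(len(train_ids), len(val_ids))
```

Splitting by curve keeps every V_G point of a given device in one set, which avoids the verification set containing points from curves the model has already seen.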

Table 1 Various designs for TCAD forming the dataset in the work.


Methodology

As shown in Fig. 3, an ANN operates as a “black box”¹², mapping input variables to outputs through stacked nonlinear functions whose internal logic is not directly interpretable. ANNs generally consist of input, hidden, and output layers. In our model, the input layer takes 6 parameters: 5 process parameters (p-well concentration, N⁺ substrate concentration, N⁻ drift concentration, T_ox, and L_ch) and V_G. The output is the drain current (I_D) for the given process parameters and gate bias (V_G). The network has 5 hidden layers followed by a final output layer. ReLU is used in the hidden layers because it helps mitigate the vanishing gradient problem and allows the model to learn complex patterns effectively¹³. The output layer uses a linear activation function, which is suitable for regression tasks.
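As a minimal sketch of the architecture described above (6 inputs, 5 ReLU hidden layers, one linear output), the forward pass can be written in plain Python. The hidden-layer width, the weight initialization, and the sample input are assumptions; the text does not specify them, and the weights here are untrained.

```python
import random

N_INPUTS = 6         # P_conc, n_sub_conc, n_drift_conc, T_ox, L_ch, V_G (from the text)
N_HIDDEN_LAYERS = 5  # from the text
HIDDEN_WIDTH = 16    # assumption: the layer width is not given in the text


def init_layer(n_in, n_out, rng):
    """Return (weights, biases); the He-style random init is an assumption."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine map: one weighted sum plus bias per output neuron."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def build_network(seed=0):
    rng = random.Random(seed)
    sizes = [N_INPUTS] + [HIDDEN_WIDTH] * N_HIDDEN_LAYERS + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict(network, x):
    """Forward pass: ReLU on every hidden layer, linear activation on the output."""
    h = list(x)
    for layer in network[:-1]:
        h = relu(dense(h, layer))
    return dense(h, network[-1])[0]


network = build_network()
# Hypothetical, already-normalized input: five design parameters followed by V_G.
sample = [0.5, 0.5, 0.5, 0.2, 0.3, 1.0]
i_d = predict(network, sample)
print(i_d)
```

Treating V_G as a sixth input lets a single network reproduce the whole I_D–V_G curve by sweeping that one entry while holding the five design parameters fixed.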

XAI aims to make the “black box” transparent by explaining how a model arrives at its predictions, which here helps analyze the impact of the design parameters on the I_D characteristics. The explainable AI method used in this work is SHAP (SHapley Additive exPlanations). SHAP uses Shapley values from game theory to identify the contribution of each feature to the prediction^14,15,16.
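To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by enumerating every feature coalition, which is feasible for a handful of inputs. It is not the SHAP library and not the authors' pipeline: the masking rule (absent features take a single baseline value), the surrogate function `toy_current`, and all numbers are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Assumption: features outside a coalition are replaced by their baseline
    value (SHAP implementations typically average over a background dataset).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_current(p):
    """Hypothetical surrogate (not the paper's ANN): current falls with T_ox and L_ch."""
    t_ox, l_ch, v_g = p
    return v_g / (t_ox * l_ch)


x = [2.0, 4.0, 10.0]         # illustrative values in arbitrary units
baseline = [1.0, 2.0, 10.0]  # illustrative reference point
phi = shapley_values(toy_current, x, baseline)
print(phi)
```

In this toy case the thicker oxide and longer channel both receive negative attributions, the unchanged V_G receives zero, and the attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that SHAP relies on.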

The training phase of our ANN model employs the backpropagation algorithm combined with the Adam optimizer. Backpropagation is a widely used method in neural networks for calculating the gradient of the loss function with respect to each weight by the chain rule, allowing for the efficient computation of these gradients across multiple layers. During each training iteration, backpropagation calculates the gradients of the Mean Squared Error (MSE) loss function with respect to the model’s weights, and the Adam optimizer uses these gradients to update the weights. The MSE loss function measures the average squared difference between the predicted values and the actual values, serving as a key indicator of the model’s performance. This iterative process continues until the MSE loss is minimized to an acceptable level, ensuring that the model has learned to accurately predict the current at a given V_G for the given process parameters. In this way the model captures the complex nonlinear relationships between inputs and outputs, which improves its predictive accuracy and robustness.
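The loop described above (MSE loss, gradients by the chain rule, Adam update) can be illustrated on a two-parameter toy regression. The data, learning rate, and iteration count are hypothetical, the Adam hyperparameters are the commonly used defaults rather than values from the paper, and the gradients are written out by hand instead of being backpropagated through a multi-layer network.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def adam_step(params, grads, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moment estimates."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for k, (p, g) in enumerate(zip(params, grads)):
        state["m"][k] = beta1 * state["m"][k] + (1 - beta1) * g
        state["v"][k] = beta2 * state["v"][k] + (1 - beta2) * g * g
        m_hat = state["m"][k] / (1 - beta1 ** t)
        v_hat = state["v"][k] / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (v_hat ** 0.5 + eps))
    return new_params


# Hypothetical data generated from y = 2x - 1; the model is y = w*x + b.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x - 1.0 for x in xs]

params = [0.0, 0.0]  # [w, b]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
initial_loss = mse([params[0] * x + params[1] for x in xs], ys)

for _ in range(2000):
    preds = [params[0] * x + params[1] for x in xs]
    # Chain rule applied to the MSE: d(loss)/dw and d(loss)/db
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / len(xs)
    params = adam_step(params, [grad_w, grad_b], state, lr=0.05)

final_loss = mse([params[0] * x + params[1] for x in xs], ys)
print(initial_loss, final_loss)
```

In a real network the two hand-written gradient lines are replaced by backpropagation through all layers; the Adam update itself is unchanged.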

After training the ANN model, we use SHAP to build an explanation interface that exposes the basis of the model’s predictions. Figure 4 shows the flow chart for developing the ANN-based characteristics prediction and building the explainable interface.

Results and discussions

Figure 5a shows the training loss versus epoch for the developed ANN model: the loss gradually decreases to 2 × 10⁻⁶ over the first 4000 epochs. From 4000 to 10,000 epochs the loss changes minimally, which implies that training has converged and that further training has a negligible effect on performance. In addition, Fig. 5b compares the ANN predictions of I_on (I_D extracted at V_G = 10 V) with the simulation results. A high correlation is obtained (reported as R² = 0.99 and described as Pearson’s correlation coefficient), indicating that the ANN model predicts I_on accurately.
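The text refers to both Pearson’s correlation coefficient and R². These are related but distinct metrics: Pearson’s r measures linear association, while the coefficient of determination also penalizes systematic offsets between prediction and target. The sketch below computes both on hypothetical values (not data from the paper) to show how they can differ.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson's linear correlation coefficient between two sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: predictions are perfectly linear in the target but offset by 0.5.
tcad_ion = [1.0, 2.0, 3.0, 4.0]
ann_ion = [1.5, 2.5, 3.5, 4.5]
r = pearson_r(tcad_ion, ann_ion)
r2 = r_squared(tcad_ion, ann_ion)
print(r, r2)
```

Here r is 1 while R² is 0.8, because the constant offset leaves the correlation untouched but adds residual error. When predictions track the targets closely without bias, as in a parity plot lying on the diagonal, the two values nearly coincide.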

Figure 6 shows the global interpretability (impact on the full I_D curve characteristics) of the ANN model obtained with SHAP, indicating the distribution of SHAP values for each feature and their corresponding influences. SHAP values are measured relative to a baseline reference (in general the expected model output over a background dataset; described here as the median of the feature values), which allows both positive and negative contributions of each feature to be captured. In Fig. 6, the x-axis represents the influence given by the SHAP values, while the y-axis lists the input features. Each dot represents an individual sample in the testing dataset, and its color indicates the value of the corresponding feature, ranging from blue (low) to red (high). For example, a red dot for channel length indicates a longer channel and a blue dot indicates a shorter channel, as shown in Table 2.

Figure 6 indicates that increasing L_ch, T_ox, or P_conc yields negative SHAP values, whereas decreasing them yields positive SHAP values. Notably, the SHAP values associated with n_sub_conc and n_drift_conc remain close to zero in the global interpretability analysis, indicating that, under the given V_G bias condition (10 V), these parameters have negligible influence on I_D. This observation is consistent with the device physics: the current in the MOS channel is more sensitive to L_ch, T_ox, and P_conc, while n_sub_conc and n_drift_conc primarily affect the high-voltage blocking characteristics. Because the sign of the SHAP value clearly shows the direction in which each feature influences I_D, SHAP analysis is an effective and promising approach to global interpretability in XAI.

Table 2 Meaning of the dot color (feature value) in Fig. 6.


Figure 7 shows the SHAP analysis of local interpretability for the ANN model, indicating the impact of the process design parameters (T_ox = 0.01/0.03/0.05 μm, L_ch = 2/4/6 μm, P_conc = 1 × 10¹⁶/3 × 10¹⁶/5 × 10¹⁶, n_drift_conc = 2 × 10¹⁵/2 × 10¹⁵/2 × 10¹⁶, and n_sub_conc = 1 × 10¹⁸/3 × 10¹⁸/5 × 10¹⁸) on the normalized I_D (I_D extracted at V_G = 10 V). The normalized I_D decreases as L_ch, T_ox, and P_conc increase, and the SHAP values decrease in the same way (Fig. 7a–e). Furthermore, the normalized I_D is not influenced by n_sub_conc and n_drift_conc, and their SHAP values likewise show no trend. Therefore, the SHAP analysis successfully captures the impact of the process parameters on the normalized I_D.

Table 3 summarizes the global interpretability using the SHAP mean values. The SHAP mean value is the average absolute SHAP contribution of each feature across all samples, indicating the overall importance of each process parameter regardless of whether its effect is positive or negative. This analysis captures the global impact of the process designs on the full I_D–V_G characteristics, demonstrating that SHAP is suitable for explainable AI applications that aim to understand how process variations influence the I_D behavior in SiC MOSFETs.
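The mean absolute SHAP value used for the global summary can be computed as sketched below. The SHAP matrix is hypothetical and is not taken from Table 3; in particular, the ordering among the three influential features in this toy data is arbitrary and should not be read as the paper's result.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Average |SHAP| per feature across samples, sorted from most to least important."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[k]) for row in shap_rows) / n
        for k, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


features = ["T_ox", "L_ch", "P_conc", "n_drift_conc", "n_sub_conc"]
# Hypothetical SHAP values, one row per sample and one column per feature.
shap_rows = [
    [-0.30, 0.20, -0.10, 0.00, 0.01],
    [0.40, -0.25, 0.05, 0.01, 0.00],
    [-0.20, 0.15, -0.15, 0.00, -0.01],
]
ranking = mean_abs_shap(shap_rows, features)
for name, score in ranking:
    print(name, round(score, 4))
```

Taking the absolute value before averaging matters: a feature whose positive and negative contributions cancel would otherwise look unimportant even though it strongly moves individual predictions.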

Figure 8 shows the SHAP values (Y-axis) plotted against the normalized ID (X-axis). By calculating the coefficient of determination (R²), it can be observed that Tox, Lch, and Pconc exhibit strong correlations with the device behavior, whereas ndrift_conc and nsub_conc show negligible correlations.

Table 3 Summary of global interpretability using SHAP.


As summarized in Table 4, previous studies have made significant progress in integrating machine learning (ML) with TCAD simulations or accelerating device modeling and reliability analysis through data-driven approaches. While these works successfully demonstrated how ML can enhance the efficiency and predictive capability of TCAD-based simulations, they primarily focused on improving performance metrics or computational speed.

However, none of them explicitly interpret or quantify the contribution of individual device or process parameters to the resulting electrical characteristics. In contrast, our work not only employs ML to assist TCAD modeling but also provides a detailed interpretability analysis, allowing us to identify the relative influence of each physical parameter on the output behavior.

Table 4 Overview of ML applications in TCAD-based device modeling.


Conclusion

In this work, we demonstrate the use of XAI techniques to understand design impacts on the I_D characteristics modeled with an ANN, which is among the first such demonstrations for SiC MOSFETs. First, we show that our ANN model achieves high accuracy, with a Pearson’s correlation coefficient exceeding 0.99 for I_on prediction, validating its reliability for predicting electrical properties. Second, SHAP analysis is conducted to evaluate the impact of the process designs on the full I_D-V_G characteristics (global interpretability) and on the normalized I_D (local interpretability). For global interpretability, the SHAP analysis offers good precision, making SHAP an effective tool for understanding model predictions. For local interpretability, the analysis confirms that the SHAP values change consistently with the normalized on-state I_D across the different process parameters. By employing SHAP, we provide both global and local interpretability, revealing consistent correlations between the drain current and key design parameters such as channel length (L_ch), oxide thickness (T_ox), and p-well concentration (P_conc). The SHAP results align with physical device principles, validating the reliability of the approach. Importantly, the integration of calibrated TCAD physics (including incomplete ionization and advanced mobility models) with SHAP analysis ensures that the interpretability results are grounded in realistic process conditions. Unlike conventional TCAD, which reveals parameter influence only through controlled variations, SHAP provides a quantitative decomposition of feature contributions across the dataset, enabling transparent ranking of design parameters and confirming consistency with device physics. In summary, the results indicate that SHAP analysis is a promising XAI tool for understanding the impact of process designs in SiC MOSFETs, and that it can be extended to a wide range of optimization tasks in SiC MOSFETs.

Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

References

Armstrong, K. O. et al. Wide bandgap semiconductor opportunities in power electronics. In Proc. IEEE 4th Workshop on Wide Bandgap Power Devices and Applications (WiPDA), 259–264 (2016).
Middelstaedt, L. et al. Strategy for reducing oscillations in power electronic circuits using gate control. In Proc. PCIM Europe: International Exhibition and Conference for Power Electronics, Intelligent Motion, Renewable Energy and Energy Management, 1–7 (2018).
Tu, C. C. et al. Industry perspective on power electronics for electric vehicles. Nat. Rev. Electr. Eng. 1 (6), 435–452 (2024).
Jeong, C. et al. Bridging TCAD and AI: Its application to semiconductor design. IEEE Trans. Electron Devices. 68(11), 5364–5371. https://doi.org/10.1109/TED.2021.3093844 (2021).
Thomann, S. et al. ML-TCAD: Accelerating FeFET reliability analysis using machine learning. IEEE Trans. Electron Devices. 71(1), 213–222. https://doi.org/10.1109/TED.2023.3336305 (2024).
Lee, W. J., Hsieh, W. T., Fang, B. H., Kao, K. H. & Chen, N. Y. Device simulations with a U-Net model predicting physical quantities in two-dimensional landscapes. Sci. Rep. 13 (1), 731 (2023).
Kutub, S. B. et al. Artificial neural network-based (ANN) approach for characteristics modeling and prediction in GaN-on-Si power devices. In 2020 32nd International Symposium on Power Semiconductor Devices and ICs (ISPSD), Vienna, Austria, 529–532. https://doi.org/10.1109/ISPSD46842.2020.9170110 (2020).
Theissler, A. et al. Explainable AI for time series classification: A Review, taxonomy and research directions. IEEE Access. 10, 100700–100724 (2022).
Wang, J. et al. Artificial neural network-based compact modeling methodology for advanced transistors. IEEE Trans. Electron Devices. 68(3), 1318–1325. https://doi.org/10.1109/TED.2021.3051137 (2021).
Sage, J. P., Thompson, K. & Withers, R. S. An artificial neural network integrated circuit based on MNOS/CCD principles. In AIP Conference Proceedings, vol. 151, no. 1, 381–385. https://doi.org/10.1063/1.36243 (1986).
De Filippis, L. A. C. et al. ANN modelling to optimize manufacturing process. Adv. Appl. Artif. Neural Netw. 201–226 (2018).
Benitez, J. M. et al. Are artificial neural networks black boxes? IEEE Trans. Neural Netw. 8(5), 1156–1164 (1997).
Glorot, X. et al. Deep sparse rectifier neural networks. In Proc. 14th International Conference on Artificial Intelligence and Statistics (2011).
Lundberg, S. M. et al. A unified approach to interpreting model predictions. In Proc. Neural Information Processing Systems, vol. 30, 4765–4774 (2017).
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29 (5), 1189–1232 (2001).
Vilone, G. et al. Classification of explainable artificial intelligence methods through their output formats. Mach. Learn. Knowl. Extr. 3, 615–661 (2021).


Funding

This work was supported in part by the “Advanced Semiconductor Technology Research Center” from the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan; in part by the National Science and Technology Council (NSTC), Taiwan, Grants 114-2223-E-A49-003-MY4, 114-2221-E-492-002-MY2 and 113-2221-E-492-010.

Author information

Yu-Sheng Hsiao and Pei-Jie Chang contributed equally to this work.

Authors and Affiliations

Institute of Pioneer Semiconductor Innovation, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Yu-Sheng Hsiao & Tian-Li Wu
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Pei-Jie Chang & Tian-Li Wu
International College of Semiconductor Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Bang-Ren Chen, Rushat Rai, Shivendra Kumar Singh & Tian-Li Wu
Department of Electronics and Electrical Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
Tian-Li Wu
Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, India
Shivendra Kumar Singh & Yogesh Singh Chauhan
National Center for High-Performance Computing, Hsinchu, Taiwan
Wen-Jay Lee & Nan-Yow Chen


Contributions

Y.-S.H. and P.-J.C. contributed equally to this work. Y.-S.H. and P.-J.C. proposed the method, collected the data, and drafted and reviewed the manuscript. B.-R.C. collected the data. R.R. reviewed the manuscript. S.K.S. reviewed the manuscript. Y.S.C. reviewed the manuscript. W.-J.L. proposed the method and reviewed the manuscript. N.-Y.C. proposed the method and reviewed the manuscript. T.-L.W. proposed the method, reviewed the manuscript, and supervised the project.

Corresponding authors

Correspondence to Wen-Jay Lee, Nan-Yow Chen or Tian-Li Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Hsiao, YS., Chang, PJ., Chen, BR. et al. Interpreting artificial neural network-based modeling of 4H-SiC MOSFETs using explainable AI. Sci Rep 16, 5297 (2026). https://doi.org/10.1038/s41598-026-35179-0


Received: 22 June 2025
Accepted: 02 January 2026
Published: 15 January 2026
Version of record: 06 February 2026
DOI: https://doi.org/10.1038/s41598-026-35179-0