Introduction

The emergence of wide bandgap (WBG) semiconductors, particularly 4H-SiC MOSFETs, has significantly advanced the performance of high-voltage and high-frequency power electronics1,2. These devices offer substantial benefits in terms of thermal stability, breakdown voltage, and switching efficiency, making them key enablers for electric vehicles, renewable energy systems, and industrial motor drives3. However, their design optimization remains challenging due to the complex physical interactions involved and the high cost of epitaxial growth and fabrication.

Traditionally, Technology Computer-Aided Design (TCAD) simulations are employed to model the influence of process and structural parameters on device behavior. While accurate, TCAD is computationally intensive and time-consuming, particularly for large-scale design exploration. Recent advances in machine learning (ML), particularly artificial neural networks (ANNs)4,5,6,7, have demonstrated potential for rapid device modeling by learning complex input–output relationships from data. However, a critical limitation of ML-based approaches lies in their lack of interpretability—the so-called “black box” problem—which hinders their acceptance in physics-driven device engineering.

To address this, explainable artificial intelligence (XAI) methods such as SHapley Additive exPlanations (SHAP) have emerged, offering tools to interpret ML models by quantifying feature contributions. While XAI has gained traction in image and time-series domains8, its application to power semiconductor device modeling remains largely unexplored.

ANNs are powerful tools for modeling complex, highly nonlinear, and multi-dimensional relationships between input parameters and target outputs. Their flexibility and scalability make them particularly well suited for scientific and engineering problems where analytical modeling is challenging or computationally expensive. In semiconductor device research, ANNs have been successfully applied to a wide range of tasks, including compact modeling of advanced transistors9, yield prediction in integrated circuit manufacturing10, and optimization of fabrication processes11.

In this work, we demonstrate a novel approach for interpreting the impact of design choices on the electrical characteristics of 4H-SiC MOSFETs by integrating ANNs with XAI techniques. Using a comprehensive TCAD-generated dataset, we employ an ANN to model the full ID–VG characteristics of 4H-SiC MOSFETs with different design parameters, including oxide thickness (Tox), channel length (Lch), p-well concentration (Pconc), n-drift concentration (ndrift_conc), and N+ substrate concentration (nsub_conc). To understand the impact of each design parameter on the full ID–VG characteristics and on Ion (ID extracted at VG = 10 V), we apply SHAP. The results reveal consistent correlations between the SHAP values and the physical effects of the design parameters, such as those of channel length and oxide thickness. This validates XAI as a viable tool for analyzing how device design affects the electrical characteristics, which is promising for ML-based optimization in semiconductor technologies.

Device schematic and typical characteristic

The schematic of the planar SiC MOSFETs is shown in Fig. 1a. The ID–VG dataset for this study is generated with TCAD Sentaurus (Synopsys Co.) by varying the oxide thickness (Tox), channel length (Lch), p-well concentration (Pconc), n-drift concentration (ndrift_conc), and N+ substrate concentration (nsub_conc). In the TCAD framework, all device structures are constructed using Sentaurus Device Editor (SDE), where the dopant distributions are defined directly within the simulation domain. This approach ensures that the implanted regions follow spatially uniform or smoothly graded concentration profiles as prescribed in the deck, rather than relying on empirical fitting. Figure 1b presents the corresponding TCAD-simulated doping concentration distribution used in this work.

All simulations employ a calibrated set of physical models. The carrier mobility is modeled using both the High-Field Saturation model and the Lombardi Enormal surface mobility formulation, capturing the effects of surface scattering and high-field transport. Dopant activation is treated with the Incomplete Ionization model through the PhosphorusActiveConcentration Split option, allowing the active dopant concentration to reflect realistic activation levels instead of assuming full ionization. These physics models are consistently applied throughout the entire set of simulations, ensuring that the extracted electrical behaviors and parametric trends are grounded in physically realistic conditions.

Fig. 1

(a) TCAD-simulated doping concentration distribution of the device structure. (b) Schematic of lateral SiC MOSFETs in this work.

Detailed splits of the design parameters are shown in Table 1. Combining these process conditions generates a total of 3000 ID–VG curves, of which 2400 are used for training and 600 for verification of the developed ANN approach and its XAI analysis. Figure 2 shows an example of a typical ID–VG characteristic.
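The 2400/600 partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say how curves were assigned to each set, so the random shuffle, the `SEED` value, and the choice to split by whole curve (rather than by individual bias point) are all assumptions.

```python
import random

N_CURVES = 3000   # total TCAD ID-VG curves reported in the text
N_TRAIN = 2400    # curves used for training (the remaining 600 are for verification)
SEED = 0          # hypothetical seed, only so the sketch is reproducible


def split_curve_ids(n_curves, n_train, seed):
    """Shuffle curve indices and split them into training and verification sets."""
    ids = list(range(n_curves))
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


train_ids, val_ids = split_curve_ids(N_CURVES, N_TRAIN, SEED)
```

Splitting by curve index keeps every bias point of a given device in the same set, so the verification curves come from designs the network has not seen during training.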

Fig. 2

An example of typical ID–VG characteristics of SiC MOSFETs in this work (Pconc = 1 × 10¹⁶ cm⁻³, nsub_conc = 1 × 10¹⁸ cm⁻³, ndrift_conc = 2 × 10¹⁵ cm⁻³, Tox = 0.01 to 0.05 μm, Lch = 2 μm).

Table 1 Design parameter splits used in TCAD to form the dataset in this work.

Methodology

As shown in Fig. 3, an ANN operates as a “black box”12, mapping input variables to outputs through nonlinear functions. ANNs generally consist of input, hidden, and output layers. In our model, the input layer takes 6 parameters: 5 process parameters (p-well concentration, N+ substrate concentration, n-drift concentration, Tox, and Lch) and VG. The output is the drain current (ID) corresponding to the specific process and gate bias (VG). The network has 5 hidden layers, followed by a final output layer. ReLU is used in the hidden layers because it helps to mitigate the vanishing gradient problem and allows the model to learn complex patterns effectively13. The output layer uses a linear activation function, which is suitable for regression tasks.
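The architecture described above (6 inputs, 5 ReLU hidden layers, 1 linear output) can be sketched as a plain forward pass. This is a dependency-free illustration under stated assumptions: the hidden width of 32, the He-style random initialization, and the scaled input values are hypothetical, since the text does not give the layer widths, initialization, or input normalization.

```python
import random


def init_mlp(layer_sizes, seed=0):
    """Create (weights, biases) per layer; the He-style scaling is an assumption."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = (2.0 / n_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """ReLU on every hidden layer, linear (identity) activation on the output layer."""
    h = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        h = z if i == last else [max(0.0, v) for v in z]
    return h[0]


# 6 inputs -> 5 hidden layers (width 32 is hypothetical) -> 1 output (ID)
LAYER_SIZES = [6, 32, 32, 32, 32, 32, 1]
params = init_mlp(LAYER_SIZES)

# Hypothetical, already-scaled inputs: [Pconc, nsub_conc, ndrift_conc, Tox, Lch, VG]
x = [0.5, 0.5, 0.5, 0.2, 0.3, 1.0]
i_d = forward(params, x)  # untrained prediction of the (scaled) drain current
```

Treating VG as a sixth input means one network evaluation returns one point of the ID–VG curve; sweeping VG with the five design inputs held fixed traces out the full curve for that device.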

XAI aims to make the “black box” transparent by explaining how an AI model arrives at its decisions. Using XAI can therefore help analyze the impact of the design parameters on the ID characteristics. The explainable AI method used in this work is SHAP (SHapley Additive exPlanations), which uses Shapley values from game theory to identify the contribution of each feature to the prediction14,15,16.
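To make the Shapley-value idea concrete, the sketch below computes exact Shapley values by enumerating every feature subset, replacing "absent" features with a baseline value. It illustrates the game-theoretic definition only; practical SHAP implementations rely on approximations, and the three-input `toy_model`, the input `x`, and the `baseline` are hypothetical stand-ins, not the trained ANN or the TCAD data.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to the baseline."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical current-like function: falls with tox, lch, and pconc."""
    tox, lch, pconc = z
    return 1.0 / (tox * lch) - 0.1 * pconc


x = [1.0, 2.0, 3.0]          # sample to explain (arbitrary units)
baseline = [2.0, 4.0, 1.0]   # reference point (arbitrary units)
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that gives SHAP its name. In this toy case the thinner oxide and shorter channel (relative to the baseline) receive positive contributions, while the higher p-well value receives a negative one.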

Fig. 3

ANN to model the relationship between various inputs and output characteristics.

The training phase of our ANN model employs the backpropagation algorithm combined with the Adam optimizer. Backpropagation is a widely used method in neural networks that applies the chain rule to calculate the gradient of the loss function with respect to each weight, allowing these gradients to be computed efficiently across multiple layers. During each training iteration, backpropagation calculates the gradients of the Mean Squared Error (MSE) loss function with respect to the model’s weights, and the Adam optimizer uses these gradients to update the weights. The MSE loss function measures the average squared difference between the predicted and actual values, serving as a key indicator of the model’s performance. This iterative process continues until the MSE loss is reduced to an acceptable level, so that the model accurately predicts the current at a given VG for the given process parameters. In this way the model learns the complex nonlinear relationships between inputs and outputs, which improves its predictive accuracy and robustness.
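The interplay of MSE loss, gradients, and Adam updates can be sketched on a problem small enough to differentiate by hand. The example below fits a two-parameter linear model rather than the ANN, so the gradients are written out analytically instead of via backpropagation; the learning rate, step count, and synthetic data are assumptions chosen for illustration, while `beta1`, `beta2`, and `eps` are the commonly used Adam defaults.

```python
def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def adam_fit(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Fit y = w*x + b by minimizing MSE with Adam updates."""
    theta = [0.0, 0.0]   # [w, b]
    m = [0.0, 0.0]       # first-moment (mean) estimates of the gradients
    v = [0.0, 0.0]       # second-moment (uncentered variance) estimates
    n = len(xs)
    for t in range(1, steps + 1):
        err = [theta[0] * x + theta[1] - y for x, y in zip(xs, ys)]
        # Analytic gradients of the MSE with respect to w and b
        grads = [2.0 * sum(e * x for e, x in zip(err, xs)) / n,
                 2.0 * sum(err) / n]
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] -= lr * m_hat / (v_hat ** 0.5 + eps)
    return theta[0], theta[1]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # synthetic data from y = 2x + 1
initial_loss = mse([0.0] * len(xs), ys)
w, b = adam_fit(xs, ys)
final_loss = mse([w * x + b for x in xs], ys)
```

In the ANN case the only structural change is that the gradients for every layer come from backpropagation; the moment estimates and the update rule per weight stay the same.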

Fig. 4

Flowchart for ANN modeling and explainable AI.

After training the ANN model, we use SHAP to build an explanation interface that reveals the basis of the model’s learning. Figure 4 shows the flowchart for developing the ANN-based characteristics prediction and building the explainable interface.

Results and discussions

Figure 5a shows the training history of the ANN model developed for explainable AI: the loss decreases gradually to 2 × 10⁻⁶ over the first 4000 epochs. From 4000 to 10,000 epochs the loss changes only minimally, which implies that training has converged and that further training has a negligible effect on performance. In addition, Fig. 5b compares the Ion values (ID extracted at VG = 10 V) predicted by the ANN model with the simulation results. A high Pearson’s correlation coefficient (R² = 0.99) is obtained, indicating that the ANN model accurately reproduces Ion.
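Because the text quotes a Pearson correlation using the symbol R², it may help to see how the two common agreement metrics are computed. The sketch below is illustrative only: the four "TCAD" and "ANN" values are made up, and the text does not specify which formula produced the reported 0.99.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values: the "ANN" prediction carries a constant offset of 0.5
ion_tcad = [1.0, 2.0, 3.0, 4.0]
ion_ann = [1.5, 2.5, 3.5, 4.5]

r = pearson_r(ion_tcad, ion_ann)    # insensitive to the constant offset
r2 = r_squared(ion_tcad, ion_ann)   # penalizes the offset
```

Pearson's r measures only linear association, so a systematic offset leaves it unchanged, whereas the coefficient of determination drops when predictions deviate from the simulated values. Both approach 1 when the predictions fall on the identity line.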

Fig. 5

(a) MSE loss versus training epoch for the ANN model developed in this work. (b) ANN-predicted versus simulated Ion (ID extracted at VG = 10 V).

Figure 6 shows the global interpretability (impact on the full ID curve characteristics) of the ANN model obtained with SHAP, indicating the distribution of SHAP values for each feature and their corresponding influence. SHAP measures each contribution relative to a baseline reference (described here as the median of the feature values), which enables it to capture both positive and negative contributions of each feature. In Fig. 6, the x-axis represents the influence given by the SHAP value, while the y-axis lists the input features. Each dot represents an individual sample in the testing dataset, and the color of each dot indicates the value of that feature, ranging from blue (low) to red (high). For example, a red dot for channel length indicates a longer channel and a blue dot indicates a shorter channel, as shown in Table 2.

Fig. 6

Global interpretability for the ANN model by using SHAP.

Figure 6 indicates that increasing Lch, Tox, or Pconc yields negative SHAP values, whereas decreasing them yields positive SHAP values. Notably, the SHAP values associated with nsub_conc and ndrift_conc remain close to zero in the global interpretability analysis. This indicates that, under the given VG bias condition (10 V), these parameters have negligible influence on ID. This observation is consistent with the device physics: the current in the MOS channel is more sensitive to Lch, Tox, and Pconc, while nsub_conc and ndrift_conc primarily affect the high-voltage blocking characteristics. Because the sign of the SHAP value directly shows whether a feature raises or lowers ID, SHAP analysis is an effective and promising approach to XAI for global interpretability.
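The signs reported above can be checked against a first-order textbook picture of the MOS channel. The expressions below are the standard long-channel, linear-region approximation and are not taken from this work; the symbols W (channel width), μ_n (channel mobility), V_D (drain bias), V_FB (flat-band voltage), and N_A (p-well acceptor concentration, corresponding to Pconc) are introduced here for illustration. The approximation ignores interface traps, incomplete ionization, and drift-region resistance, all of which matter in real SiC devices.

```latex
% First-order long-channel approximation (illustrative, not from the source)
I_D \approx \mu_n C_{ox} \frac{W}{L_{ch}}
      \left[ (V_G - V_T)\,V_D - \tfrac{1}{2} V_D^{2} \right],
\qquad C_{ox} = \frac{\varepsilon_{ox}}{T_{ox}}

V_T = V_{FB} + 2\phi_F + \frac{\sqrt{2\,\varepsilon_s\, q\, N_A\,(2\phi_F)}}{C_{ox}},
\qquad \phi_F = \frac{kT}{q}\,\ln\frac{N_A}{n_i}
```

Under this approximation, a longer L_ch reduces I_D directly, a thicker T_ox reduces C_ox and raises V_T, and a higher N_A raises V_T, so all three lower I_D at fixed V_G. The drift and substrate doping do not appear in the channel expression, in line with their near-zero SHAP values.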

Table 2 Meaning of the dot color for each feature value in Fig. 6.
Fig. 7

Local interpretability for the ANN model.

Figure 7 shows the SHAP analysis of local interpretability for the ANN model, indicating the impact of each process design parameter (Tox = 0.01/0.03/0.05 μm, Lch = 2/4/6 μm, Pconc = 1 × 10¹⁶/3 × 10¹⁶/5 × 10¹⁶ cm⁻³, ndrift_conc = 2 × 10¹⁵/2 × 10¹⁵/2 × 10¹⁶ cm⁻³, and nsub_conc = 1 × 10¹⁸/3 × 10¹⁸/5 × 10¹⁸ cm⁻³) on the normalized ID. The normalized ID decreases as Lch, Tox, and Pconc increase, and the SHAP values decrease in the same way (Fig. 7a–e). Furthermore, the normalized ID (ID extracted at VG = 10 V) is not influenced by nsub_conc and ndrift_conc, and the SHAP values show the same behavior. The SHAP analysis therefore successfully captures the impact of the process parameters on the normalized ID. Table 3 summarizes the global interpretability using the SHAP mean values. The SHAP mean value is the average absolute SHAP contribution of each feature across all samples, indicating the overall importance of each process parameter regardless of whether its effect is positive or negative. This analysis captures the global impact of the process designs on the full ID–VG characteristics, demonstrating that SHAP is suitable for explainable AI applications aimed at understanding how process variations influence the ID behavior of SiC MOSFETs.
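The SHAP mean value described above can be sketched as a per-feature average of absolute contributions. The SHAP numbers below are invented purely to exercise the function and do not reproduce Table 3; the helper name `mean_abs_shap` is hypothetical, and VG is left out of the feature list for brevity.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Average |SHAP| per feature across samples, sorted from most to least important."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


FEATURES = ["Tox", "Lch", "Pconc", "ndrift_conc", "nsub_conc"]

# Invented SHAP values: one row per sample, one column per feature
shap_rows = [
    [-0.30, -0.20,  0.10, 0.00,  0.01],
    [ 0.25,  0.10, -0.05, 0.01,  0.00],
    [ 0.05, -0.15, -0.15, 0.00, -0.01],
]

ranking = mean_abs_shap(shap_rows, FEATURES)
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, which is why this metric reports overall importance rather than the direction of the effect.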

Figure 8 shows the SHAP values (Y-axis) plotted against the normalized ID (X-axis). By calculating the coefficient of determination (R²), it can be observed that Tox, Lch, and Pconc exhibit strong correlations with the device behavior, whereas ndrift_conc and nsub_conc show negligible correlations.

Table 3 Summary of global interpretability using SHAP.
Fig. 8

Local interpretability for the ANN model.

As summarized in Table 4, previous studies have made significant progress in integrating machine learning (ML) with TCAD simulations or accelerating device modeling and reliability analysis through data-driven approaches. While these works successfully demonstrated how ML can enhance the efficiency and predictive capability of TCAD-based simulations, they primarily focused on improving performance metrics or computational speed.

However, none of them explicitly interpret or quantify the contribution of individual device or process parameters to the resulting electrical characteristics. In contrast, our work not only employs ML to assist TCAD modeling but also provides a detailed interpretability analysis, allowing us to identify the relative influence of each physical parameter on the output behavior.

Table 4 Overview of ML applications in TCAD-based device modeling.

Conclusion

In this work, we demonstrate, for the first time, the use of XAI techniques with an ANN approach to understand design impacts on ID characteristics. First, we show that our ANN model achieves high accuracy, with a Pearson’s correlation coefficient for Ion prediction exceeding 0.99, validating its reliability for predicting electrical properties. Second, SHAP analysis is conducted to evaluate the impact of the process designs on the full ID–VG characteristics (global interpretability) and on the normalized ID (local interpretability). For global interpretability, the SHAP analysis offers good precision, making SHAP an effective tool for understanding model predictions. For local interpretability, the analysis confirms that the SHAP values change consistently with the normalized on-state ID across the different process parameters. By employing SHAP, we provide both global and local interpretability, revealing consistent correlations between the drain current and key design parameters such as channel length (Lch), oxide thickness (Tox), and p-well concentration (Pconc). The SHAP results align with physical device principles, validating the reliability of the approach. Importantly, the integration of calibrated TCAD physics (including incomplete ionization and advanced mobility models) with SHAP analysis ensures that the interpretability results are grounded in realistic process conditions. Unlike conventional TCAD, which reveals parameter influence only through controlled variations, SHAP provides a quantitative decomposition of feature contributions across the dataset, enabling a transparent ranking of design parameters and confirming consistency with device physics. In summary, the results indicate that SHAP analysis, used as an XAI method, is promising for understanding the impact of process designs in SiC MOSFETs and can be extended to a wide range of optimization tasks for SiC MOSFETs.