Introduction

Self-injurious behavior (SIB) refers to intentional actions in which individuals cause harm to their own bodies, including suicide preparation, suicide attempts, and self-inflicted injuries. It specifically excludes mere thoughts, intentions, or inclinations toward self-harm that individuals may experience without acting upon them [1, 2]. As societal awareness of SIB has increased, it has come to be recognized as a complex public health issue of global significance that requires urgent attention [3,4,5]. SIB occurs at substantial rates worldwide: a meta-analysis of 686,672 children and adolescents reported lifetime prevalence rates of 6% for suicide attempts, 22.1% for non-suicidal self-injury, and 13.7% for deliberate self-harm [6]. In addition, SIB can be highly treatment-resistant, and the cost of care is burdensome [7]. According to the “Economic Cost of Injury — United States 2019” report [8], self-harm incidents in the United States incurred an economic cost of approximately $0.058 million per incident, whereas the cost attributed to suicide was substantially higher, averaging $9.74 million per individual. Therefore, establishing a multimodal mechanism for the early identification and prediction of SIB in adolescents is crucial to reduce the risk of future suicide and self-injury attempts, to support positive developmental outcomes, and to reduce the macroeconomic burden.

In recent years, psychopathological theory has been applied to explain psychological behavior problems such as SIB. This theory conceptualizes mental disorders as dynamic networks composed of various symptoms and factors, where individual elements interact in a non-linear manner, collectively contributing to the development and persistence of SIB. Individual aspects, such as psychological health [9] and emotional regulation capabilities [10,11,12], intertwine with familial and interpersonal factors, such as perceived family functioning [13], domestic conflicts, parent-child relationship quality, and peer influences, as well as with broader socio-cultural, institutional, and environmental impacts exemplified by events like the COVID-19 pandemic [14,15,16], all of which directly or indirectly affect the likelihood of SIB. Additionally, pivotal nodes or mediating factors, such as mental health issues, social isolation, and trauma experiences, play an especially important role within the network of suicide risk. This theory underscores the importance of multi-factorial interaction, revealing the multi-dimensional relationships underlying suicidal behavior [17].

Drawing on previous studies, we selected representative factors affecting SIB within a biopsychosocial framework [18, 19]. Physiological factors, such as gender, age, autistic traits, and Attention-Deficit/Hyperactivity Disorder (ADHD) symptoms, significantly influence an individual’s susceptibility to SIB [20,21,22]. Psychological variables, including childhood happiness, loneliness, affinity for solitude, mind wandering, depression, generalized anxiety, and internet addiction, offer insights into the internal emotional and cognitive states that may contribute to SIB [23,24,25]. Additionally, social factors, such as subjective socioeconomic status, childhood economic conditions, nuclear family status, association with deviant peers, experiences of being bullied, co-positive experiences between teachers and students, teacher management of bullying, and childhood trauma, provide an understanding of the external environment and interpersonal relationships affecting behavior [26,27,28,29]. Integrating these variables through a biopsychosocial framework allows for a more comprehensive assessment of the multi-faceted risk factors associated with SIB. This holistic approach not only aligns with the theoretical foundations discussed but also enhances the accuracy of predictions and supports the development of more effective prevention and intervention strategies for SIB [30] (Table 1).

Table 1 Indicators of SIB risk factors.

From the perspective of research methods, although research on the risk factors for SIB among adolescents has been extensive, many studies have employed deductive methods, such as mediation and moderation models [31]. Although valuable, these approaches often rely on pre-established hypotheses for verification, which may lead to confirmation bias. To address the limitations of hypothesis testing, we attempt to integrate deductive reasoning with inductive inference. That is, we form hypotheses from the existing literature about which risk factors might be associated with SIB and then arrive at results through data-driven inductive reasoning [32, 33]. This study employs a combination of machine learning and network analysis, methodologies that can not only discover patterns and associations within the data but also avoid the constraints of analysis based on preset hypotheses. Machine learning can handle large datasets and identify complex, nonlinear, and dynamic relationships, which is particularly advantageous when dealing with multivariate and high-dimensional data. Moreover, network analysis offers a novel perspective for understanding the complex interactions between SIB risk factors, revealing the direct and indirect connections between variables and their roles within the entire network [17, 34]. This approach can provide a more comprehensive and accurate map of risk factors and also help to uncover factors that were previously overlooked or deemed insignificant. However, both methods have their strengths and limitations. While machine learning is powerful, its “black-box” nature makes the results difficult to interpret. Network analysis, on the other hand, provides an intuitive visualization of the relationships between variables but may fall short in capturing complex nonlinear interactions. In response, we propose incorporating the entropy weight method (EWM) to conduct an integrated analysis [35].
The strength of EWM lies in its ability to quantify system uncertainty and effectively merge information from different analytical methods, thereby providing a more comprehensive and robust assessment of risk factors [36]. In this study, we use EWM to combine the outcomes of network analysis and machine learning. While centrality measures in network analysis (such as degree centrality, betweenness centrality, and closeness centrality) reveal the relative importance of risk factors within the network, they often fail to capture the complex nonlinear interactions between variables. By integrating the predictive capabilities of machine learning, EWM assigns dynamic weights to these centrality measures, reflecting not only structural characteristics but also potential nonlinear interactions within the data. This approach highlights the most influential risk factors and enhances the interpretability of machine learning models while preserving the strengths of network analysis. Moreover, this method can identify potential key nodes and edges that traditional approaches might overlook, offering new perspectives for research. For example, by combining factor weight analysis from machine learning models with network centrality, EWM can uncover variables that may have low centrality but high predictive power, thus providing stronger scientific support for a comprehensive understanding of SIB risk factors [19].

In summary, investigating SIB in adolescents is essential for understanding and improving their mental health. Based on the biopsychosocial model, this study used the entropy weight method to fuse the results of machine learning and network analysis. It aimed to explore the network structure of risk factors associated with SIB in adolescents, identify high-risk nodes, and prioritize their significance, thereby providing a more comprehensive and accurate theoretical and practical basis for developing targeted and effective prevention and intervention strategies for SIB in adolescents.

Methods

Participants

Questionnaires were systematically distributed among various secondary educational institutions in Hunan Province, China. We obtained the consent of the school administration, teachers, and parents before distributing the questionnaire, and informed the students that they had the right to withdraw or refuse to answer at any time. To ensure the uniformity and dependability of the gathered data, the principal investigator provided comprehensive training to classroom instructors on a uniform questionnaire administration protocol. To minimize instructional disruption, questionnaires were filled out during designated study periods. Subsequently, instructors collected and forwarded the completed surveys to the research team, adhering to a structured process.

To ensure the validity and relevance of the data, specific inclusion and exclusion criteria were applied. Participants were eligible if they were enrolled as full-time students in secondary educational institutions within Hunan Province and were aged between 11 and 17 years. Exclusion criteria included cognitive or physical impairments that would hinder a student's ability to complete the questionnaire accurately, as well as absence during the survey administration. Additionally, students who declined to provide informed consent or withdrew from the study at any point during the process were excluded from the final dataset. This ensured that the sample was representative of the target population while maintaining ethical standards in participant selection.

This rigorous approach to data collection yielded 2047 valid responses from students aged between 11 and 17 years, with a mean age of 13.69 years (SD = 1.55). The participants comprised 1088 males and 959 females, including 1259 junior high and 788 senior high students, providing a cross-section of adolescents within the specified age range. Ethical approval for this study was granted by the Ethics Committee of the National Clinical Research Center, Xiangya Second Hospital, Central South University (Approval No. (2023) Guo Lun Shen [Ke] No.036), ensuring compliance with ethical standards.

Measurement

Adolescent SIB, based on aggregated items related to self-harm, suicide attempts, and suicide preparation, is a concise and effective tool for assessing the frequency of SIB in adolescents. This tool includes items such as “Have you ever engaged in self-harm or attempted suicide?” to evaluate the risk of self-harm and suicidal behaviors among adolescents. In this study, responses were scored on a five-point scale according to frequency, with binary scoring applied during data processing. The study reports a Cronbach’s alpha of 0.85 for the simple scale composed of three items. Information on the measurement of other variables can be found in the Supplementary Materials.

Sample size estimation

According to the rule of thumb in machine learning, the sample size should be at least 10 times the number of features to ensure the stability and effectiveness of the model. In this study, the machine learning model has 19 features, so the required minimum sample size is:

$$N_{\min} = 10 \times p = 10 \times 19 = 190$$

In network analysis, a minimum sample size of 500 is generally recommended to ensure the stability and significance of network structure analysis. Considering 19 nodes in the network analysis, the following formula is used to calculate the minimum sample size:

$$N_{\min} = \frac{z_{\alpha/2}^{2}\,\left(\sigma^{2} + \mathrm{Var}_{\mathrm{network}}\right)}{(\mathrm{Effect\ Size})^{2}}$$

By substituting the significance level (α = 0.05, z_(α/2) = 1.96), the expected effect size (0.15), and the assumed network variance (1), the calculated minimum sample size is approximately 342 samples. The sample size in this study is 2047, which is sufficient to support stable machine learning and network analysis, ensuring the reliability and accuracy of the results.
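As a quick check of the two sample-size figures above, the following minimal sketch reproduces the arithmetic. The text does not state the value of σ²; the sketch assumes σ² = 1, which is the value that reproduces the reported result of approximately 342. The function names are illustrative only and are not taken from the study's code.

```python
import math


def min_n_rule_of_thumb(n_features: int, per_feature: int = 10) -> int:
    """Rule-of-thumb minimum sample size: 10 observations per feature."""
    return per_feature * n_features


def min_n_network(z: float, sigma2: float, var_network: float,
                  effect_size: float) -> int:
    """Minimum sample size from the formula in the text, rounded up."""
    return math.ceil(z ** 2 * (sigma2 + var_network) / effect_size ** 2)


n_ml = min_n_rule_of_thumb(19)
# sigma2 = 1.0 is an ASSUMPTION (not stated in the text).
n_net = min_n_network(z=1.96, sigma2=1.0, var_network=1.0, effect_size=0.15)

print(n_ml, n_net)
```

Under this assumption, both thresholds fall well below the study's sample of 2047.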

Statistical process

First, before model training, we preprocessed the data by cleaning the raw data, standardizing it, and splitting the dataset. Data cleaning included removing samples with substantial missing values, handling outliers and duplicates, and imputing the remaining missing values. Continuous variables were standardized to a mean of 0 and a standard deviation of 1, allowing for comparability across different scales in subsequent analyses. Categorical variables were encoded into numerical formats. The dataset was then split into training and testing subsets in a 7:3 ratio to enable robust model training and evaluation. We applied six machine learning algorithms to predict SIB: Random Forest (RF), HistGradientBoosting (HGB), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), LightGBM (LGBM), and K-Nearest Neighbors (KNN). Each model’s performance was assessed using several metrics: accuracy, AUC (Area Under the ROC Curve), sensitivity, positive predictive value (PPV), and Brier Score. Additional details are provided in the Supplementary Materials. This entire process was carried out using Python 3.11.7 and the “Scikit-learn” package.
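To make the evaluation metrics named above concrete, the sketch below computes each of them from a handful of made-up labels and predicted probabilities. It is a plain-Python illustration of the standard definitions, not the study's code (which used Scikit-learn); the 0.5 classification threshold and all toy values are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
acc = (tp + tn) / len(y_true)   # accuracy
sens = tp / (tp + fn)           # sensitivity (recall)
ppv_value = tp / (tp + fp)      # positive predictive value (precision)
brier = brier_score(y_true, y_prob)
auc_value = auc(y_true, y_prob)

print(acc, sens, ppv_value, brier, auc_value)
```

Accuracy, sensitivity, and PPV depend on the chosen threshold, whereas AUC and the Brier Score are computed directly from the predicted probabilities.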

Second, we conducted the network analysis in R Statistical Software [37]. The qgraph and bootnet R packages [38, 39] were used to estimate the network structure of SIB-related variables and to calculate network centrality indicators. The EBICglasso estimator (tuning = 0.5) was used to generate a sparse network model, simplifying the network structure and highlighting the most important connections between variables. Centrality indicators such as degree centrality, betweenness centrality, and closeness centrality were used to evaluate the relative importance of each variable within the network. A Network Comparison Test was conducted with the NetworkComparisonTest R package [40], examining structural invariance, global strength, and edge invariance.
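To illustrate how a centrality indicator can be read off an estimated edge-weight matrix, the sketch below computes node degree and node strength for a small toy network and then z-standardizes the strengths. It is written in Python for consistency with the machine learning step and is not the R workflow used in the study; the 4-node matrix is made up, and standardizing with the sample standard deviation is an assumption.

```python
import statistics


def node_degree(weights):
    """Number of non-zero edges attached to each node."""
    return [sum(1 for j, w in enumerate(row) if j != i and w != 0)
            for i, row in enumerate(weights)]


def node_strength(weights):
    """Sum of absolute edge weights attached to each node."""
    return [sum(abs(w) for j, w in enumerate(row) if j != i)
            for i, row in enumerate(weights)]


def z_scores(values):
    """Standardize values to mean 0 using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy symmetric edge-weight matrix for 4 hypothetical nodes (made up).
toy_weights = [
    [0.0, 0.3, 0.0, 0.1],
    [0.3, 0.0, -0.2, 0.0],
    [0.0, -0.2, 0.0, 0.4],
    [0.1, 0.0, 0.4, 0.0],
]

degree = node_degree(toy_weights)
strength = node_strength(toy_weights)
standardized = z_scores(strength)

print(degree, strength, standardized)
```

Standardized values above zero mark nodes that are more connected than average in the network, and values below zero mark more peripheral nodes.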

Third, following the machine learning and network analysis, we used EWM to integrate the importance scores from the machine learning model with the centrality indicators from the network analysis. By calculating the entropy values for each variable in these two sets of results and combining them using a weighted sum approach, a composite ranking was generated. This allows for the identification of key variables that are both strong predictors and central to the network. EWM is a commonly used method for calculating weights in multi-indicator comprehensive evaluations. The overall analysis workflow is shown in Fig. 1. The detailed steps and formulas of the EWM are provided in the Supplementary Materials.
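The study's exact EWM steps are given in the Supplementary Materials, so the sketch below follows only the common textbook formulation and should be read as an assumption about the general technique rather than a reproduction of the authors' procedure. Each indicator is min-max normalized, its entropy is computed from the normalized proportions, indicators with lower entropy receive larger weights, and the composite score is the weighted sum. The variable names and all numbers are made up.

```python
import math


def min_max(column):
    """Rescale a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def entropy_weights(columns):
    """Entropy weight of each normalized indicator column."""
    n = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        if total <= 0:  # constant column carries no information
            divergences.append(0.0)
            continue
        props = [x / total for x in col]
        # Convention: 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in props if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)
    if total_div == 0:  # fall back to equal weights
        return [1.0 / len(columns)] * len(columns)
    return [d / total_div for d in divergences]


# Toy inputs for four hypothetical variables (values are made up).
names = ["var_a", "var_b", "var_c", "var_d"]
importance = [0.30, 0.25, 0.05, 0.10]   # e.g. model feature importance
centrality = [1.6, 1.9, -1.0, 0.2]      # e.g. standardized centrality

normalized = [min_max(importance), min_max(centrality)]
weights = entropy_weights(normalized)
scores = [sum(w * col[i] for w, col in zip(weights, normalized))
          for i in range(len(names))]

ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(weights, ranking)
```

Because the weights are derived from the spread of each indicator, an indicator whose values barely differ across variables contributes little to the composite ranking.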

Fig. 1: Flow chart of the statistical analysis.

Ethics approval and consent to participate

All procedures involving human participants were conducted in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Ethical approval for this study was granted by the Ethics Committee of the National Clinical Research Center, Xiangya Second Hospital, Central South University (Approval No. (2023) Guo Lun Shen [Ke] No.036). Informed consent was obtained from all participants prior to their inclusion in the study, and the term “informed consent” was explicitly communicated and confirmed during the data collection process. For underage participants, written informed consent was obtained from their legal guardians through coordination with the respective schools and educational authorities.

Results

Demographic variables

The participants in the study ranged in age from 11 to 17 years, with a mean age of 13.69 years (SD = 1.55). The sample consisted of 1088 males (53.15%) and 959 females (46.85%). Junior high school students made up 61.50% of the sample (n = 1259), while senior high school students constituted 38.50% (n = 788). There were 728 only children (35.56%), whereas 1319 participants (64.44%) had siblings. Ethnic minorities were represented by 133 individuals (6.49%). The majority resided in urban areas (1843 participants, 90.03%), followed by those living in villages (159 participants, 7.77%) and those from urban-rural fringe areas (45 participants, 2.20%) (see Table 2).

Table 2 Distribution of demographic variables of participants.

Machine learning

Based on the selected variables, six machine learning algorithms were used to train predictive models. All models demonstrated commendable predictive capabilities for SIB (Table 3). Among them, the Random Forest algorithm emerged as the most effective, achieving an accuracy of 0.748 and an AUC of 0.814. The KNN algorithm, despite being relatively less effective, still achieved an accuracy of 0.728 and an AUC of 0.757, indicating that all six models exhibited good overall performance.

Table 3 Comparison of machine learning model performance for predicting SIB.

Given the strong performance of the Random Forest (RF) algorithm, we used it to evaluate feature importance. This method enables a more detailed investigation into the key factors that predict SIB (Table 4). Figure 2 presents the ranked order of feature importance obtained from applying the RF algorithm to the classification task of predicting SIB within our study. In the figure, feelings of loneliness (L), ADHD, and being bullied (BB) are identified as the three most important factors for predicting SIB.

Table 4 Feature importance and cumulative AUC in predicting SIB.
Fig. 2: Feature importance and cumulative AUC for random forest.

The cumulative AUC values represent the average of 10 independent repetitions of model training and performance evaluation. A 100-fold cross-validation strategy was used to ensure robust performance estimation.

Network analysis

Figure 3 shows the network structure of the non-SIB and SIB groups. Both networks consist of 19 nodes, and each has 81 non-zero edges out of a possible 171, giving a sparsity of approximately 0.53. This indicates that approximately 53% of potential connections between variables are absent in both groups, reflecting a moderate level of network sparsity. While the overall network structures are similar, the specific connections and the strength of associations between variables may differ, warranting further analysis.
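The edge counts reported above can be checked with a short calculation. For an undirected network with 19 nodes, the number of possible edges and the share of absent edges are:

$$E_{\max} = \frac{19 \times 18}{2} = 171, \qquad \text{sparsity} = 1 - \frac{81}{171} = \frac{90}{171} \approx 0.526$$

This agrees with the reported figure of roughly 53% absent connections.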

Fig. 3: Comparative analysis of network edge structure with or without SIB behavior.

Network architecture of (a) SIB groups and (b) non-SIB groups.

By examining the edge weight matrix, we identified notable differences in variable interactions between the non-SIB and SIB groups. In the SIB group, strong positive correlations were observed between autistic traits and ADHD symptoms (0.272 and 0.177, respectively), as well as between mind wandering and ADHD symptoms (0.243), and depression and anxiety (0.439). The SIB group also showed significant associations between internet addiction and anxiety (0.301), and loneliness and internet addiction (0.171), indicating their importance in the SIB risk network. Positive teacher-student interactions were also relevant, with a weight of 0.195 between these interactions and ADHD symptoms, suggesting their potential role in mitigating SIB. Negative connections included the inverse relationship between gender and deviant peer association (−0.148), and gender and bullying (−0.128).

Comparative analysis of network centrality (Fig. 4) revealed that ADHD symptoms exhibited high centrality in both groups, particularly in the SIB group (1.852), indicating its strong predictive value. Loneliness (1.585), autistic traits (0.877), and depression (0.971) also had high centrality in the SIB group, highlighting their critical roles in SIB. Mind wandering had a centrality of 0.749, suggesting its influence. In contrast, variables like Nuclear Family Status (−1.155), Objective Economic Status (−0.970), and Childhood Happiness (−0.997) showed low centrality, indicating a more peripheral role in the SIB network. Positive teacher-student experiences had a centrality of 0.286, hinting at its emerging importance in the SIB context.

Fig. 4: Comparison of risk network centrality with or without SIB.

To further validate the robustness of our network analysis results, we conducted a bootstrap analysis with 1000 resamples, assessing the stability of edge weights and centrality values for each node. By repeatedly sampling from the original dataset and recalculating the edge weights and centrality metrics, the bootstrap method allowed us to evaluate the stability and reliability of the observed relationships and centralities across the network (Figures S1 and S2).

Entropy analysis

In this study, we used EWM to calculate the Comprehensive Scores for various features to assess their relative importance and centrality within the feature network for predicting SIB (Fig. 5). The results showed that Loneliness (L) had the highest Comprehensive Score of 0.856, underscoring its critical importance in SIB prediction. ADHD Symptoms and Internet Addiction (IA) followed with scores of 0.825 and 0.692, respectively, indicating a key role in predicting SIB. Additionally, Depression (DEP), Anxiety (ANX), Affinity for Solitude (AS), Autistic Traits (AT), and Being Bullied (BB) had scores of 0.523, 0.510, 0.471, 0.464, and 0.438, respectively, which also showed their influence in SIB prediction. Variables with Comprehensive Scores lower than 0.4 were considered relatively less important, meaning they had a smaller influence in predicting SIB.

Fig. 5: Comprehensive Entropy Score with Contributions of Centrality and Importance.

Discussion

The current study investigates the complex and multifaceted nature of SIB among adolescents, emphasizing the interconnectedness of psychological, physiological, and social factors. Understanding these risk factors is essential for developing more effective prevention and intervention strategies. This section explores the key findings regarding the central psychological, physiological, and social elements contributing to adolescent SIB, and reflects on the research methods used to capture the intricate dynamics of these risk factors. Through an integrated approach combining machine learning, network analysis, and entropy weighting, we offer a comprehensive view of the factors influencing SIB and discuss their implications for both theory and practice.

Psychological factors and SIB

Loneliness, depression, and anxiety are central psychological factors driving adolescent SIB. Across all analyses, loneliness exhibited exceptionally high centrality, indicating its dominant role within the SIB risk network. Loneliness is not merely a perceived sense of social isolation but is also linked to significant neurological changes in brain regions responsible for social processing, such as the superior temporal sulcus and anterior cingulate cortex [41]. These neurological changes may impair emotional regulation, leading to social cognitive biases and gradually accumulating as chronic psychological stress [42]. Over time, this chronic stress and the buildup of negative emotions can exacerbate internal distress, ultimately making SIB a means of escaping or alleviating emotional turmoil.

Affinity for solitude may lead to SIB because individuals are more likely to confront negative emotions when alone and lack external support to effectively regulate these emotions [43]. Although solitude does not equate to loneliness, prolonged solitude may cause individuals to become immersed in self-criticism and negative feelings, which can lead to self-harm as a way of coping with emotional distress [44]. Depression and anxiety symptoms are typically accompanied by persistent negative emotions [45], such as profound sadness and intense anxiety, which may drive individuals to engage in SIB as an attempt to regulate or distract from these overwhelming feelings [46]. Additionally, depression and anxiety can lead to cognitive distortions, such as negative self-evaluation, extreme pessimism about the future, and underestimating one’s problem-solving abilities, further amplifying psychological distress and increasing the risk of SIB [47]. Therefore, these psychological factors do not operate in isolation but interact in complex ways to collectively drive the occurrence of SIB.

Physiological factors and SIB

In the analysis of physiological factors, symptoms of ADHD and autistic traits are key to understanding adolescent SIB. ADHD, as a neurodevelopmental disorder, profoundly impacts self-control, particularly in regulating impulses and maintaining attention, often attributed to frontal lobe dysfunction [48, 49]. This impulsivity may lead adolescents to resort to SIB as a quick means to relieve internal emotional fluctuations and conflicts [50]. Although ADHD symptoms in this study are treated as behavioral traits rather than clinical diagnoses, their significant influence on SIB aligns with extensive literature identifying ADHD as a risk factor for SIB [51]. On the other hand, autistic traits primarily affect adolescents’ social and communication skills, making them more prone to misunderstanding and rejection, which can result in severe psychological distress [52]. Particularly in the context of intense loneliness, these social impairments render autistic adolescents more vulnerable, potentially leading them to use SIB as an extreme coping mechanism to deal with complex emotions [53]. More concerning is that adolescents may encounter online content that encourages self-injury, such as the “Blue Whale Game,” which normalizes and even promotes SIB behavior, making it easier for adolescents to perceive self-harm as a way to resolve their distress [54]. These physiological factors reveal how neurodevelopmental characteristics can influence psychological and behavioral processes, ultimately contributing to the occurrence of SIB.

Social factors and SIB

Social factors also play a crucial role within the SIB risk network; in particular, experiences of bullying significantly increase the risk of SIB in adolescents. Among adolescents who experience various forms of bullying victimization (physical, verbal, and relational), the risk of engaging in SIB is significantly elevated. This increased vulnerability is often compounded by additional challenges such as depression and a lack of social support, which further exacerbate the impact of these negative experiences and contribute to the development of unhealthy coping mechanisms like self-harm [55]. This social factor illustrates how environmental and experiential elements can alter an individual’s psychological state, ultimately leading to SIB.

Reflection on research methods

Many studies exploring SIB risk factors in the existing literature rely on traditional statistical methods, such as regression analysis or mediation/moderation models. While these methods effectively reveal direct relationships between variables, they are significantly limited in capturing the complex nonlinear and dynamic interactions between variables [56]. Moreover, traditional approaches often assume the independence of risk factors, overlooking the intricate network structure of interactions across social, psychological, and physiological dimensions [57]. Although some studies have attempted to apply single machine learning methods [58] or text analysis methods [59] to detect suicide risk, the lack of interpretability and systematic integration often hampers the ability to provide actionable intervention recommendations. Additionally, existing research has rarely considered how to effectively combine results from various analytical methods to enhance the robustness and accuracy of conclusions [60]. This study addresses these gaps by integrating multiple data analysis tools to reveal the complex risk structure of adolescent SIB. First, machine learning techniques are employed to identify and predict key risk factors for SIB, offering the advantage of handling multidimensional and high-complexity datasets and capturing nonlinear and dynamic relationships [61]. However, the “black-box” nature of machine learning [62] poses challenges for model interpretability, which is why this study also introduces network analysis to construct a relational network among variables, visually displaying the interactions and centrality of various factors within the risk network. By incorporating the EWM, the study integrates results from both machine learning and network analysis, assigning dynamic weights to variables to derive a more comprehensive and reliable ranking of key risk factors [63, 64].

Entropy weighting has been widely used across disciplines like economics and environmental studies to integrate weights from multiple criteria, making it suitable for complex datasets. It assigns importance to variables based on their distribution, ensuring an objective combination of results from machine learning and network analysis. This method provides a more reliable ranking of risk factors by considering each factor’s contribution proportionately.

Future studies should consider the quality and consistency of data, as inaccurate data can lead to biased results. Researchers should also be cautious of overfitting in machine learning models, using cross-validation techniques to improve generalizability. Lastly, while this approach improves accuracy, enhancing model interpretability will be crucial for deriving practical, actionable insights.

Implications

Psychological implications

At the psychological level, loneliness, and symptoms of depression and anxiety have been identified as core risk factors for adolescent SIB. Loneliness is one of the most significant predictors [65, 66], reflecting not only emotional isolation but also the accompanying psychological stress and emotional distress, which significantly increase the risk of SIB [67, 68]. Schools and communities can mitigate the negative impact of loneliness on adolescents’ mental health by organizing social activities and establishing supportive groups. For example, the “Circle of Friends” program has demonstrated significant improvements in participants’ social integration and mental health through peer support and group interaction [69]. Additionally, depression and anxiety symptoms are closely linked to SIB, often manifesting as persistent negative emotions and cognitive distortions, such as pessimism about the future and negative self-evaluation [23]. A combination of cognitive-behavioral therapy (CBT) and pharmacotherapy has been widely proven to effectively alleviate these symptoms, thereby reducing the risk of SIB [70].

Physiological implications

On the physiological front, symptoms of ADHD and autistic traits have a pronounced impact on SIB. ADHD symptoms, characterized by impulsivity and inattention, are often associated with frontal lobe dysfunction, making adolescents more prone to impulsive, often self-destructive behaviors to cope with emotional fluctuations [48]. Interventions combining behavioral therapy, pharmacotherapy, and CBT have proven effective in reducing ADHD symptoms, thereby lowering the incidence of SIB [71]. For instance, the “Family Training Program” has shown significant efficacy in reducing behavioral problems in children with ADHD, improving both impulse control and emotional management [72]. Autistic traits primarily affect social and communication skills, leading to greater social exclusion and misunderstanding, which exacerbates psychological distress [52]. Particularly in cases of intense loneliness, these social impairments make autistic adolescents more likely to adopt SIB as a coping mechanism [73]. Interventions for this issue should focus on social skills training and personalized support plans, such as Applied Behavior Analysis (ABA) and social stories [74], which have been effective in improving social interaction and reducing social exclusion in autistic adolescents [75].

Social implications

Social factors also play a critical role in SIB risk, with bullying and childhood abuse being particularly influential. Bullying, especially during childhood, has profound effects on adolescent mental health. Studies indicate that adolescents who experience early bullying face a higher risk of SIB [76]. The sustained psychological stress and social rejection caused by bullying severely damage victims’ self-esteem and self-efficacy [77] and increase their negative emotions and depressive symptoms, which often drive them to seek emotional relief through SIB. Schools should actively create supportive environments and implement comprehensive anti-bullying programs, such as Finland’s “KiVa” anti-bullying program, which has significantly reduced bullying incidents and improved the overall school climate through counseling services and peer support [78]. For adolescents who have experienced childhood abuse, trauma-informed care, such as Trauma-Focused Cognitive Behavioral Therapy (TF-CBT), is a crucial intervention that helps them process and overcome trauma, thereby reducing the incidence of SIB [79].

Limitations and future directions

Although this study uncovered the complex risk structure of adolescent SIB by integrating multiple data analysis tools, it has several limitations. First, the reliance on cross-sectional data means that we can only observe associations between variables at a single point in time, without establishing causality or tracking their dynamic changes over time. Future research should consider longitudinal designs to explore the causal relationships between risk factors and SIB more deeply. Second, much of the data in this study is based on self-report, which may introduce social desirability bias or recall bias, potentially affecting the accuracy of the results. Future studies could incorporate physiological measurements or third-party assessments to provide more objective data. Third, while we employed advanced analytical tools such as machine learning and network analysis, the complexity of these methods may limit their interpretability and practical application in clinical settings. Although the development of Explainable AI (XAI) can partially address this issue, further efforts are needed to make these methods more practical and user-friendly in clinical practice [80].

Future research should address these limitations and enhance the understanding of SIB through longitudinal studies, which can clarify causal links and developmental paths related to risk factors. To enhance accuracy and reliability, future research should also integrate multiple data sources, including physiological data, behavioral observations, and third-party reports, to offer a more comprehensive perspective on SIB. Methodologically, further exploration of Explainable AI (XAI) could help clarify the outputs of complex models, making the research findings more accessible and applicable to clinicians and policymakers. Ultimately, these efforts will provide a solid scientific foundation for developing more precise and personalized intervention strategies, helping to prevent and mitigate adolescent SIB more effectively and to improve adolescents’ mental health and social adaptability.

Conclusion

Through the combined application of machine learning, network analysis, and the Entropy Weight Method, this study has revealed the complex risk structure of adolescent SIB, highlighting the multidimensional interactions of psychological, physiological, and social factors. The findings indicate that loneliness, ADHD symptoms, Internet addiction, anxiety, depression, affinity for solitude, autistic traits, and being bullied play crucial roles in the occurrence of SIB. Although this study provides important theoretical and empirical support for understanding the complex etiology of SIB, challenges remain in establishing causality, addressing the limitations of self-report data, and improving the interpretability of models. Future research should deepen the understanding of SIB through longitudinal designs, multi-source data integration, and the application of Explainable AI techniques, providing a more robust scientific basis for developing personalized intervention strategies. Through these efforts, we hope to offer stronger support for the mental health and social adaptation of adolescents, thereby effectively reducing the incidence of SIB.