Introduction

Non-suicidal self-injury (NSSI) involves deliberate and repeated harm to one’s body without suicidal intent and includes behaviors such as cutting, needle pricking, scratching, hitting, and burning1. Typically emerging during adolescence, NSSI is closely associated with suicidal ideation and may serve as a precursor to suicide2. NSSI also frequently co-occurs with violent behaviors and has been linked to violent crime3. Its prevalence among adolescents and young adults is estimated to range from 18% to 38%4,5,6. NSSI significantly affects the physical appearance and psychological well-being of adolescents and young adults, so early and accurate identification of those likely to engage in it is essential for prevention. For individuals at risk or those who have already engaged in NSSI, timely pharmacological and psychological interventions can prevent these behaviors from continuing into adulthood7. Currently, antidepressants may be prescribed to regulate mood, while psychological interventions often involve intensive community care, cognitive behavioral therapy (CBT), and dialectical behavioral therapy (DBT), the last of which has demonstrated significant efficacy in addressing self-harm behaviors8. Detecting the probability of NSSI in hospitalized adolescents and young adults with mental disorders (AYAMD) and implementing tiered prevention strategies based on that probability therefore have significant clinical and societal implications.

Accurately detecting the probability of NSSI in hospitalized AYAMD is a challenge for healthcare professionals. Adolescence is characterized by profound biological, psychological, and social changes and represents a critical period for behavioral, cognitive, emotional, and physical development. This phase of life is marked by heightened vulnerability9. Despite extensive research, the mechanisms underlying NSSI remain unclear. Studies suggest that NSSI is associated with various factors, including psychosocial stressors, maladaptive emotional regulation, abnormal brain function, altered neurochemical metabolism, hypothalamic-pituitary-adrenal axis dysfunction, and altered pain perception10. Research has also revealed correlations between NSSI and various serum hormones, including thyroid hormones and cortisol11,12. The most commonly reported factors for NSSI include female gender, a history of NSSI, and psychiatric diagnoses, alongside adverse childhood experiences, bullying, behavioral disorders, and somatic symptoms13,14. Poor mental health is closely associated with NSSI, irrespective of age or gender15. Incorporating these predictive factors into clinical models could significantly improve detection of the probability of NSSI in AYAMD.

Existing clinical tools for assessing the probability of NSSI have notable limitations. A systematic review published in 2021 highlighted insufficient evidence for the reliability and validity of current tools in clinical settings, underscoring the need for higher-quality assessment tools16. While tools such as the Ottawa Self-Injury Inventory17 and the Self-Injurious Thoughts and Behaviors Interview18 are widely used because of their comprehensive assessment capabilities19, their reliability may be compromised. For instance, patients can become disengaged during lengthy assessments, and parents may be reluctant to report their children’s past NSSI behaviors, which affects the reliability of the assessment results20. Moreover, single-item assessments and procedural evaluations may overlook individuals exhibiting atypical behaviors or those unwilling to self-identify as engaging in NSSI. Clinical interviews are also subject to recall and expectation biases arising from patients’ cognitive limitations or fear of punitive consequences21. Currently, few predictive models for NSSI in AYAMD exist, and no models specifically designed for use by healthcare professionals have been published. Therefore, there is a pressing need for a reliable detection model to estimate the probability of NSSI in hospitalized adolescents and support the development of effective intervention strategies.

Against this backdrop, the primary objective of this study is to analyze potential correlates of NSSI. Additionally, this study aims to develop a clinical detection model for assessing probability of NSSI in hospitalized AYAMD, providing a valuable tool for healthcare professionals.

Methods

Study design and participants

This retrospective study was conducted at a 4,500-bed university-affiliated medical centre in China. The methodology adhered to the updated Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement22. Eligible participants met the following inclusion criteria: (1) diagnosis of a mental disorder according to the International Classification of Diseases, 10th Edition, including depressive and anxiety disorders, affective disorders, schizophrenia, and childhood emotional disorders; (2) age between 15 and 29 years; and (3) hospitalization for treatment. Exclusion criteria were: (1) neurological diseases; (2) prolactinomas, pituitary tumours, or other endocrine system diseases; (3) organic mental disorders; and (4) incomplete clinical data with missing values for two or more intended variables. The sample size was determined using the events-per-variable (EPV) principle23, which is commonly used in developing clinical prediction models and suggests that the ratio of patients experiencing the outcome event to the number of candidate predictor parameters should be at least five. Based on the formula N = EPV × X/P23, where N is the sample size, X is the number of predictor variables (24 in this study), and P is the prevalence rate of NSSI, we estimated a minimum sample size of 480 cases. This calculation assumed a 50% prevalence rate of NSSI, based on prior studies showing a higher prevalence in clinical settings than in community samples24. The EPV was set at 20 to ensure robust model development and in view of the large sample size required for machine learning.
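The events-per-variable formula is simple enough to check by hand, and the following minimal sketch does so in Python. The function name is an assumption made for illustration; only X = 24 and P = 0.5 are taken from the text. With those inputs the formula returns 480 cases at an EPV of 10 and 960 cases at an EPV of 20.

```python
import math


def minimum_sample_size(epv: float, n_predictors: int, prevalence: float) -> int:
    """Events-per-variable rule N = EPV * X / P, rounded up to a whole case."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must lie in (0, 1]")
    return math.ceil(epv * n_predictors / prevalence)


# X = 24 candidate predictors and P = 0.5 are the values stated in the text;
# the EPV values below are shown side by side for comparison.
for epv in (5, 10, 20):
    print(f"EPV={epv}: N >= {minimum_sample_size(epv, 24, 0.5)}")
```

The required sample size scales linearly with the chosen EPV, so the choice of EPV should be stated alongside the resulting N.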

NSSI assessment

NSSI was assessed following the criteria in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition1. Patients or their primary caregivers were interviewed about self-harming behaviours that may have occurred within the past 12 months, including (but not limited to) cutting, needle pricking, scratching, hitting, and burning. Any occurrence of these behaviours during the specified period was classified as NSSI.

Brøset violence checklist

Given the potential correlation between aggression and NSSI, aggressive behavior was assessed utilizing the Brøset Violence Checklist (BVC), which assesses the risk of violence over a 24-hour period25. The BVC comprises six items—confusion, irritability, noise-making, physical threat, verbal threat, and object aggression—scored as 0 (none) or 1 (present). Total scores range from 0 to 6, with higher scores indicating greater risk. The BVC captures aggression toward oneself, others, and objects, providing a comprehensive tool for evaluating the risk of violence throughout the inpatient treatment process26.

Data access and collection

Data were extracted from the hospital information system. Collected variables included demographic and clinical characteristics such as age, gender, body mass index, marital status, education level, employment status, disease duration, hormone levels, hospitalization history, family history of mental illness, prior NSSI, suicide attempts, and psychotropic medication utilization. For patients with multiple admissions, data from the most recent admission were analyzed. Laboratory data primarily comprised prolactin, adrenocorticotropic hormone, plasma total cortisol, and thyroxine levels, measured from fasting blood samples taken between 7:00 and 8:00 AM on the morning following admission. Predictors included in the model, such as the number of previous suicide attempts, and other variables, were classified as categorical variables or continuous variables depending on their data types and clinical significance. In this study, the BVC and other assessments were administered by registered psychiatric professionals.

Quality control

The following measures were used to ensure the quality of the research. First, the principal investigator conducted uniform training for all research personnel to ensure they understood the purpose of the study and their respective tasks. Second, pre-designed, standardized data extraction forms covering all required data items were used. Third, data collection was carried out independently by two designated researchers to ensure consistency, accuracy, and completeness. Discrepancies were resolved through consensus with the research leader.

Data analysis

Data analysis was performed using SPSS (Version 26.0) and R (Version 4.1.3). For cases with missing data in ≤ 2 variables, Multivariate Imputation by Chained Equations (MICE) was applied using the “mice” package in R, developed by Stef van Buuren. Multiple imputation was used to improve the completeness and reliability of the dataset for subsequent analysis. The “mice” package operates under the assumption of Missing at Random, in which the likelihood of missingness is unrelated to the unobserved data but may depend on the observed data. MICE iteratively imputes missing values by modeling each incomplete variable as a function of the other variables, including the values imputed in prior iterations, until the imputed values stabilize. Two-tailed p-values < 0.05 were considered statistically significant. Continuous data were described as mean ± standard deviation, and discrete data as percentages. Comparisons between the training and validation datasets were conducted using chi-square tests.

For predictor selection, this study combined the machine learning Boruta algorithm with logistic Least Absolute Shrinkage and Selection Operator (LASSO) regression. The Boruta algorithm, implemented using the “Boruta” package in R, is a feature selection method based on the random forest classifier. It injects randomness into the dataset by creating shadow features, which are shuffled copies of the original features. The algorithm iteratively compares the importance of each original feature with that of the shadow features to determine whether the original feature is genuinely informative. Features deemed less important than the shadow features are pruned, yielding a minimal, optimal subset of features with greater stability than one selected by a single random forest model27. First, the Boruta algorithm was used to confirm important predictive factors28. Subsequently, LASSO regression was applied to select clinically predictive factors from the risk characteristics of AYAMD patients. LASSO regression, conducted using the “glmnet” package in R, is a regularization technique that automatically eliminates non-influential variables from a potentially large and multicollinear set of predictors. The method applies an L1-norm penalty, weighted by the tuning parameter lambda (λ), that shrinks some coefficients to exactly zero, effectively removing irrelevant predictors; this helps prevent overfitting and produces a more interpretable set of predictors29. The optimal λ was selected using 5-fold cross-validation30.
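The two core ideas in this paragraph can be illustrated without any statistical library. The sketch below is not the study's R code: it shows, under simplifying assumptions, how shadow features are built by shuffling each column independently, and how the soft-thresholding operator used in coordinate-descent LASSO solvers sets small coefficients to exactly zero. Function names and the toy data are hypothetical.

```python
import random


def add_shadow_features(rows, seed=0):
    """Append a shuffled ("shadow") copy of every column to the data.

    Shuffling a column keeps its marginal distribution but breaks any
    association with the outcome, giving a baseline for feature importance.
    """
    rng = random.Random(seed)  # the seed is an arbitrary placeholder
    columns = [list(col) for col in zip(*rows)]
    shadows = []
    for col in columns:
        shadow = col[:]
        rng.shuffle(shadow)
        shadows.append(shadow)
    return [list(row) for row in zip(*(columns + shadows))]


def soft_threshold(z, lam):
    """L1 shrinkage: move z toward zero by lam and clip at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Toy data: two original features per row (values are made up).
toy = [[15, 0], [22, 1], [29, 1]]
print(add_shadow_features(toy))
print([soft_threshold(b, 0.5) for b in (1.5, 0.3, -1.5)])
```

A real Boruta run would then fit a random forest on the widened table and retain only the original features whose importance consistently exceeds the best shadow feature.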

To integrate these outcomes into a single metric, we took the intersection of the features identified by both methods. For each intersecting feature, we selected the most significant variables based on their rankings in both methods. Finally, an integrated feature importance metric was created from the feature importance ranking of the Boruta algorithm and the magnitude of the LASSO coefficients. Based on this metric, a nomogram prediction model was constructed. Developing a nomogram involves fitting a logistic regression model and then converting the results into a visual scoring chart. The nomogram and calibration curve were created using the “rms” package in R. Decision Curve Analysis (DCA) was performed with the “dca.R” package, and the receiver operating characteristic (ROC) curve was plotted using the “pROC” package. The “reportROC” package was used to calculate the optimal cutoff for binary classification of NSSI probability.
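The conversion from a logistic regression to nomogram points is a linear rescaling, which the following sketch illustrates. All coefficients, ranges, and names here are hypothetical placeholders and are not the fitted values from this study. The convention assumed is the common one in which the predictor with the widest contribution range is assigned 0 to 100 points.

```python
import math

# HYPOTHETICAL intercept and coefficients, for illustration only.
INTERCEPT = -0.5
PREDICTORS = {
    # name: (coefficient, minimum value, maximum value)
    "age": (-0.10, 15, 29),
    "female": (0.80, 0, 1),
}


def contribution_range(coef, lo, hi):
    """Smallest and largest value of coef * x over the predictor's range."""
    a, b = coef * lo, coef * hi
    return min(a, b), max(a, b)


# The widest contribution range defines the 100-point scale.
SCALE = max(hi - lo for lo, hi in
            (contribution_range(*spec) for spec in PREDICTORS.values()))


def points(name, value):
    """Points for one predictor value: its contribution rescaled to 0-100."""
    coef, lo, hi = PREDICTORS[name]
    low, _ = contribution_range(coef, lo, hi)
    return 100 * (coef * value - low) / SCALE


def probability_from_points(total_points):
    """Map total points back to the linear predictor, then to a probability."""
    base = INTERCEPT + sum(contribution_range(*spec)[0]
                           for spec in PREDICTORS.values())
    linear_predictor = base + total_points * SCALE / 100
    return 1 / (1 + math.exp(-linear_predictor))


def probability_direct(values):
    """Ordinary logistic regression prediction, for comparison."""
    lp = INTERCEPT + sum(PREDICTORS[k][0] * v for k, v in values.items())
    return 1 / (1 + math.exp(-lp))


patient = {"age": 20, "female": 1}
total = sum(points(k, v) for k, v in patient.items())
print(round(total, 1), round(probability_from_points(total), 3))
```

Because the rescaling is linear, reading the probability from total points gives the same result as evaluating the regression equation directly.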

Model performance evaluation included discrimination, calibration, and DCA. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC). Typically, AUC values of < 0.6, 0.6 to 0.8, and > 0.8 indicate poor, moderate, and good discrimination, respectively31. The ROC curve and AUC are commonly used to assess a model’s ability to differentiate between patients who experience an event and those who do not. While these metrics measure discrimination, they do not evaluate whether the model provides clinical benefit. Calibration, which assesses the agreement between predicted probabilities and observed outcomes, complements discrimination by measuring the accuracy of absolute probability predictions. DCA was applied to evaluate the clinical utility of the model. Compared with traditional discrimination and calibration metrics, DCA has the advantage of guiding decisions about whether to use a prediction model in clinical practice32. DCA thresholds were predefined as 25–50% to align with clinical resource constraints.
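As an illustration of what the AUC measures, the sketch below computes it directly from its definition: the proportion of case/non-case pairs in which the case receives the higher predicted probability, with ties counted as one half. It also encodes the interpretation bands quoted above. The function names and toy data are assumptions, and the treatment of a value of exactly 0.8 as moderate is a choice made here, not one stated in the text.

```python
def auc(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def discrimination_band(value):
    """Bands from the text: < 0.6 poor, 0.6 to 0.8 moderate, > 0.8 good."""
    if value < 0.6:
        return "poor"
    if value <= 0.8:
        return "moderate"
    return "good"


# Toy predicted probabilities and outcomes (made up for illustration).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
value = auc(toy_scores, toy_labels)
print(value, discrimination_band(value))
```

The pairwise definition runs in quadratic time, which is adequate for a few hundred patients; packages such as “pROC” use faster rank-based formulas.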

Ethical considerations

The study was approved by the Institutional Review Board of West China Hospital of Sichuan University (approval number: 2023-517). Data confidentiality was maintained throughout, and the study adhered to the principles of the Declaration of Helsinki. The requirement for informed consent was waived by the Medical Ethics Committee as a result of the retrospective nature of the research.

Results

Participant characteristics

From October 2022 to March 2024, 658 patients who met the inclusion and exclusion criteria were enrolled. These patients were randomly assigned to either the training set (N = 461) or the validation set (N = 197) in a 7:3 ratio. Patients were further categorized into two groups according to whether they had a history of NSSI. The overall NSSI prevalence was 50.9%, with rates of 51.8% in the training set and 48.7% in the validation set. The study flow and general patient characteristics are presented in Fig. 1 and Table 1, respectively.
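A single random 7:3 partition of this kind can be sketched as follows. The seed and function name are arbitrary placeholders, and rounding the training size to the nearest integer is an assumption that happens to reproduce the reported group sizes.

```python
import random


def split_indices(n, train_fraction=0.7, seed=2024):
    """Shuffle patient indices once and cut them into training and validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # placeholder seed for reproducibility
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = split_indices(658)
print(len(train_idx), len(valid_idx))  # 461 197
```

Fixing the seed makes the partition reproducible, but performance estimates still depend on which particular split is drawn.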

Fig. 1 Flow diagram of participant selection.

Table 1 Baseline characteristics of the training set and validation set.

Identification of predictive factors

The Boruta algorithm was employed to identify predictive variables, resulting in the selection of 10 significant predictors ranked by feature importance: history of suicide attempts, age, gender, psychiatric diagnosis, employment status, cortisol levels, education level, history of psychotropic medication use, history of antipsychotic medication use, and marital status (Fig. 2). Subsequently, LASSO regression analysis was performed to filter potential predictors, identifying four variables with non-zero coefficients: psychiatric diagnosis, history of suicide attempts, gender, and age (Fig. 3).
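Taking the intersection of the two selections amounts to a set operation. The sketch below applies it to the variable lists reported above, preserving the Boruta importance order; the variable names are written out as plain strings purely for illustration.

```python
# Predictors confirmed by Boruta, in the importance order reported in the text.
boruta_selected = [
    "history of suicide attempts",
    "age",
    "gender",
    "psychiatric diagnosis",
    "employment status",
    "cortisol levels",
    "education level",
    "history of psychotropic medication use",
    "history of antipsychotic medication use",
    "marital status",
]

# Predictors with non-zero LASSO coefficients, as reported in the text.
lasso_selected = {
    "psychiatric diagnosis",
    "history of suicide attempts",
    "gender",
    "age",
}

# Keep only the variables chosen by both methods, in Boruta rank order.
final_predictors = [name for name in boruta_selected if name in lasso_selected]
print(final_predictors)
```

Here every LASSO-selected variable also appears in the Boruta list, so the intersection is simply the four LASSO variables.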

Fig. 2 Importance of shadow variables and predictors selected by the Boruta algorithm.

Fig. 3 The variable filtering process of the LASSO regression. (A) LASSO coefficient profiles of the features. (B) Optimal parameter (lambda) selection in the LASSO model used fivefold cross validation via minimum criteria.

Development of the nomogram

Based on the results of the Boruta and LASSO algorithms, four predictive factors were selected for the final model: age, history of suicide attempts, gender, and psychiatric diagnosis. The logistic regression analysis for these predictors is detailed in Table 2. These variables were used to build a nomogram for detecting the probability of NSSI in AYAMD (Fig. 4). The optimal cutoff value of 0.404 was calculated using Youden’s index and converts the continuous NSSI probability into binary high/low probability labels for clinical use.
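Youden's index selects the cutoff that maximizes sensitivity + specificity − 1. The following sketch shows the idea on made-up data and then applies the reported cutoff of 0.404 to label a predicted probability. Function names and toy values are hypothetical, and treating a probability equal to the cutoff as "high" is an assumption, since the text does not specify how ties are handled.

```python
def youden_cutoff(probabilities, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(probabilities)):
        tp = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 1)
        fn = sum(1 for p, y in zip(probabilities, labels) if p < cutoff and y == 1)
        tn = sum(1 for p, y in zip(probabilities, labels) if p < cutoff and y == 0)
        fp = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 0)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def probability_label(probability, cutoff=0.404):
    """Binary label using the cutoff reported in the text (ties count as high)."""
    return "high" if probability >= cutoff else "low"


# Toy data, made up for illustration; both outcome classes must be present.
toy_probs = [0.1, 0.2, 0.3, 0.45, 0.5, 0.8]
toy_labels = [0, 0, 0, 1, 1, 1]
print(youden_cutoff(toy_probs, toy_labels))
print(probability_label(0.62), probability_label(0.31))
```

Because Youden's index weights sensitivity and specificity equally, a different cutoff may be preferable when missed cases and unnecessary interventions carry unequal costs.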

Table 2 Logistic regression analysis for predictors (N = 658). OR: odds ratio; CI: confidence interval; SE: standard errors.
Fig. 4 The nomogram of predictors. ICD-10: International classification of diseases, 10th revision. F20–F29: Schizophrenia, schizotypal and delusional disorders; F30–F39: Mood [affective] disorders; F40–F48: Neurotic, stress-related and somatoform disorders; F50–F59: Behavioral syndromes associated with physiological disturbances and physical factors; F90–F98: Behavioral and emotional disorders with onset usually occurring in childhood and adolescence. To interpret the nomogram, first locate the value of each predictor on the corresponding scale and draw a vertical line upward to the “Points” bar to obtain the assigned points for that predictor. Next, sum the points for all predictors. Finally, draw a vertical line downward from the “Total Points” line to the “Probability of NSSI” scale to determine the detected probability of NSSI.

Validation of the nomogram

The AUC values for the training and validation sets were 0.803 (95% CI: 0.763–0.843) and 0.745 (95% CI: 0.676–0.814), respectively, indicating good discrimination in the training set and moderate discrimination in the validation set (Fig. 5). The Hosmer-Lemeshow goodness-of-fit test showed no statistically significant difference between predicted and observed NSSI rates, providing no evidence of poor fit (χ2 = 14.626, P = 0.067 for the training set; χ2 = 14.984, P = 0.061 for the validation set). Calibration curves further demonstrated close agreement between the predicted and actual NSSI probabilities (Fig. 6). DCA curves for the training and validation sets indicated that, within the pre-specified threshold range of 25–50%, the model yielded a higher net benefit than both the “All” and “None” strategies (Fig. 7).
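For readers unfamiliar with the Hosmer-Lemeshow test, the sketch below shows how the statistic is typically computed: patients are sorted by predicted probability, split into groups, and observed and expected event counts are compared within each group. The grouping into ten equal-sized bins and the use of g − 2 = 8 degrees of freedom are assumptions (the text does not state them), and the function names are hypothetical. The p-value helper uses the closed-form chi-square tail that exists for even degrees of freedom.

```python
import math


def hosmer_lemeshow(probabilities, labels, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized risk groups."""
    pairs = sorted(zip(probabilities, labels), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1 - expected / len(chunk))
        if variance > 0:  # skip degenerate groups with all-0 or all-1 predictions
            statistic += (observed - expected) ** 2 / variance
    return statistic


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


# With an assumed df of 8, the training-set statistic gives P of about 0.067.
print(round(chi2_sf_even_df(14.626, 8), 3))
```

A non-significant result indicates no detected miscalibration rather than proof of good calibration, which is why the calibration curves are reported alongside the test.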

Fig. 5 Receiver operating characteristics curves of prediction model in the training and validation sets.

Fig. 6 Calibration curves of prediction model in the training (A) and validation (B) sets.

Fig. 7 Decision curve analysis of the detection model in the training (A) and validation (B) sets. The x-axis represents the threshold probability, and the y-axis represents the net benefit of detecting NSSI. The gray curve represents the net benefit assuming all patients are treated, while the black line assumes none are treated. The red and blue lines show the net benefit of using the nomogram in the validation and training sets, respectively. The embedded table lists the corresponding net benefit values across selected threshold probabilities in the validation set.

Discussion

This study introduced a visual detection model for evaluating the probability of NSSI among AYAMD patients, incorporating clinical factors such as age, gender, history of suicide attempts, and psychiatric diagnosis. The model demonstrated good overall fit, satisfactory discrimination, and robust calibration. It provides clinicians and nurses with a practical tool for estimating the probability of NSSI in individual hospitalized AYAMD patients.

The selection of predictive factors is a critical step in developing any predictive model, as it directly impacts the model’s quality. Identifying the most suitable predictive factors not only prevents overfitting but also reduces data collection costs, simplifies model structure, and saves computational time. These features make the model more interpretable, practical and easier to disseminate and apply33. A key strength of this study lies in its integration of the clinical realities of AYAMD patients with prior research on NSSI to identify candidate predictive factors. Machine learning algorithms, specifically with the Boruta algorithm and LASSO regression, were employed to screen and select the most relevant predictive factors for NSSI.

Prior research has established that NSSI typically emerges during adolescence, with risk decreasing gradually into adulthood34. Studies also highlight a strong association between suicide attempt history and NSSI, with patients who have made multiple suicide attempts being at higher risk of engaging in NSSI5,35. NSSI often serves as a means of alleviating internal distress36, and repeated suicide attempts may further normalize or facilitate NSSI behaviors37. To retain valuable information and reduce the risk of Type I errors, this study treated age and the number of suicide attempts as continuous variables rather than binary categories38. This approach allowed for greater precision in assigning predictive scores, enhancing the model’s utility.

Few prior studies have developed predictive models based on clinical factors such as age, gender, history of suicide attempts, and psychiatric diagnosis, particularly in hospital settings. Most NSSI research has been conducted in community and school environments, and predictive models developed specifically for clinical environments are scarce. Liang’s study39 developed a model for female patients with affective disorders (N = 396) using marital status, age of onset, polarity of onset, first-episode psychotic symptoms, social dysfunction, and number of hospitalizations; that model achieved a C-index of 0.73. Despite using only four predictors, our model achieved an AUC of 0.745 in the validation set and demonstrated clinical utility via DCA. Reflecting the available interventions (e.g., intensive psychotherapy, pharmacotherapy, closed-ward care) and clinician preferences, the DCA evaluated net benefit across a 25–50% threshold range, balancing detection efficiency against the risk of over-intervention. Threshold adjustment is recommended based on clinical context: 25% for high-risk screening and 50% for resource-constrained precision intervention, which keeps the model practical and enables rapid classification of patients into binary NSSI probability labels for clinical decision-making. For instance, at a 40% threshold probability (Fig. 7B), the model’s net benefit was 0.26, which is equivalent to detecting a net 26 true NSSI cases per 100 patients without any increase in false positives. In contrast, the “All” strategy corresponded to a net benefit of only about 15 true positives per 100 patients, meaning the model provided approximately 10 additional net true positives compared to universal treatment, again without an increase in false positives40.
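The net benefit figures in the 40% threshold example follow from the standard decision-curve formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below is an illustrative calculation, not the study's analysis code; the function names are hypothetical. Using the validation-set prevalence of 48.7% reported in the Results, the treat-all strategy at pt = 0.40 gives a net benefit of about 0.145, in line with the roughly 15 per 100 quoted above.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit of a strategy: TP/n - FP/n * pt / (1 - pt)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1 - threshold)
    return true_positives / n - false_positives / n * weight


def net_benefit_treat_all(prevalence, threshold):
    """Treating everyone: all cases are TP and all non-cases are FP."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Validation-set prevalence (48.7%) and the 40% threshold come from the text.
treat_all = net_benefit_treat_all(0.487, 0.40)
print(round(treat_all, 3))         # 0.145
print(round(0.26 - treat_all, 3))  # model's reported net benefit minus treat-all
```

The weight pt/(1 − pt) expresses how many false positives the clinician is willing to accept per true positive, which is why net benefit falls as the threshold rises.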

NSSI often occurs covertly, with many patients unwilling to disclose their behaviors voluntarily. In many cases, NSSI is discovered by parents or schools, prompting hospitalization41. Some patients may have already engaged in NSSI prior to admission42. Professional assessment of NSSI after admission can be cumbersome, may cause adolescents to develop adverse feelings towards the evaluation process, and introduces a time lag. The assessment results require analysis and correction by professionals, which is time-consuming and labor-intensive and leads to low diagnostic efficiency. Furthermore, patients at risk of NSSI are more likely to bring dangerous items, such as knives, into the ward, increasing the probability of secondary NSSI incidents in the hospital and requiring special attention from nursing staff43. This study moves assessment forward to the admission stage using a simple and efficient method, giving medical staff time to increase levels of protection and implement more targeted interventions for patients identified as having a high probability of NSSI, thereby enhancing the clinical effectiveness of NSSI prevention. The observed rate of NSSI (approximately 50%) is high, which poses challenges for any diagnostic algorithm. As highlighted by van Calster et al. (2018), when the baseline prevalence is very high, a model may have difficulty reducing risk estimates to levels where patients would refuse treatment40. Integrating psychological scales could be a useful direction for future research to address model limitations in high-prevalence scenarios, especially for conditions such as NSSI that require personalized decisions.

The detection model can serve as a reference for clinical medical staff in decision-making and patient communication. Although Grimes44 pointed out that nomograms may be considered outdated in clinical practice given the availability of simple computer programs, we believe they still hold value by providing readers with a clear and intuitive visualization of how the algorithm works. This is particularly beneficial compared to presenting only the formula with model estimates, such as intercepts and weights.

Additionally, prior studies have suggested a relationship between NSSI and dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis45. However, in our study, although serum hormones were included in the analysis, namely the HPA-axis hormones plasma total cortisol and adrenocorticotropic hormone together with thyroxine and prolactin, none was selected as a detection factor. This finding suggests a limited correlation between these hormones and NSSI in our dataset. Research on the association between serum hormones and NSSI remains preliminary, with numerous potential mechanisms yet to be fully explored and elucidated. Future studies with broader methodologies and deeper analyses are needed to uncover the biological basis of NSSI and provide scientific evidence for developing effective interventions.

Despite its contributions, this study has certain limitations. First, the data were derived from a single-center case sample, potentially limiting the model’s generalizability. Future studies should validate and refine the model’s detection ability using multi-center data. Second, while the clinical detection model effectively predicts the probability of NSSI, it cannot replace the experience and clinical judgment of healthcare professionals. It should serve as an auxiliary tool that, combined with the judgment of professional medical staff, guides clinical decision-making rather than as a standalone basis for decisions. Third, national, sociocultural, and environmental factors were not included, potentially reducing the model’s accuracy across diverse populations. Future research should consider incorporating sociocultural variables to ensure the model’s relevance in various sociocultural contexts. Fourth, interactions between predictors were not examined. Factors such as age, gender, and history of suicide attempts may interact in complex ways to influence NSSI, and future research should investigate these interactions to uncover additional insights and improve predictive accuracy for clinical use. Fifth, the DCA was conducted on a single random split of the sample. Although this method is straightforward, it may produce unstable or biased estimates of model performance due to the inherent variability in data partitioning. Future research should consider more robust resampling techniques, such as k-fold cross-validation or bootstrapping, to improve the reliability and generalizability of model evaluation46. Finally, the model relies solely on clinical data and does not account for psychological or behavioral factors. Incorporating qualitative research findings could further refine the model.

It is important to note that NSSI is just one factor considered by clinicians when designing treatment plans. Cognitive function, emotional state, sleep quality, and psychiatric symptoms are also critical. Therefore, while this model offers valuable guidance, it should be perceived as part of a comprehensive clinical assessment to inform treatment decisions.

Conclusions

This study highlights that age, gender, history of suicide attempts, and psychiatric diagnosis are key factors for detecting the probability of NSSI. The visual model developed from these factors provides an effective tool for estimating the probability of NSSI among AYAMD aged 15–29. By integrating these predictive factors into clinical practice, the model has the potential to enhance workflow efficiency, optimize resource allocation, and alleviate the healthcare burden associated with managing NSSI. Importantly, this model supports early detection and timely intervention for patients with a high probability of NSSI, contributing to more effective mental health care delivery and improved clinical outcomes, ultimately promoting patient well-being.