Abstract
Nonsuicidal self-injury (NSSI) in youth is clinically heterogeneous. We aimed to identify distinct psychopathology-based profiles among children and adolescents reporting NSSI and their longitudinal correlates. Participants (N = 1 345) were drawn from the Brazilian High-Risk Cohort Study, which conducted extensive phenotypic assessments at baseline (ages 6–14 years) and across two follow-up waves (ages 9–18 and 13–23 years). First, we applied unsupervised machine-learning algorithms (Self-Organizing Maps and k-means clustering) to identify distinct psychopathology-based profiles among youth reporting NSSI at the second follow-up. We then employed three models to identify longitudinal predictors of these profiles: logistic regression, elastic net, and random forest. Analyses revealed two distinct profiles of youth reporting NSSI, characterized by high and low psychopathology. The high psychopathology profile (n = 117) was associated with factors identifiable earlier in life and characterized by persistent psychiatric symptoms and significant social adversity throughout development (e.g., family problems and bullying). The low psychopathology profile (n = 127) was marked by lower overall psychopathology and experienced mental health problems only later in development, with less severe challenges over time, such as school suspension and milder depressive symptoms. While the logistic regression did not provide overall significant performance, the elastic net (AUC = 0.72, 95% CI 0.65–0.77) and random forest (AUC = 0.73, 95% CI 0.67–0.78) did. The present study identified two distinct psychopathology-based profiles among youth reporting NSSI and their longitudinal correlates, using machine learning approaches. Early identification of youth in higher-risk profiles can inform early intervention strategies.
Introduction
Nonsuicidal self-injury (NSSI) refers to deliberate self-inflicted bodily damage without the intention to die [1]. NSSI is prevalent among youth, with observed rates of 17.4 and 13.4% in adolescents (10–18 years) and young adults (18–24 years), respectively [2]. This causes serious concern for clinicians, families, and policymakers as those who self-injure are more likely to engage in suicidal behavior [3], which is one of the leading causes of mortality among youth, and to develop poor mental health outcomes, such as depression and substance abuse [4, 5]. Given these strong associations and the early emergence of NSSI, these behaviors can serve as an early signal of the youth most in need of interventions to curb the development of complex psychopathology and suicidality during a sensitive period of brain development [6, 7]. Therefore, it is essential to study the correlates and predictors of NSSI to identify at-risk populations and inform early intervention measures.
Systematic reviews and meta-analyses have identified various risk factors for self-harm that span multiple domains: demographic (e.g., female sex), psychological (e.g., perfectionism), psychopathological (e.g., depression), behavioral (e.g., substance use), and environmental (e.g., bullying) determinants [4, 8, 9]. Key risk factors that have been consistently identified include mental disorders, adverse childhood experiences, bullying, and family factors (e.g., harsh parenting) [8, 9]. Emerging research, however, highlights the role of protective factors such as a sense of belonging, positive relationships with parents, and community involvement [10].
Several theoretical models link early developmental risk factors, such as temperamental vulnerability (e.g., biological emotional reactivity) and adverse experiences (e.g., maltreatment), to more immediate psychological factors, such as low self-esteem and stress [11,12,13]. The Integrated Theoretical Model is particularly relevant, as it synthesizes both developmental pathways and concurrent mechanisms [11]. For this model, NSSI results from a complex interaction between distal vulnerabilities (e.g., biological predispositions, environmental stress, childhood adversity) and proximal stressors. These transactions contribute to heightened emotional and cognitive dysregulation, culminating in self-injury. This theoretical model can serve as a template for examining how diverse risk factors influence each other over time to predict different profiles of youth who engage in NSSI.
The cumulative and complex interactions of NSSI’s predisposing factors make predicting NSSI challenging [14]. Standard statistical models often yield predictions similar to chance [15]. Among the limitations of such models are the failure to account for the interactions between risk factors over time and the constraint of evaluating only a limited number of potential predictors [16]. A major challenge, therefore, is not only accurately predicting NSSI but also better understanding its multidimensional nature. Machine learning (ML) approaches have the potential to overcome these limitations by jointly modeling many variables and their interactions in high-dimensional data, which allows for more complex models, data-driven variable selection, and the capture of subtle relationships, features that standard models generally lack [17]. Although several ML models have been developed to predict self-injurious thoughts and behavior, most of the extant literature has focused on suicidality [16]. The few ML models that were developed to predict NSSI either rely on cross-sectional or short-term longitudinal studies, both of which are suboptimal for identifying robust causal risk factors [18,19,20]. Additionally, some studies rely on clinical samples that do not represent all individuals who engage in self-injury [21, 22]. Understanding how these factors are prospectively associated with NSSI in the community is essential for identifying predictors over time, which can better inform early intervention practices that are generalizable to all individuals who engage in NSSI and not only to a small fraction of those seeking help.
Longitudinal community studies using ML approaches have the potential to identify distinct profiles among individuals reporting NSSI and their developmental correlates through a data-driven approach. Leveraging data from the Millennium Cohort Study, a multi-wave cohort from England, Uh et al. found two groups of adolescents engaging in self-harm, each with distinct developmental characteristics [23]. Both had low self-esteem and sleep problems, but one was characterized by higher levels of psychopathology and the other by adolescent risky behavior and the absence of significant psychiatric symptoms. The high psychopathology group was associated with more evident risk factors, including early and persistent emotional dysregulation, adverse experiences, and caregiver mental health problems. The low psychopathology group had less clearly distinguishable earlier risk factors but appeared to face challenges such as poorer peer and family relationships as they entered adolescence. Although these profiles and their associated factors converge with the literature, these findings require replication and validation, especially in low- and middle-income countries [24, 25].
Several studies have also identified symptomatic differences among subgroups of individuals with NSSI, both quantitatively (from higher to lower levels of psychopathology) and qualitatively (e.g., impulsive, anxious, pathological, and “non-pathological” groups) [26, 27]. This heterogeneity complicates prediction because, based on the limited studies available, distinct risk factors appear to be associated with different subgroups that nonetheless arrive at the same outcome (i.e., NSSI). These studies also indicate that, although strongly associated with adverse mental health outcomes and suicidality, NSSI does not necessarily imply the presence of a formal mental disorder. NSSI may serve a functional purpose, primarily as a maladaptive coping strategy to regulate intense emotional distress (automatic reinforcement) or navigate interpersonal challenges (social reinforcement) [11, 28].
Therefore, we aimed to identify distinct psychopathology-based profiles among youth reporting NSSI in a Brazilian community-based sample, employing an ML pipeline similar to that used by Uh et al. [23], and to determine the longitudinal predictors and concurrent correlates of these profiles. First, we identified data-driven profiles of individuals who engage in NSSI based on their clinical psychopathology. Then, we examined these profiles’ longitudinal predictors and concurrent correlates using logistic regression (LR) and two ML algorithms: Elastic Net (EN) and Random Forest (RF). Additionally, we extended previous findings by incorporating further evaluations, including categorical diagnoses and neuropsychological assessments.
Methods
Sample
This study included participants of the Brazilian High-Risk Cohort (BHRC), a community school-based longitudinal study. The BHRC utilized an enriched risk design, specifically consisting of oversampling of child and family psychopathology to increase the prevalence of risk factors and outcomes of interest. The complete design, sampling procedure, and methods have been described previously [29]. Briefly, the recruitment occurred at 57 public schools in two Brazilian metropolitan areas (35 in São Paulo; 22 in Porto Alegre) selected for proximity to research centers and for having more than 1 000 students in the target age range. In the first stage, eligible children (ages 6–12) were registered by a biological parent who was the main caregiver and completed the Family History Screen (FHS). At screening, approximately 12 500 families were approached, and 9 937 children from 8 012 families were interviewed. In the second stage, a subsample was randomly selected from the eligible pool to represent the community; among the remainder, high-risk children were ranked by a Family Liability Index (FLI; derived from the FHS) and invited with replacement until the fixed target was reached (budget limit, rather than an a priori power calculation), prioritizing higher FLI scores. This oversampling successfully increased the prevalence of child and family psychopathology relative to the random stratum.
The baseline (2010 and 2011) sample included 2 511 (random subsample = 957; high-risk subsample = 1 554) children and adolescents 6–14 years of age (M = 10.20, SD = 1.90). From this initial sample, all parents completed the household interview (e.g., diagnostic, risk factors interview). Child testing (e.g., neuropsychological testing, temperament evaluation) was completed by 2 401 participants (test response: 95.6%). There were two follow-up waves: one conducted in 2014–2015 (FU1; ages 9–18 years, M = 13.50, SD = 1.92, retention rate = 80.0%) and another in 2018–2019 (FU2; ages 13–23 years, M = 18.33, SD = 2.01, retention rate = 76.1%). Details on participation and non-response (including reasons for ineligibility and non-attendance) are provided elsewhere [29].
The analytic sample of the present study consisted of 1 345 (70.4% of FU2) individuals who had data available for the three outcome-defining variables (see Outcome Measures) across all three assessment waves (see Supplementary Table 1 for a comparison of included and excluded participants based on data availability and Supplementary Figure 1 for a participant flow diagram). Of these, 244 (18.1%) reported NSSI and 1 101 (81.9%) did not.
Procedures
The ethics committee of the University of São Paulo approved the study and written informed consent was obtained from the parents of all participants. Data collection occurred at three waves as described above. The assessment phase followed a standardized sequence involving household interviews with caregivers and school-based evaluations with children. Parent interviews were conducted by trained lay interviewers, and youth evaluations were conducted by trained psychologists and speech and hearing pathologists. Rigorous training and standardization protocols were implemented to ensure data quality and minimize measurement error across sites and over time. Lay interviewers underwent extensive standardized training (e.g., an intensive two-day program for the parental interview phase) supervised by clinical psychiatrists. This training covered project design, psychiatric syndromes, risk factors, psychopathology, ethics, standardized administration, and interview techniques. Training included simulations, role-playing, and competency evaluations based on videotaped interviews; interviewers not meeting performance standards were excluded. Quality control measures included ongoing supervision throughout data collection periods. For diagnostic assessments, clinical ratings were performed by nine certified child psychiatrists who reviewed all interview data. They were supervised by a senior expert, and consensus discussions were held for uncertain cases. Neuropsychological assessments followed standardized protocols [29].
Covariates
Age, study site, sex-at-birth (by parent report), and socioeconomic status (SES) were used in all models. SES was assessed with a standardized instrument validated in Brazil [30].
Outcome measures
The study outcome was group membership (two psychopathology-stratified NSSI groups or the Comparison group) based on three measures: NSSI presence/absence and two psychopathology measures (see Statistical Analysis). NSSI was evaluated with the Deliberate Self-Harm Inventory, Nine-Item Version (DSHI-9) [31] at FU2. The DSHI-9 assesses several types of NSSI behaviors during the preceding 6 months (on a scale from 0 = never to 6 = more than five times). In the present study, NSSI was dichotomized as present (endorsement of any of the nine DSHI-9 behaviors, i.e., a total score > 0) or absent. In adolescents, the DSHI-9 shows adequate test-retest reliability and concurrent validity [31]. In the current study, internal consistency was excellent (McDonald’s omega ω = 0.93). Two measures were also employed to characterize psychopathology at FU2, as in the prior research by Uh et al. [23]. The Strengths and Difficulties Questionnaire (SDQ), a 25-item screening measure of mental health and behavioral problems [32], was used to measure five dimensions of psychopathology (emotional, conduct, hyperactivity/inattention, peer problems, and prosocial behaviors). The SDQ has demonstrated good internal consistency, test-retest reliability, and criterion validity [33, 34]. In the current sample at FU2, the SDQ demonstrated excellent internal consistency (ω = 0.905). The Short Mood and Feelings Questionnaire (SMFQ) [35], a self-report measure of depressive feelings and behaviors, was used to measure the total score of depressive symptoms and has shown excellent internal consistency in this cohort (ω = 0.96) and good accuracy for internalizing conditions (Area Under the Curve [AUC] > 0.80) [36].
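The dichotomization rule for the DSHI-9 described above can be written down in a few lines. This is a minimal sketch, not the study's code: the function name `nssi_present` and the input layout (a sequence of nine item scores, each 0 to 6) are assumptions made for illustration.

```python
from typing import Sequence


def nssi_present(item_scores: Sequence[int]) -> bool:
    """Return True if any DSHI-9 behavior is endorsed (total score > 0).

    Hypothetical helper: assumes nine item scores coded 0 (never)
    to 6 (more than five times), as described in the text.
    """
    if len(item_scores) != 9:
        raise ValueError("the DSHI-9 has nine items")
    if any(not 0 <= score <= 6 for score in item_scores):
        raise ValueError("item scores must lie between 0 and 6")
    # Any endorsement makes the total positive, so the two rules coincide.
    return sum(item_scores) > 0
```

Under this rule a single endorsed behavior is enough to classify a participant as reporting NSSI, which is why frequency information is carried separately by the NSSI total score.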
Predictor measures
Psychiatric diagnosis
Psychiatric diagnoses were assessed with the Brazilian version of the Developmental and Well-being Assessment (DAWBA) [37], a semi-structured interview administered to caregivers by trained lay interviewers [38]. Inter-rater agreement was above 90% for all diagnoses, with kappa values ranging from 0.72 for hyperkinetic disorders to 0.84 for emotional disorders [39].
Dimensional symptoms
Caregivers completed the Brazilian version of the Child Behavior Checklist (CBCL/6–18) [40], a 113-item measure of emotional and behavioral symptoms in children and adolescents aged 6–18 years, answered on a three-point scale (0 = not true; 1 = somewhat or sometimes true; 2 = very true or often). The measure yields eight syndrome scales: anxious-depressed, withdrawn-depressed, somatic complaints, rule-breaking behavior, aggressive behavior, social problems, thought problems, and attention problems. Participants aged 18 years and older at the second follow-up were administered the Adult Behavior Checklist (ABCL) [41], a 121-item measure that is completed by a close informant (e.g., partner, parent, friend) [42]. The CBCL and ABCL demonstrated strong reliability in the current sample across all waves. Internal consistency for the syndrome scales was consistently high (CBCL range: ω = 0.86 to 0.95; ABCL range: ω = 0.88 to 0.96).
Intelligence and cognition
Intelligence Quotient (IQ) was estimated using the Vocabulary and Block Design subtests of the Wechsler Intelligence Scale for Children, 3rd edition (WISC-III) [43], applying the Tellegen and Briggs method with Brazilian norms [44, 45]. Executive function was assessed by neuropsychological tasks, including working memory (Digit Span, Corsi Blocks), inhibitory control (Go/No-Go, Conflict Control Task), and temporal processing tasks [29]. These measures were aggregated into a second-order executive function factor model showing excellent fit (root mean square error of approximation [RMSEA] = 0.004; comparative fit index [CFI] = 0.999) [46].
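For readers unfamiliar with the Tellegen and Briggs method mentioned above, the following sketch shows the general deviation-quotient formula for a short form built from Wechsler scaled scores (mean 10, SD 3). The function name and the example correlation of 0.5 between Vocabulary and Block Design are illustrative assumptions; the value taken from the Brazilian norms is not reported in this text.

```python
import math
from typing import Sequence


def tellegen_briggs_iq(
    scaled_scores: Sequence[float],
    pair_correlations: Sequence[float],
    subtest_mean: float = 10.0,
    subtest_sd: float = 3.0,
) -> float:
    """Estimate a deviation IQ from a short form of n subtests.

    pair_correlations holds one correlation per distinct subtest pair
    (a single value for a two-subtest short form). These values must come
    from the relevant normative sample; they are NOT given in the paper.
    """
    n = len(scaled_scores)
    if len(pair_correlations) != n * (n - 1) // 2:
        raise ValueError("need one correlation per distinct subtest pair")
    # SD of the composite: S_c = S_s * sqrt(n + 2 * sum of pairwise r)
    composite_sd = subtest_sd * math.sqrt(n + 2.0 * sum(pair_correlations))
    composite = sum(scaled_scores)
    # Deviation quotient: 100 + 15 * (X_c - M_c) / S_c, with M_c = 10 * n
    return 100.0 + 15.0 * (composite - n * subtest_mean) / composite_sd
```

With the hypothetical correlation of 0.5, two scaled scores of 13 give an estimate of about 117, and two scaled scores of 10 give exactly 100.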
Parental psychopathology
The validated Brazilian version of the Mini International Neuropsychiatric Interview (MINI) was used to assess parental psychiatric history [47].
Additional instruments
Other predictors, such as temperament, family environment, maltreatment, substance use, and well-being, are described in the Supplementary Methods.
Statistical analysis
Data-driven clustering (unsupervised learning)
We characterized the psychopathological profile of youth who engage in NSSI in the second follow-up wave using two unsupervised ML algorithms, Self-Organizing Maps (SOM) and k-means, as described in prior studies [48]. A SOM is an unsupervised neural network algorithm for dimensionality reduction and was used to map individuals with similar emotional and behavioral profiles onto a two-dimensional grid of nodes, preserving the topography of the input data [49]. To build the SOM, we utilized the z-standardized scores from the five SDQ subscales (emotional, conduct, hyperactivity/inattention, peer problems, and prosocial behaviors) and the SMFQ total score. These inputs were selected because they provide comprehensive measures of psychopathological domains that characterize NSSI individuals.
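To make the SOM step more concrete, here is a minimal, dependency-free sketch of online SOM training on a rectangular grid. It is not the package or configuration used in the study: the grid size, learning-rate schedule, Gaussian neighborhood, and function names are all assumptions, and the input rows are assumed to be already z-standardized vectors (five SDQ subscales plus the SMFQ total).

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Node = Tuple[int, int]


def best_matching_unit(weights: Dict[Node, List[float]], x: Sequence[float]) -> Node:
    """Return the grid node whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))


def train_som(
    data: Sequence[Sequence[float]],
    rows: int,
    cols: int,
    epochs: int = 50,
    lr0: float = 0.5,
    seed: int = 0,
) -> Dict[Node, List[float]]:
    """Train a small SOM; returns one weight vector per grid node."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each node from a randomly chosen observation.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node toward x, more strongly near the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Because neighboring nodes are updated together, individuals with similar symptom profiles end up mapped to nearby nodes, which is the topography-preserving property the text refers to.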
We then added a visualization layer to display the location of individuals with NSSI on the map (Supplementary Figure 2). To investigate the existence of distinct regions in the SOM and provide an objective partition of the data (which would map onto distinct subgroups of individuals within the sample), we employed k-means clustering with the SOM nodes’ weights. This two-step approach is advantageous, as the SOM simplifies the data structure and reduces noise prior to clustering, leading to more stable partitions.
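The second step, partitioning the SOM nodes with k-means, can be sketched as plain Lloyd iterations over the node weight vectors. This is an illustrative implementation under assumptions (random initialization from the nodes themselves, squared Euclidean distance, hypothetical function names), not the routine used by the authors.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: Sequence[Sequence[float]],
    k: int,
    n_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points (e.g., SOM node weights) into k groups."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each individual then inherits the cluster of the node they are mapped to, so the clustering operates on a small, smoothed set of prototypes rather than on the raw, noisier observations.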
Subgroup comparisons and longitudinal psychopathology trajectories
To further validate the psychopathological differences between groups, we compared SMFQ scores cross-sectionally at FU2 using the Kruskal-Wallis test and SDQ scores longitudinally using generalized estimating equations (GEEs). For the GEEs, we fitted a Poisson regression model with a log link function to accommodate the count nature of SDQ scores. Our model included an interaction term between the time points and the NSSI groups, enabling us to investigate whether the SDQ scores varied by group and over time. We also accounted for within-individual correlation by specifying an autoregressive correlation structure (AR(1)), reflecting the assumption that measurements closer in time are more correlated than those further apart.
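One way to write the mean model and working correlation implied by this description is given below. The notation is ours, not the authors': $Y_{it}$ is the SDQ score of individual $i$ at wave $t$, $G_{ig}$ are dummy indicators for the two NSSI groups (comparison group as reference), $T_{tw}$ are dummy indicators for the two follow-up waves (baseline as reference), and any covariate terms are omitted for brevity.

```latex
\log \mathbb{E}\left[Y_{it}\right]
  = \beta_0
  + \sum_{g=1}^{2} \beta_g \, G_{ig}
  + \sum_{w=1}^{2} \gamma_w \, T_{tw}
  + \sum_{g=1}^{2} \sum_{w=1}^{2} \delta_{gw} \, G_{ig} \, T_{tw},
\qquad
\operatorname{Corr}\left(Y_{is}, Y_{it}\right) = \rho^{\,|s - t|}
```

In this parameterization, the interaction coefficients $\delta_{gw}$ capture how the change from baseline to wave $w$ in group $g$ differs from the corresponding change in the comparison group, on the log scale.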
Longitudinal correlates of subgroup membership (supervised learning)
We used a supervised machine learning pipeline to identify the longitudinal correlates that distinguish between different psychopathology-stratified NSSI groups and the comparison group using a multiclass classification (one-vs-rest) within a bootstrap multiple imputation (Boot MI) framework [50]. A comprehensive set of variables was selected for this predictive modeling stage, as detailed in Supplementary Tables 2–4, encompassing 82 variables at baseline, 92 at FU1, and 89 at FU2. Predictors were drawn from the BHRC longitudinal multimodal assessments, which include questionnaires, semi-structured interviews, neuropsychological testing, and self-reports. This approach was guided by existing literature (i.e., the Integrated Theoretical Model and studies of risk factors and correlates) and aimed to minimize the risk of omitting relevant variables, which span five domains: (1) identity (e.g., demographics), (2) internal/psychological (e.g., temperament, cognition), (3) psychopathology and psychiatric medications (e.g., diagnoses, treatment), (4) behavioral (e.g., substance use), and (5) social context/environment (e.g., family dynamics, bullying, life events) [4]. Several of these predictors were incorporated as previously published latent factors or composite scores, including those for temperament [39], neuropsychological function (e.g., executive function, inhibitory control) [46], and sleep quality [51].
First, to prevent overfitting and validate our models, we split the dataset into “train” (70%) and “test” (30%) sets [52]. To avoid data leakage, all preprocessing steps were performed on the train set, then applied to the test set. This included removing variables with over 30% missing values or zero variance (Supplementary Figure 3); no subjects were removed for exceeding a 30% missing data threshold (Supplementary Figure 4). We handled missing data using multiple imputation with predictive mean matching (PMM). This process was integrated into our bootstrap framework, creating 30 imputed datasets for each of the 1 000 bootstrapped samples. We then trained and compared three classification models: LR, a standard statistical classifier; EN, a regularized linear model that handles multicollinearity and produces sparse results; and RF, a non-linear ensemble of decision trees. We selected the EN as the primary model for identifying significant correlates because it effectively handles multicollinearity in high-dimensional data and yields sparse, interpretable solutions. We identified the most significant correlates from each model using different criteria: pooled 95% CIs for LR, and stability selection with a cutoff threshold of 0.60 for EN and RF [53]. The combined use of bootstrapping and stability selection provides a robust approach for variable selection, particularly with a moderate number of participants and a larger number of potential predictors. Model training for EN and RF involved 5-fold cross-validation. The EN model’s hyperparameters were optimized using a 30×30 tuning grid for the alpha and lambda parameters. To address class imbalance, stratified downsampling was applied to each imputed bootstrap sample before model training for the EN and RF. All analyses were adjusted for age, sex assigned at birth, site, and SES. The test set was used to validate the EN and RF models and calculate prediction performance.
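Three of the steps above lend themselves to a short illustration: deciding variable exclusions on the train set only, balancing classes by downsampling, and keeping variables whose selection frequency across resampled fits reaches 0.60. The sketch below is one plausible reading of those steps, not the authors' R code; all function names, the row-as-dictionary layout, and the use of `None` for missing values are assumptions.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence


def columns_to_keep(train_rows: Sequence[dict], max_missing: float = 0.30) -> List[str]:
    """Choose variables using the TRAIN set only; reuse the list on the test set."""
    n = len(train_rows)
    keep = []
    for col in train_rows[0]:
        observed = [row[col] for row in train_rows if row[col] is not None]
        if (n - len(observed)) / n > max_missing:
            continue  # more than 30% missing
        if len(set(observed)) <= 1:
            continue  # zero variance
        keep.append(col)
    return keep


def downsample(labels: Sequence[Hashable], rng: random.Random) -> List[int]:
    """Return row indices with every class cut to the smallest class size."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    n_min = min(len(ix) for ix in by_class.values())
    chosen = []
    for ix in by_class.values():
        chosen.extend(rng.sample(ix, n_min))
    return sorted(chosen)


def stable_variables(
    selected_per_fit: Sequence[Sequence[str]], threshold: float = 0.60
) -> Dict[str, float]:
    """Keep variables selected in at least `threshold` of the resampled fits."""
    counts = Counter(v for sel in selected_per_fit for v in set(sel))
    n_fits = len(selected_per_fit)
    return {v: c / n_fits for v, c in counts.items() if c / n_fits >= threshold}
```

In a full pipeline, `selected_per_fit` would hold, for each imputed bootstrap sample, the variables retained by the model (for an elastic net, those with non-zero coefficients), so that a variable is reported only if it is chosen consistently rather than in a single lucky fit.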
Prediction performances are reported in terms of AUC, overall accuracy (mean, 95% CI, no information rate, and p-value), balanced accuracy, F1 score, sensitivity, specificity, and receiver operating characteristic (ROC) curves (one-vs-rest) for each group. Pairwise post hoc EN models were performed to explore the differences between the NSSI groups, using only the subset of variables found to be stable in the primary EN multiclass model. All analyses were performed in R version 4.5.1. Main packages used are detailed in Supplementary Table 5. We adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines [54].
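As an illustration of the one-vs-rest AUC reported here, the sketch below computes a per-class AUC from predicted class probabilities using the rank (Mann-Whitney) formulation and then averages across classes. It is a simplified stand-in for the study's evaluation code; the function names and the dictionary layout of the probabilities are assumptions.

```python
from typing import Dict, Hashable, Sequence, Tuple


def binary_auc(y_true: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y]
    neg = [s for y, s in zip(y_true, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    labels: Sequence[Hashable],
    prob_by_class: Dict[Hashable, Sequence[float]],
) -> Tuple[float, Dict[Hashable, float]]:
    """One-vs-rest AUC per class plus their unweighted (macro) mean."""
    per_class = {
        cls: binary_auc([lab == cls for lab in labels], probs)
        for cls, probs in prob_by_class.items()
    }
    return sum(per_class.values()) / len(per_class), per_class
```

The macro average weights each group equally regardless of its size, which matters here because the comparison group is much larger than either NSSI group.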
Results
Psychopathological profiles of NSSI groups and the comparison group
The SOM provided a “psychopathology map” and its distinct regions (profiles) at the second follow-up wave. Two predominant regions can be found within the map, one corresponding to higher psychopathology and the other to lower psychopathology, on the upper left and lower right regions of the map, respectively (Supplementary Figure 2). The k-means clustering algorithm confirmed a two-profile solution with an average silhouette coefficient of 0.31 (Supplementary Figure 2). Supplementary Figure 2 shows the locations of the 244 participants who reported NSSI on the map and their assigned clusters (groups).
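For reference, the average silhouette coefficient compares, for each point, its mean distance to members of its own cluster (a) with its mean distance to the nearest other cluster (b), as (b - a) / max(a, b). The sketch below is a generic implementation with a hypothetical function name; whether the reported value was computed on SOM nodes or on individuals is not specified here, so the input is simply a list of vectors with cluster labels.

```python
import math
from typing import Hashable, Sequence


def mean_silhouette(points: Sequence[Sequence[float]], labels: Sequence[Hashable]) -> float:
    """Average silhouette coefficient using Euclidean distance."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        mean_dist = {}
        for c in clusters:
            dists = [
                math.dist(p, q)
                for j, (q, other) in enumerate(zip(points, labels))
                if other == c and j != i
            ]
            mean_dist[c] = sum(dists) / len(dists) if dists else None
        a = mean_dist[lab]
        if a is None:
            scores.append(0.0)  # singleton cluster: silhouette defined as 0
            continue
        b = min(d for c, d in mean_dist.items() if c != lab and d is not None)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values range from -1 to 1, with higher values indicating tighter, better-separated clusters.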
Descriptive sociodemographic information for all groups is shown in Table 1. Table 2 depicts the SDQ, SMFQ total scores, and NSSI total scores for each NSSI profile group and the comparison group. The “High Psychopathology Profile group” (Group 1) exhibited higher SDQ and SMFQ scores than the “Low Psychopathology Profile group” (Group 2) and the comparison group. The “Low Psychopathology Profile group” (Group 2) had a lower SDQ total score than Group 1 and the comparison group, but an intermediate SMFQ total score. These differences between the three groups were significant for both SDQ and SMFQ scores (p < 0.001). The High Psychopathology Profile group (Group 1) had markedly more frequent NSSI at FU2 than the Low Psychopathology Profile group (Group 2).
Psychopathology across development for each group
Figure 1 shows the trajectory of SDQ domain scores across groups over time. Detailed statistics are reported in the Supplementary Tables 6–10. At baseline, Group 1 showed significantly higher scores than the comparison group for all SDQ dimensions except for prosocial behaviors. In contrast, Group 2 did not differ significantly from the comparison group on any SDQ dimensions.
A) Emotional symptoms, B) Conduct problems, C) Hyperactivity/inattention, D) Peer relationship problems, and E) Prosocial behaviors. Error bars show the 95% CIs.
From the baseline to the first follow-up wave, Group 1 showed significantly different rates of change relative to the comparison group for emotional symptoms and peer relationship problems (i.e., a more moderate decline for the former and an increase in symptom severity for the latter). Conversely, the rate of change for Group 2 relative to the comparison group was not significantly different for any SDQ domain.
From the first to the second follow-up waves, Group 1 showed significantly different rates of change relative to the comparison group for all SDQ domains (i.e., an increase in symptom severity). In contrast, the rate of change for Group 2 relative to the comparison group was significantly different for emotional and conduct problems (i.e., a steeper decline in symptom severity).
Predictors and correlates of psychopathology profile group memberships
The LR did not produce a significant model fit (see Prediction performance). The EN with the determined hyperparameters (alpha = 0.286, lambda = 0.025) and the RF, both using stability selection, had significant model fits (see Prediction performance). Figures 2 and 3 show the significant predictors and concurrent correlates at each of the three waves for each NSSI group for the EN, with predictors also found significant in the RF marked in bold. A positive median coefficient (β) predicts group membership, and a negative value predicts non-group membership. For the High Psychopathology Profile group (Group 1), the following variables predicted group membership: at baseline, attention-deficit/hyperactivity disorder (ADHD) (β = 0.18), higher levels of divorce problems reported by the mother (β = 0.09), a primary caregiver with a mood disorder (β = 0.09), relationship problems with the mother (β = 0.07), and higher overprotection bonding with the father (β = 0.03); at the first follow-up, higher anxiety-depression symptoms (β = 0.13), higher withdrawn symptoms (β = 0.12), prescribed psychostimulant use (β = 0.09), being frequently bullied (β = 0.07), eating problems (β = 0.05), emotional dysregulation (β = 0.05), being overweight (β = 0.04), and fingernail biting (β = 0.04); and at the second follow-up, depression (β = 0.21), aggressive behavior (β = 0.18), history of trauma (β = 0.15), prescribed mood stabilizer use (β = 0.07), and somatic complaints (β = 0.04). For the Low Psychopathology Profile group (Group 2), the following variables predicted group membership: at baseline, inhibitory control (β = 0.13); at the first follow-up, school suspension (β = 0.18), obsessive-compulsive disorder (OCD) (β = 0.09), paid work (β = 0.06), ADHD (β = 0.06), part-time school (β = 0.06), and perfectionism (β = 0.05); and at the second follow-up, sense of mastery (β = 0.14), cultural activities (β = 0.12), inhibitory control (β = 0.09), and studying, being in training, or working (β = 0.06).
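The reported hyperparameters can be read against the usual elastic-net objective. The form below assumes the common glmnet-style parameterization, in which alpha mixes the lasso and ridge penalties and lambda sets the overall penalty strength; the software actually used is listed in the supplementary material, so this is an interpretive aid rather than a statement of the exact objective. Here $\ell(\beta)$ is the log-likelihood of the classification model and $N$ the number of training observations.

```latex
\hat{\beta}
  = \arg\min_{\beta} \;
    -\frac{1}{N}\,\ell(\beta)
    + \lambda \left[ \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2
    + \alpha\,\lVert \beta \rVert_1 \right],
\qquad \alpha = 0.286,\; \lambda = 0.025
```

Under this reading, an alpha of 0.286 places the penalty closer to ridge than to lasso, which shrinks correlated predictors together while still setting some coefficients exactly to zero.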
Variables predicting non-group membership for both groups are also shown in Figs. 2 and 3. See Supplementary Figure 5 for the significant predictors of group and non-group membership for the comparison group for the EN, Supplementary Figures 6–8 for the complete significant predictors for the RF, and Supplementary Figure 9 for the post hoc binary classification (pairwise) EN approach.
Figs. 2 and 3. Bars show median elastic-net coefficients by wave; colors denote predictor domains. Values to the right/left of 0 indicate positive/negative associations with group membership. Variables in bold were also significant in the Random Forest (RF) model.
Prediction performance
Figure 4 shows the macro-averaged Area Under the Curve (AUC) for each model. The LR did not produce overall significant prediction performance, with an AUC of 0.55 [95% CI 0.47–0.61]. The EN and the RF produced significant and similar performance with AUCs of, respectively, 0.72 [95% CI 0.65–0.77] and 0.73 [95% CI 0.67–0.78]. Group 1 showed the highest predictive performance across models, while Group 2 and the comparison group showed similar performance (Supplementary Figure 10). See Supplementary Table 11 for complete performance metrics for the three main models and Supplementary Table 12 and Supplementary Figure 11 for the performance of the pairwise EN.
ROC curves (mean TPR vs. mean FPR) are macro-averaged one-vs-rest estimates across groups on the test set. Solid lines are the macro-averaged ROC curves; shaded ribbons are 95% bootstrap confidence bands; the gray diagonal denotes chance (AUC = 0.5). Legend text reports AUC [95% CI]. Abbrev.: ROC = receiver operating characteristic; AUC = area under the curve; TPR = true positive rate (sensitivity); FPR = false positive rate (1 − specificity); LR = logistic regression; EN = elastic net; RF = random forest.
Discussion
In the current study, we identified distinct psychopathology-based profiles among youth who reported NSSI at follow-up, mapped their trajectories, and then examined the longitudinal predictors of each subgroup. Three main findings emerged. First, we found two distinct groups of youth reporting NSSI, distinguished by high (High Psychopathology Profile group; Group 1) and low (Low Psychopathology Profile group; Group 2) overall psychopathology levels. Second, each profile group exhibited unique symptom trajectories, with the High Psychopathology Profile group showing consistently higher levels of symptom severity than the Low Psychopathology Profile group. Third, distinct predictors and correlates for each profile group indicated different risk factors and concurrent correlates. The main supervised ML models performed well, effectively identifying distinct emotional and behavioral subgroups of youths engaging in NSSI, along with significant predictors associated with each group. However, it was more challenging to differentiate Group 2 from the comparison group, as indicated by the weaker performance in the multiclass classification model. Additionally, the High Psychopathology Profile group reported a higher mean total NSSI score at FU2 than the Low Psychopathology Profile group, indicating a more repetitive NSSI pattern in this profile. Despite the complexities of our analytical pipelines and datasets, the different profiles and associated factors highlighted in our study align with previous findings from a longitudinal sample in the UK [23]. This supports our approach and underscores the possibility that youth reporting NSSI may present with qualitatively distinct profiles, characterized by varying levels of psychopathology. Importantly, because NSSI was assessed only at FU2, our longitudinal results indicate predictors of profile membership among youth, rather than predictors of the behavior’s initial onset.
The High Psychopathology Profile group (Group 1) is characterized by earlier-emerging and persistent mental health-related symptoms coupled with multiple adversity experiences from childhood to late adolescence. At baseline (ages 6 to 14), this group was associated with numerous factors often considered risk factors for NSSI, including psychopathology (e.g., ADHD) and social adversity, particularly family-related issues such as parental divorce and relationship conflicts. By the first follow-up (i.e., mean age 13.50), psychiatric symptoms began to emerge, including depression, anxiety, eating disorder symptoms, and emotional dysregulation, alongside continued experiences of adversity, such as bullying. Then, by FU2 (i.e., mean age 18.33), reported NSSI is concurrent with psychiatric illness (i.e., categorical depression), externalizing behavior (i.e., aggressiveness), and traumatic experiences. Overall, the High Psychopathology Profile group (Group 1) experiences an escalating spectrum of psychopathology, with symptoms progressing from externalizing to internalizing and culminating in a group marked by psychiatric illness, as evidenced by mood and anxiety disorders, aggression, medication use, and impaired psychosocial functioning. This trajectory aligns with a developmental cascade effect wherein cumulative interactions among various factors over time lead to increasing levels of psychopathology [55].
Conversely, the Low Psychopathology Profile group (Group 2) is characterized by lower levels of overall psychopathology and adversity throughout development. Additionally, this group was not associated with as many clear early-life clinical risk factors commonly linked to NSSI, a finding similar to Uh et al.’s work [23] for their lower psychopathology group. Higher inhibitory control, a component of executive function, and fewer internalizing symptoms (i.e., depression and social withdrawal) observed at baseline suggest a less vulnerable profile from both cognitive and psychiatric perspectives. Inhibitory control, defined as the ability to suppress a dominant response and engage in effortful regulation [56], may serve as a protective factor for this group by supporting adaptation to new environments, independent problem-solving, and social integration beyond caregiving relationships [57]. Accordingly, the Low Psychopathology Profile group (Group 2) reported fewer sleep problems, family conflicts, exposure to alcohol use, and problems with friends, reflecting a more stable environment from childhood to early adolescence.
Challenges for the Low Psychopathology Profile group (Group 2) appeared to manifest later, at the first follow-up, with some participants developing OCD or ADHD or experiencing psychosocial difficulties (e.g., school suspension), though social and academic functioning remained largely intact. Participants developed less pervasive issues at the second follow-up, as shown by depressive symptoms (but not categorical depression), a slight decline in psychological well-being (e.g., reduced optimism), and NSSI. However, this group maintained relatively stable academic and psychosocial functioning (e.g., studying, cultural activities, and effectively coping with challenges). Overall, these youths are less burdened with psychopathological symptoms and better equipped with inhibitory control than other self-injuring peers, placing them in a low-adversity, high-resilience group, albeit still susceptible to maladaptive behaviors under stress. The existence of this profile reinforces the importance of the functional model of NSSI [58, 59], showing that these behaviors can emerge as a coping mechanism in response to distress even without severe, pervasive psychopathology. This group may be engaging in NSSI primarily for emotion regulation purposes in the context of adolescent stressors, rather than as a symptom of a major underlying disorder. Clinically, this highlights the need to avoid over-pathologizing the behavior and instead focus on increasing access to interventions and programs that teach alternative emotion regulation strategies. Notably, in terms of overall concurrent psychopathology at FU2, the Low Psychopathology Profile group (Group 2) was less distinct from randomly selected peers without NSSI than the High Psychopathology Profile group was, and predictive modeling for NSSI was less successful, even with data from multiple sources. The difficulty our models had in distinguishing the Low Psychopathology Profile group from the non-NSSI group indicates that early identification is challenging for this subgroup, at least using the markers from our study.
Our findings regarding predictors of group membership align with existing NSSI literature, which identifies diverse risk factors for NSSI, including mental health symptoms, bullying, trauma, depression, and emotional dysregulation [4, 14]. This is also consistent with other ML studies on self-harm in community-based samples, where mood and emotional symptoms were the strongest predictors of NSSI, followed by general levels of psychopathology [60].
While the current study employed a data-driven approach to identify psychopathology-based profiles among youth reporting NSSI, our findings can be contextualized within existing theoretical models of self-injury. For instance, the High Psychopathology Profile aligns closely with the developmental trajectory proposed by the Integrated Theoretical Model by Nock [11, 12]. This group is characterized by significant distal risk factors identifiable early in life, including persistent social adversity and early psychiatric dimensional symptoms that signal significant vulnerabilities and co-occur with complex psychopathology concurrent with self-injury. It is likely that NSSI emerges to serve self-regulatory purposes, alleviating intense distress stemming from these long-standing issues. On the other hand, the Low Psychopathology Profile group presents a less clear developmental pathway and does not map cleanly onto existing models. Some vulnerabilities may develop over time (e.g., emerging ADHD at FU1) but with a lower burden of adversity and/or psychopathology. In addition, depressive symptoms were evident when NSSI was reported, which suggests that this group may have been facing struggles prior to the reported NSSI (e.g., intermediate levels of SMFQ and concurrent optimism were predictors of non-group membership) and may have used NSSI to escape unwanted negative emotions [61]. However, the overall risk configuration does not fit within existing models, highlighting the need for more multidimensional approaches that capture nuanced risk profiles and interaction effects over time.
Our findings share similarities with those of Uh et al.’s work [23], such as identifying two psychopathological subgroups (high vs. low) and replicating many predictors, but some notable differences emerged. First, we found no direct evidence of heightened risk-taking characteristics consistently associated with the Low Psychopathology Profile group (Group 2). Second, we did not find reduced support from family and peers in the Low Psychopathology Profile group (Group 2). On the contrary, this group appeared to benefit from a relatively stable environment.
One might argue that our findings reflect underlying general psychopathology groups rather than profiles specific to NSSI. For example, the emergence of clusters differing primarily in symptom severity is a common outcome in profiling analyses [62]. In developmental mental health research, risk factors and symptoms are often non-specific, especially when examined over long-term follow-ups, as early microphenotypes can lead to multiple outcomes [63]. Moreover, NSSI often arises within a broader context of psychopathology and mental distress, serving as a coping mechanism. Therefore, psychopathology, regardless of intensity, can influence these outcomes. However, our multiclass classification model (one-vs-rest) included a comparison group without NSSI, which makes the predictions more specific to NSSI. Although many predictors may reflect transdiagnostic features, they also appear to have discriminative value in distinguishing subgroups of individuals who engage in NSSI.
Clinically, these findings highlight the importance of early intervention to prevent the development of trajectories associated with higher psychopathology (i.e., the High Psychopathology Profile group). Individuals in this group would benefit from strategies targeting transdiagnostic features of emerging psychopathology (e.g., emotional dysregulation), precursors of other mental illnesses [7], and visible psychiatric conditions such as ADHD. At the same time, integrating social determinants and a humanizing care stance into NSSI prevention extends beyond clinical interventions and risk assessments. Services should adopt multimodal strategies that span educational, community, social, and health systems, rather than focusing only on individual symptoms [64]. NSSI represents a maladaptive response to distress, and emphasizing context-attentive care (i.e., evaluating the individual’s life difficulties) and prioritizing the individual’s biographical context and suffering can increase feelings of being understood and reduce barriers to care [64,65,66].
The integration of social determinants is even more important considering that several mechanisms influence the progression from NSSI to more severe forms of self-harm and suicidality. Joiner’s interpersonal theory of suicide suggests that suicide risk is increased by the confluence of experiences of loneliness, perception of being a burden to others, hopelessness, and acquired capacity for suicide (e.g., habituation to self-harm) [67]. This process can be mediated by several mechanisms. Deficits in mentalization increase social withdrawal [68], and perceptions of non-humanness, loss of agency, and discrimination can further amplify social isolation [69]. Therefore, treating these patients while attending to their social context and interpersonal functioning and addressing their social cognition deficits (e.g., mentalizing) can be beneficial. Indeed, several effective early intervention programs aimed at NSSI utilize emotion regulation and mentalizing approaches and emphasize non-pharmacological approaches considering a wider social context [70, 71]. However, our data indicate that mood stabilizer use was associated with the High Psychopathology Profile group, possibly indicating that clinicians are targeting affective instability with psychopharmacology, even though no specific biological treatments exist for NSSI itself, only for comorbidities [4, 72]. This stresses the importance of building mental health services that integrate the social context, not only targeting psychopathology or symptoms [63]. Psychoeducation and support for families dealing with externalizing conditions can help reduce family-related risk factors [73], and school-based anti-bullying interventions can alleviate stressors for already vulnerable youth [74]. Delaying intervention until full threshold syndromes develop misses a critical window in adolescent development, taking a substantial toll on vulnerable youth mental health trajectories and psychosocial functioning [75].
For the trajectory associated with lower psychopathology (i.e., the Low Psychopathology Profile group), targeting earlier predictive factors seems more challenging, as our predictive model shows. This suggests that standard psychopathology screening may be less effective in identifying these youths. However, intervening early remains key, even for youth with school-related problems and milder psychopathology. For this group, NSSI may signal that the emotional and stressful demands they face are exceeding their coping capacities and available support. This group might have the cognitive and social capacities to respond to a short-term psychosocial intervention if treated earlier [76] and to generalist treatments that integrate common features of interventions [65]. One barrier to treatment is that many youths, especially in this group, may prefer informal support from social networks or avoid seeking help due to stigma or feeling they are not “ill enough” [77, 78]. This perception may be reinforced by systems that fail to recognize the low syndromic specificity of teenage mental health issues. Therefore, mental health literacy initiatives could be particularly beneficial for this group. While these findings may not immediately translate into novel, NSSI-specific screening instruments, they underscore the diversity of developmental trajectories leading to NSSI and can help tailor interventions for different subgroups.
The application of ML was important in this study. The Elastic Net allowed for the simultaneous consideration of variables from longitudinal data, effectively performing robust variable selection and managing multicollinearity. Furthermore, the comparable, and in some cases slightly improved, predictive performance of the Random Forest model (AUCs reported in Results and Supplementary Figures 6–8) suggests that some non-linear interactions between predictors might also contribute to group differentiation. The use of complementary ML algorithms helps in detecting potentially complex, non-linear relationships in the pathways to NSSI. It is also important to consider the inherent challenges in predicting complex human behaviors such as NSSI. Our goal in applying ML in this context is not deterministic prediction of complex behaviors, but rather to identify variables that robustly discriminate between groups and evaluate their predictive ability in the context of several other variables [79].
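As an illustration of why the Elastic Net both selects variables and tolerates correlated predictors, the standard penalized logistic objective can be written as below. The symbols are assumptions introduced here for illustration: y_i in {0, 1} codes membership in one profile versus the rest, x_i is the predictor vector, lambda >= 0 sets the overall penalty strength, and alpha in [0, 1] mixes the two penalties. The exact parameterization and tuning used in the study are not specified in this section.

```latex
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \Big[ y_i \,(\beta_0 + x_i^{\top}\beta)
              - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
      + \lambda \Big[ \alpha \lVert \beta \rVert_1
              + \tfrac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \Big]
    \right\}
```

The L1 term can set coefficients exactly to zero, which performs variable selection, while the squared L2 term shrinks the coefficients of correlated predictors toward each other instead of arbitrarily keeping one of them.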
This study has several strengths, including the use of data from a school-based community cohort in a middle-income country with low attrition rates, multimethod assessments (dimensional scales, diagnostic interviews, and neuropsychological testing), and a structured evaluation of the outcome (NSSI), which is rarely available in large cohorts. However, several limitations should be acknowledged. First, a significant limitation is that NSSI was only assessed at FU2. Consequently, we cannot determine the true timing of NSSI onset, its duration, or patterns of chronicity (e.g., distinguishing brief episodes from persistent NSSI). This restricts our ability to model developmental pathways to the initial emergence of NSSI and means our profiles reflect psychopathology among those reporting recent NSSI at FU2, rather than distinct NSSI onset trajectories. This also means we may have excluded participants with non-repetitive self-injury that occurred and remitted before the 6-month window assessed at FU2 or those whose NSSI occurred only at earlier ages. Second, we lacked data on the specific functions of NSSI and granular measures of emotion regulation strategies, which limits a deeper understanding of the mechanisms maintaining NSSI in these profiles. Third, the “enriched risk” design of the Brazilian High-Risk Cohort Study, which oversampled for child and family psychopathology, means that our sample is not representative of a general community or population-based sample. While this design is advantageous for studying less common outcomes and risk factors, it may limit the generalizability of findings related to the prevalence of NSSI or specific psychopathology levels. However, the internal associations identified between predictors and profile membership are likely to be less affected by this sampling strategy. Furthermore, the data come from two large metropolitan areas in Brazil, which may limit generalizability to less urbanized regions of Brazil and other low- or middle-income countries. Fourth, limitations apply to the analytical approach. The ML pipeline employed a highly conservative statistical approach (stability selection) that may have screened out weaker predictors; other predictors may therefore also be relevant. In addition, the sample sizes for the individual profile groups within the predictive modeling framework are moderate, which may limit power, particularly for the Low Psychopathology Profile. Also, while the models were rigorously internally validated, they have not yet been externally validated in an independent cohort, which is necessary to confirm the generalizability of the predictive performance. Fifth, although the study design is longitudinal, the findings remain correlational, and causal relationships cannot be established. Sixth, the identified profiles are data-driven and depend on the specific inputs used for clustering (SDQ and SMFQ). Although this choice facilitated replication of prior work, it is possible that including different input variables (e.g., personality traits, cognitive styles) might have yielded alternative profile structures. Finally, our analytic sample represents only 70.4% of the FU2 sample. Although we used imputation techniques for missing data, we did not impute outcome variables.
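To make concrete why stability selection is conservative, the following sketch runs a selector on many random half-samples and keeps only features chosen in a large share of them. It is an illustration under loud assumptions: the base selector here is a simple stand-in (largest difference in class means), not the penalized model used in the study, and the data, the 0.8 threshold, the number of subsamples, and all function names are hypothetical.

```python
import random


def top_k_by_mean_gap(X, y, k):
    """Stand-in base selector (NOT the study's penalized model): keep the k
    features with the largest absolute difference between class means."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    gaps = []
    for j in range(len(X[0])):
        mean_pos = sum(row[j] for row in pos) / len(pos)
        mean_neg = sum(row[j] for row in neg) / len(neg)
        gaps.append(abs(mean_pos - mean_neg))
    ranked = sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)
    return set(ranked[:k])


def stability_selection(X, y, k=2, n_subsamples=100, threshold=0.8, seed=0):
    """Return (selection frequency per feature, set of stable feature indices)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    used = 0
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # half the sample, without replacement
        y_sub = [y[i] for i in idx]
        if 0 < sum(y_sub) < len(y_sub):  # both classes must be present
            for j in top_k_by_mean_gap([X[i] for i in idx], y_sub, k):
                counts[j] += 1
            used += 1
    if used == 0:
        raise ValueError("no subsample contained both classes")
    freqs = [c / used for c in counts]
    stable = {j for j, f in enumerate(freqs) if f >= threshold}
    return freqs, stable


def make_toy_data(n=200, p=6, seed=1):
    """Hypothetical data: only feature 0 differs between the two classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    freqs, stable = stability_selection(X, y)
    print("selection frequencies:", [round(f, 2) for f in freqs])
    print("stable features:", sorted(stable))
```

A predictor with a real but modest effect may be selected in only some subsamples and fall below the threshold, which is the mechanism by which weaker predictors can be missed.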
Future studies should test ML algorithms in different populations (i.e., external validation) to strengthen replication of NSSI prediction. Research on the long-term course of these groups is essential for understanding the potential development of other psychopathologies and suicidal behaviors. Different study designs should examine the mechanisms underlying these NSSI subtypes through mediation and moderation analyses, and models with matched control groups could help clarify distinctions between severely affected individuals with or without NSSI. Furthermore, understanding neurobiological underpinnings using neuroimaging, genetics, and other biomarkers can further illuminate the pathophysiology of NSSI. Finally, future research must focus on the evaluation of prevention and early intervention strategies that encompass both higher- and lower-psychopathology groups across diverse settings (e.g., educational, community, and health systems).
In conclusion, NSSI is a prevalent and severe public health problem worldwide. It is a marker of mental health distress and a risk factor for future psychopathology and suicide. In this study, we replicated and extended most of the findings from a UK-based study in a Brazilian sample, identifying two distinct psychopathology-based profiles among young people who reported NSSI, each associated with different earlier-life and concurrent factors. These profiles differ in psychopathology levels and associated factors, underscoring the need for tailored early intervention strategies. Such interventions could improve the prognoses for these groups, reducing the likelihood of severe outcomes in adulthood and potentially preventing suicidality.
Data availability
The data dictionary is available at https://osf.io/w3jr4 to download directly. Individual-level data are available upon request to the Brazilian High-Risk Cohort Study research committee, by following the instructions and filling in the research form available at https://osf.io/ktz5h/wiki/home.
References
Muehlenkamp JJ. Self-injurious behavior as a separate clinical syndrome. American Journal of Orthopsychiatry. 2005;75:324–33.
Swannell SV, Martin GE, Page A, Hasking P, St John NJ. Prevalence of nonsuicidal self-injury in nonclinical samples: systematic review, meta-analysis and meta-regression. Suicide & Life Threat Behav. 2014;44:273–303.
Cha CB, Franz PJ, Guzmán EM, Glenn CR, Kleiman EM, Nock MK. Annual research review: suicide among youth – epidemiology, (potential) etiology, and treatment. Child Psychology Psychiatry. 2018;59:460–82.
Brañas MJAA, Croci MS, Murray GE, Choi-Kain LW. The relationship between self-harm and suicide in adolescents and young adults. Psychiatr Ann. 2022;52:311–7.
Steinhoff A, Cavelti M, Koenig J, Reichl C, Kaess M. Symptom shifting from nonsuicidal self-injury to substance use and borderline personality pathology. JAMA Netw Open. 2024;7:e2444192.
Ghinea D, Edinger A, Parzer P, Koenig J, Resch F, Kaess M. Non-suicidal self-injury disorder as a stand-alone diagnosis in a consecutive help-seeking sample of adolescents. J Affect Disord. 2020;274:1122–5.
Uhlhaas PJ, Davey CG, Mehta UM, Shah J, Torous J, Allen NB, et al. Towards a youth mental health paradigm: a perspective and roadmap. Mol Psychiatry. 2023;28:3171–81.
Wang Y-J, Li X, Ng CH, Xu D-W, Hu S, Yuan T-F. Risk factors for non-suicidal self-injury (NSSI) in adolescents: A meta-analysis. eClinicalMedicine. 2022;46:101350.
Valencia-Agudo F, Burcher GC, Ezpeleta L, Kramer T. Nonsuicidal self-injury in community adolescents: A systematic review of prospective predictors, mediators and moderators. J Adolesc. 2018;65:25–38.
Campbell PD, Proulx JDE, Sollis K, Cruwys T, Calear AL, Rathbone J, et al. Early-life protective factors for adolescent self-harm and suicidality: a longitudinal cohort study in Australia. The Lancet Psychiatry. 2025;12:746–57.
Nock MK. Self-Injury. Annu Rev Clin Psychol. 2010;6:339–63.
Hooley JM, Franklin JC. Why do people hurt themselves? a new conceptual model of nonsuicidal self-injury. Clinical Psychological Science. 2018;6:428–51.
Kaess M, Hooley JM, Klimes-Dougan B, Koenig J, Plener PL, Reichl C, et al. Advancing a temporal framework for understanding the biology of nonsuicidal self-injury: An expert review. Neuroscience & Biobehavioral Reviews. 2021;130:228–39.
Fox KR, Franklin JC, Ribeiro JD, Kleiman EM, Bentley KH, Nock MK. Meta-analysis of risk factors for nonsuicidal self-injury. Clin Psychol Rev. 2015;42:156–67.
Franklin JC, Ribeiro JD, Fox KR, Bentley KH, Kleiman EM, Huang X, et al. Risk factors for suicidal thoughts and behaviors: A meta-analysis of 50 years of research. Psychol Bull. 2017;143:187–232.
Burke TA, Ammerman BA, Jacobucci R. The use of machine learning in the study of suicidal and non-suicidal self-injurious thoughts and behaviors: A systematic review. J Affect Disord. 2019;245:869–84.
Chen ZS, Kulkarni P, Galatzer-Levy IR, Bigio B, Nasca C, Zhang Y. Modern views of machine learning for precision psychiatry. Patterns. 2022;3:100602.
Fox KR, Huang X, Linthicum KP, Wang SB, Franklin JC, Ribeiro JD. Model complexity improves the prediction of nonsuicidal self-injury. J Consult Clin Psychol. 2019;87:684–92.
Ammerman BA, Jacobucci R, McCloskey MS. Using exploratory data mining to identify important correlates of nonsuicidal self-injury frequency. Psychology of Violence. 2018;8:515–25.
Zhong Y, He J, Luo J, Zhao J, Cen Y, Song Y, et al. A machine learning algorithm-based model for predicting the risk of non-suicidal self-injury among adolescents in western China: A multicentre cross-sectional study. J Affect Disord. 2024;345:369–77.
McHugh CM, Ho N, Iorfino F, Crouse JJ, Nichles A, Zmicerevska N, et al. Predictive modelling of deliberate self-harm and suicide attempts in young people accessing primary care: a machine learning analysis of a longitudinal study. Soc Psychiatry Psychiatr Epidemiol. 2023;58:893–905.
Sun T, Liu J, Wang H, Yang BX, Liu Z, Liu J, et al. Risk prediction model for non-suicidal self-injury in chinese adolescents with major depressive disorder based on machine learning. NDT. 2024;20:1539–51.
Uh S, Dalmaijer ES, Siugzdaite R, Ford TJ, Astle DE. Two pathways to self-harm in adolescence. Journal of the American Academy of Child & Adolescent Psychiatry. 2021;60:1491–1500.
Kieling C, Salum GA, Pan PM, Bressan RA. Youth mental health services: the right time for a global reach. World Psychiatry. 2022;21:86–87.
Meehan AJ, Lewis SJ, Fazel S, Fusar-Poli P, Steyerberg EW, Stahl D, et al. Clinical prediction models in psychiatry: a systematic review of two decades of progress and challenges. Mol Psychiatry. 2022;27:2700–8.
Stanford S, Jones MP. Psychological subtyping finds pathological, impulsive, and ‘normal’ groups among adolescents who self-harm. Child Psychology Psychiatry. 2009;50:807–15.
Gao Q, Guo J, Wu H, Huang J, Wu N, You J. Different profiles with multiple risk factors of nonsuicidal self-injury and their transitions during adolescence: A person-centered analysis. J Affect Disord. 2021;295:63–71.
Taylor PJ, Jomar K, Dhingra K, Forrester R, Shahmalak U, Dickson JM. A meta-analysis of the prevalence of different functions of non-suicidal self-injury. J Affect Disord. 2018;227:759–69.
Salum GA, Gadelha A, Pan PM, Moriyama TS, Graeff-Martins AS, Tamanaha AC, et al. High risk cohort study for psychiatric disorders in childhood: rationale, design, methods and preliminary results. Int J Methods Psych Res. 2015;24:58–73.
ABEP. Critério de Classificação Econômica Brasil. Associação Brasileira de Empresas de Pesquisa. 2010.
Bjärehed J, Lundh L. Deliberate self-harm in 14-year-old adolescents: how frequent is it, and how is it associated with psychopathology, relationship variables, and styles of emotional regulation?. Cogn Behav Ther. 2008;37:26–37.
Goodman R, Ford T, Simmons H, Gatward R, Meltzer H. Using the strengths and difficulties questionnaire (SDQ) to screen for child psychiatric disorders in a community sample. Br J Psychiatry. 2000;177:534–9.
Goodman A, Heiervang E, Fleitlich-Bilyk B, Alyahri A, Patel V, Mullick MSI, et al. Cross-national differences in questionnaires do not necessarily reflect comparable differences in disorder prevalence. Soc Psychiatry Psychiatr Epidemiol. 2012;47:1321–31.
Romani-Sponchiado A, Vidal-Ribas P, Bressan RA, De Jesus Mari J, Miguel EC, Gadelha A, et al. Longitudinal associations between positive attributes and psychopathology and their interactive effects on educational outcomes. Eur Child Adolesc Psychiatry. 2023;32:463–74.
Messer S, Angold A, Costello EJ, Loeber R, Van Kammen W, Stouthamer-Loeber M. Development of a short questionnaire for use in epidemiological studies of depression in children and adolescents: factor composition and structure across development. Int J Methods Psychiatr Res. 1995;5:251–62.
Jobim GDS, Amaral JVD, Pacheco JPG, Gadelha A, Miguel EC, Bressan RA, et al. Clinical properties of the short mood and feelings questionnaire: development of a free calculator based on the Brazilian high-risk cohort study. J Psychiatr Res. 2025;190:457–64.
Fleitlich-Bilyk B, Goodman R. Prevalence of child and adolescent psychiatric disorders in southeast Brazil. Journal of the American Academy of Child & Adolescent Psychiatry. 2004;43:727–34.
Goodman R, Ford T, Richards H, Gatward R, Meltzer H. The development and well-being assessment: description and initial validation of an integrated assessment of child and adolescent psychopathology. J Child Psychol Psychiatry. 2000;41:645–55.
Hoffmann MS, Pan PM, Manfro GG, De Jesus Mari J, Miguel EC, Bressan RA, et al. Cross-sectional and longitudinal associations of temperament and mental disorders in youth. Child Psychiatry Hum Dev. 2019;50:374–83.
Bordin IA, Rocha MM, Paula CS, Teixeira MCTV, Achenbach TM, Rescorla LA, et al. Child behavior checklist (CBCL), youth self-report (YSR) and teacher’s report form (TRF): an overview of the development of the original and Brazilian versions. Cad Saude Publica. 2013;29:13–28.
Achenbach TM, Ivanova MY, Rescorla LA. Empirically based assessment and taxonomy of psychopathology for ages 1½–90+ years: developmental, multi-informant, and multicultural findings. Compr Psychiatry. 2017;79:4–18.
Freichel R, Epskamp S, De Jong PJ, Cousijn J, Franken I, Salum GA, et al. Investigating risk factor and consequence accounts of executive functioning impairments in psychopathology: an 8-year study of at-risk individuals in Brazil. Psychol Med. 2025;55:e192.
Figueiredo VLM, Pinheiro S, Nascimento ED. Teste de inteligência WISC-III adaptando para a população brasileira. Psicol Esc Educ. 1998;2:101–7.
Tellegen A, Briggs PF. Old wine in new skins: Grouping Wechsler subtests into new scales. J Consult Psychol. 1967;31:499–506.
Nascimento ED, Figueiredo VLMD. WISC-III e WAIS-III: alterações nas versões originais americanas decorrentes das adaptações para uso no Brasil. Psicol Reflex Crit. 2002;15:603–12.
Martel MM, Pan PM, Hoffmann MS, Gadelha A, Do Rosário MC, Mari JJ, et al. A general psychopathology factor (P factor) in children: Structural model analysis and external validation through familial risk and child global executive function. J Abnorm Psychol. 2017;126:137–48.
Amorim P. Mini International neuropsychiatric interview (MINI): validação de entrevista breve para diagnóstico de transtornos mentais. Rev Bras Psiquiatr. 2000;22:106–15.
Sugiyama A, Kotani M. Analysis of gene expression data by using self-organizing maps and k-means clustering. Proceedings of the 2002 International Joint Conference on Neural Networks. 2002;2:1342–5. IJCNN’02 (Cat. No.02CH37290).
Kohonen T. The self-organizing map. Proc IEEE. 1990;78:1464–80.
Schomaker M, Heumann C. Bootstrap inference when using multiple imputation. Stat Med. 2018;37:2252–66.
Becker SP, Ramsey RR, Byars KC. Convergent validity of the Child Behavior Checklist sleep items with validated sleep measures and sleep disorder diagnoses in children and adolescents referred to a sleep disorders center. Sleep Med. 2015;16:79–86.
Gholamy A, Kreinovich V, Kosheleva O. Why 70/30 or 80/20 relation between training and testing sets: a pedagogical explanation. Departmental Technical Reports (CS). 2018.
Meinshausen N, Bühlmann P. Stability selection. Journal of the Royal Statistical Society Series B (Statistical Methodology). 2010;72:417–73.
Collins GS, Reitsma JB, Altman DG, Moons K. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMC Med. 2015;13:1.
Masten AS, Cicchetti D. Developmental cascades. Dev Psychopathol. 2010;22:491–5.
Nigg JT. Annual research review: On the relations among self-regulation, self-control, executive functioning, effortful control, cognitive control, impulsivity, risk-taking, and inhibition for developmental psychopathology. Child Psychology Psychiatry. 2017;58:361–83.
Romer AL, Pizzagalli DA. Is executive dysfunction a risk marker or consequence of psychopathology? A test of executive function as a prospective predictor and outcome of general psychopathology in the adolescent brain cognitive development study®. Developmental Cognitive Neuroscience. 2021;51:100994.
Nock MK, Prinstein MJ. A functional approach to the assessment of self-mutilative behavior. J Consult Clin Psychol. 2004;72:885–90.
Lloyd-Richardson EE, Perrine N, Dierker L, Kelley ML. Characteristics and functions of non-suicidal self-injury in a community sample of adolescents. Psychol Med. 2007;37:1183–92.
Su R, John JR, Lin P-I. Machine learning-based prediction for self-harm and suicide attempts in adolescents. Psychiatry Res. 2023;328:115446.
Chapman AL, Gratz KL, Brown MZ. Solving the puzzle of deliberate self-harm: The experiential avoidance model. Behav Res Ther. 2006;44:371–94.
Sher KJ, Jackson KM, Steinley D. Alcohol use trajectories and the ubiquitous cat’s cradle: cause for concern?. J Abnorm Psychol. 2011;120:322–35.
McGorry PD, Mei C, Dalal N, Alvarez-Jimenez M, Blakemore S-J, Browne V, et al. The lancet psychiatry commission on youth mental health. The Lancet Psychiatry. 2024;11:731–74.
Al-Halabí S, Fonseca-Pedrero E. Editorial for special issue on understanding and prevention of suicidal behavior: humanizing care and integrating social determinants. Psicothema. 2024;36:309–18.
Gunderson JG, Choi-Kain LW. Working with patients who self-injure. JAMA Psychiatry. 2019;76:976.
Tickell A, Fonagy P, Hajdú K, Obradović S, Pilling S. ‘Am I really the priority here?’: help-seeking experiences of university students who self-harmed. BJPsych Open. 2024;10:e40.
Joiner TE, Jeon ME, Lieberman A, Janakiraman R, Duffy ME, Gai AR, et al. On prediction, refutation, and explanatory reach: a consideration of the interpersonal theory of suicidal behavior. Prev Med. 2021;152:106453.
Andreo-Jover J, Fernández-Jiménez E, Bobes J, Isabel Cebria A, Crespo-Facorro B, De La Torre-Luque A, et al. Suicidal behavior and social cognition: the role of hypomentalizing and fearlessness about death. Psicothema. 2024;36:403–13.
Robison M, Udupa NS, Rice TB, Wilson-Lemoine E, Joiner TE, Rogers ML. The interpersonal theory of suicide: state of the science. Behav Ther. 2024;55:1158–71.
Stockings E, Hall WD, Lynskey M, Morley KI, Reavley N, Strang J, et al. Prevention, early intervention, harm reduction, and treatment of substance use in young people. The Lancet Psychiatry. 2016;3:280–96.
Chanen AM, Thompson KN. Early intervention for personality disorder. Current Opinion in Psychology. 2018;21:132–5.
Plener PL. Medical and pharmaceutical interventions in NSSI. In: Lloyd-Richardson EE, Baetens I, Whitlock JL, editors. The Oxford Handbook of Nonsuicidal Self-Injury, 1st edn. Oxford University Press; 2023.
Claussen AH, Holbrook JR, Hutchins HJ, Robinson LR, Bloomfield J, Meng L, et al. All in the family? a systematic review and meta-analysis of parenting and family environment as risk factors for attention-deficit/hyperactivity disorder (ADHD) in children. Prev Sci. 2022. https://doi.org/10.1007/s11121-022-01358-4.
Fraguas D, Díaz-Caneja CM, Ayora M, Durán-Cutilla M, Abregú-Crespo R, Ezquiaga-Bravo I, et al. Assessment of school anti-bullying interventions: a meta-analysis of randomized clinical trials. JAMA Pediatr. 2021;175:44.
Sharp C, Wall K. Personality pathology grows up: adolescence as a sensitive period. Current Opinion in Psychology. 2018;21:111–6.
Kaess M, Edinger A, Fischer-Waldschmidt G, Parzer P, Brunner R, Resch F. Effectiveness of a brief psychotherapeutic intervention compared with treatment as usual for adolescent nonsuicidal self-injury: a single-centre, randomised controlled trial. Eur Child Adolesc Psychiatry. 2020;29:881–91.
Rickwood DJ, Mazzer KR, Telford NR. Social influences on seeking help from mental health services, in-person and online, during adolescence and young adulthood. BMC Psychiatry. 2015;15:40.
Sheikhan NY, Henderson JL, Halsall T, Daley M, Brownell S, Shah J, et al. Stigma as a barrier to early intervention among youth seeking mental health services in Ontario, Canada: a qualitative study. BMC Health Serv Res. 2023;23:86.
Leist AK, Klee M, Kim JH, Rehkopf DH, Bordas SPA, Muniz-Terrera G, et al. Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences. Sci Adv. 2022;8:eabk1942.
Acknowledgements
This research was funded by the National Institute of Developmental Psychiatry for Children and Adolescents (INDP), a science and technology institute funded by the National Council for Scientific and Technological Development (CNPq, Conselho Nacional de Desenvolvimento Científico e Tecnológico), grant number 465550/2014-2; by the Research Support Foundation of the State of São Paulo (FAPESP, Fundação de Amparo à Pesquisa do Estado de São Paulo), grant numbers 2014/50917-0 and 2021/05332-8; by the National Center for Research and Innovation in Mental Health (CISM), funded by FAPESP, grant number 2021/12901; by Banco Industrial do Brasil S/A; and by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. Drs. Croci and Brañas are Ph.D. students at USP and are supported by CAPES. Drs. Ren and Choi-Kain are supported by the Gunderson Legacy Fund.
Author information
Contributions
MSC and MJAAB contributed equally to this work. MSC, MJAAB, and LWC-K conceived and designed the study. MSC, MJAAB, and PMP curated the data. MSC, MJAAB, EFF, BR, SU, ESD, and LWC-K performed the formal analyses. AC, GS, LAPR, ECM, and PMP acquired funding and provided resources. MSC, MJAAB, EFF, BR, and LWC-K contributed to the investigation. MSC, MJAAB, EFF, BR, SU, ESD, AC, and LWC-K contributed to methodology. MSC, MJAAB, ECM, PMP, and LWC-K managed project administration. MSC, MJAAB, EFF, and BR developed software and analytic code. ECM, PMP, and LWC-K supervised the project. MSC, MJAAB, EFF, BR, SU, ESD, AC, GS, LAPR, ECM, and LWC-K contributed to validation of the results. MSC, MJAAB, and LWC-K created data visualizations. MSC, MJAAB, and LWC-K wrote the original draft. All authors critically revised the manuscript for important intellectual content and approved the final version for submission.
Ethics declarations
Competing interests
Drs. Croci and Brañas have received authorship royalties from Manole. Dr. Caye acted as a consultant to Knight Therapeutics. Dr. Rohde has received grant or research support from, served as a consultant to, and served on the speakers’ bureau of Abdi Ibrahim, Abbott, Aché, Adium, Apsen, Bial, Cellera, EMS, Hypera Pharma, Knight Therapeutics, Libbs, Medice, Novartis/Sandoz, Pfizer/Upjohn/Viatris, Shire/Takeda, and Torrent in the last three years. The ADHD and Juvenile Bipolar Disorder Outpatient Programs chaired by Dr. Rohde have received unrestricted educational and research support from the following pharmaceutical companies in the last three years: Novartis/Sandoz and Shire/Takeda. Dr. Rohde has received authorship royalties from Oxford Press and ArtMed. Dr. Pan received payment or honoraria for lectures and presentations in educational events for Sandoz, Daiichi Sankyo, Eurofarma, and Abbot. Dr. Choi-Kain receives royalties from the American Psychiatric Association and Springer, serves as a consultant for Tetricus Labs, and has served as a consultant for Boehringer Ingelheim. The other authors declare no conflicts of interest.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information - Psychopathology Profiles and Longitudinal Correlates of Nonsuicidal Self-Injury in Youth: A Machine-Learning Approach
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Croci, M.S., Brañas, M.J., Finch, E.F. et al. Psychopathology profiles and longitudinal correlates of nonsuicidal self-injury in youth: a machine-learning approach. Transl Psychiatry 16, 99 (2026). https://doi.org/10.1038/s41398-026-03832-x
DOI: https://doi.org/10.1038/s41398-026-03832-x






