Abstract
Borderline Personality Disorder (BPD) is a severe mental disorder marked by emotional dysregulation. Estimates show that 73% of patients with BPD will have, on average, three suicide attempts in their lifetime, with up to 10% of cases resulting in death. Reliable tools to identify risk factors associated with suicide are lacking. Artificial Intelligence (AI) could fill this gap, supporting the development of effective intervention strategies. This pilot study provides preliminary evidence that a multimodal signature could differentiate suicide attempts in individuals with BPD, paving the way to prospective cohort validation and clinical applications. We developed DRAMA-BPD (Detecting Retrospective suicide Attempts with Machine learning Approaches in Borderline Personality Disorder), an explainable, multimodal, Machine Learning (ML) model based on an ensemble classifier of lifetime suicide attempters among people with BPD. DRAMA-BPD was trained on the sociodemographic, clinical, and MRI data of 104 individuals with BPD recruited from two cohorts. Processing techniques adopted included feature extraction. SHapley Additive exPlanations (SHAP) was used to assess model interpretability. DRAMA-BPD achieved a balanced accuracy of 0.68, sensitivity of 0.58, specificity of 0.77, and AUC of 0.68. SHAP analysis identified cortical volumes and thickness from T1-weighted images and Symptoms Checklist 90 Revised (SCL-90-R) as the main contributors to classification.
Introduction
Borderline Personality Disorder (BPD), a severe mental disorder marked by emotional dysregulation and affect instability, has a prevalence ranging from 0.7% to 2.7% in the general population, with higher occurrence in women1,2. It has been estimated that 73% of patients with BPD will have approximately 3 suicide attempts in their lifetime, with up to 10% of cases resulting in death3,4,5,6. Key risk factors for suicidal behaviors in BPD include impulsivity, self-harm, depressive symptoms, and emotional dysregulation3,7,8. While suicide prevention is one of the most important challenges in psychiatric clinical practice, in the field of BPD this aspect is particularly relevant. Indeed, BPD patients often show chronic suicidal ideation that varies in intensity over time, sometimes in association with stressful life events. Suicidal thoughts are very frequent and are not useful in predicting suicidal actions4. Given the complex interplay between these factors, identifying those patients who might be more prone to suicidal attempts remains a crucial challenge for prevention and effective intervention9. Ribeiro et al.10 found that prior self-harm and suicide attempts increase the risk of later attempts, but contribute only marginally to diagnostic accuracy above chance. For this reason, developing tools that can identify signatures of suicide attempt from multimodal data while providing feature-level explanations may represent an important initial step toward clinically useful, validated risk-assessment instruments.
Machine Learning (ML) algorithms for identifying suicide attempters (SAs) versus non-suicide attempters (NAs) have been proposed. A review by Pigoni et al.11 showed that most ML models achieved an accuracy of 0.7 or higher. Fortaner-Uyà et al.12 developed a model to predict suicide attempts and tested it in a longitudinal study, reaching an accuracy of 0.64 and AUROC of 0.7. However, several methodological issues may affect the generalizability of these ML models. One major limitation is overfitting, which occurs when a model is excessively tailored to the training dataset, reducing its ability to perform well on unseen data. As an example, the study by Horvath et al.13 used a high number of input features (29 features on 353 records), a factor that may have limited the model’s reliability14. Another issue is the lack of independent validation, which is essential to assess a model’s generalizability beyond the initial training dataset. Many studies fail to incorporate an independent test set, making it difficult to evaluate real-world applicability15,16. Additionally, some studies suffer from class imbalance, meaning that the dataset contains significantly more samples from one class than another, which may introduce biases in performance evaluation17. For instance, the models developed by Su et al.18 and Iorfino et al.19 rely on unbalanced datasets, and while some strategies can be used to mitigate this issue, model performance can still be affected. The studies cited above suggest that rigorous validation approaches are often not adopted, raising questions about the overall applicability of their findings.
Given the lack of a validated BPD-specific tool to identify individuals with a history of suicide attempt, this pilot study tests the hypothesis that a multimodal signature can distinguish lifetime suicide attempters among people with BPD. To explore this idea and its methodological feasibility as a proof-of-concept (POC) for a larger study, we developed DRAMA-BPD (Detecting Retrospective suicide Attempts with Machine learning Approaches in Borderline Personality Disorder), a multimodal, eXplainable Artificial Intelligence (XAI) tool built on a classifier for lifetime suicide attempts among persons with BPD. DRAMA-BPD, trained on the sociodemographic, clinical, and Magnetic Resonance Imaging (MRI) data of 104 individuals with BPD recruited from two cohorts, was designed to overcome most of the limitations previously mentioned, specifically:
- Class imbalance is avoided since our dataset is natively balanced with 47/104 SAs (45%) and 57/104 NAs (55%);
- Overfitting was mitigated by reducing the number of features by means of an extraction process;
- The lack of external validation was not resolved, as we were not able to find an independent compatible dataset. We highlight this as one of the limitations of our study.
While individual techniques are established, their integration represents a methodologically rigorous framework designed to overcome prior limitations and establish feasibility for prospective validation studies. Our approach also aims to evaluate the feasibility of analytic procedures not widely applied in this literature (e.g., MRI harmonization, feature extraction, and ensemble modelling). SHapley Additive exPlanations (SHAP) analysis20 was used to interpret feature contributions, making DRAMA-BPD explainable. In light of these considerations, the primary goals of this work are:
- Evaluate the feasibility of deriving an interpretable multimodal signature associated with past suicide attempts;
- Generate preliminary performance estimates to motivate prospective external validation.
Methods
Study population
DRAMA-BPD was trained on two cross-sectional samples deriving from previous studies. The CLIMAMITHE study was a multicenter randomized clinical trial (NCT02370316) conducted at two Italian centers, aiming to assess clinical and neurobiological effects of Metacognitive Interpersonal Therapy compared with Structured Clinical Management on 60 individuals with BPD21,22. Participants included adults aged 18–45 years who met the Diagnostic and Statistical Manual of Mental Disorders, version IV (DSM-IV)23 criteria for BPD, as DSM-5-based versions of the instruments were not yet implemented at the time the research protocol was developed. However, the diagnostic criteria for BPD remained virtually unchanged between DSM-IV and DSM-5, ensuring full conceptual and operational comparability24. For the purpose of this study, participants were divided based on the presence of lifetime suicide attempts recorded in the anamnestic interview and the Structured Clinical Interview for DSM-IV for Personality Disorder (SCID II) (30 SAs and 30 NAs). The CLIMAMITHE study also includes MRI data, consisting of T1-weighted 3D (T13D), Diffusion Tensor Imaging (DTI), and Fluid-Attenuated Inversion Recovery (FLAIR) sequences acquired using a Siemens Skyra 3 T scanner at the Hospital Spedali Civili of Brescia. The CLIMAMITHE protocol was approved by the ethical committee of the coordinating center (Protocol number 67/2014), and informed consent was obtained from all subjects and/or their legal guardian(s).
The SUDMEX_CONN dataset is a case–control study of individuals with cocaine use disorder (ethical number CEI/C/061/2013), and comprises 145 participants who underwent extensive neuropsychiatric assessments25,26,27. Additionally, MRI sequences were acquired, including T13D, 10-min resting-state fMRI, and High Angular Resolution Diffusion Imaging—Diffusion-Weighted Imaging Multishell, acquired with a Philips Ingenia, 3 T scanner at the “National Institute of Psychiatry” in Mexico City. The SCID II scale was used to diagnose BPD. The C9 item from the Mini-International Neuropsychiatric Interview28 scale was used to identify SAs. Participants with available MRI data were then selected, resulting in a sample of 44 individuals (17 SAs and 27 NAs). The study was carried out according to the Declaration of Helsinki and was approved by the Ethics Committee of the Instituto Nacional de Psiquiatría “Ramón de la Fuente Muñiz”.
A psychiatrist (AM) and a psychologist (SM), both with extensive experience in the assessment and treatment of individuals with BPD, evaluated the compatibility of the two datasets. We then merged them to increase sample size and statistical power of our exploratory estimates. As the datasets contained different feature sets, only overlapping ones were retained to ensure consistency and minimize biases introduced by missing data imputation. For this reason, only T13D and DTI were selected among MRI features. To mitigate the lack of an independent test set and reduce overfitting risk, we employed well-established methods for performance estimation in small datasets (see Classifier pipeline for further details). Merging two populations also introduces heterogeneity in the features that, after harmonization and control for site effects, can improve the generalizability of multimodal signatures.
Sociodemographic and clinical data
Sociodemographic and clinical data are presented in Table 1. Specifically, the following clinical evaluations were available:
- Difficulties in emotion regulation scale (DERS)29,30, a self-administered questionnaire used to assess changes in emotion regulation;
- Barratt impulsiveness scale (BIS)31, a 30-item self-administered questionnaire designed to assess impulsivity;
- Symptoms checklist 90 revised (SCL-90-R)32, which assesses general psychopathology. This self-administered inventory comprises 90 items across nine symptom dimensions (somatization, obsessive–compulsive tendencies, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation, and psychoticism);
- Structured clinical interview for DSM (SCID), a semi-structured interview to assess major DSM diagnoses. Both datasets used SCID II based on DSM-IV33. Specifically, we used eight of the nine diagnostic criteria for BPD, excluding criterion 5 (Recurrent suicidal behavior, gestures, or threats, or self-mutilating behavior) because of its high correlation with the dependent variable of the classifier.
MRI data
Table 2 reports details about the acquisition equipment for the datasets. MRI scans were processed using standard pipelines to extract structural and connectivity biomarkers, including White Matter hypointensity (WM-hypo) volumes, indicative of brain lesions, from T13D images. Specifically: Freesurfer (v7.3.2) was employed to compute cortical thickness and subcortical volumes based on the Desikan-Killiany atlas34, while TRActs Constrained by UnderLying Anatomy (TRACULA, v7.3.2) provided Mean Diffusivity and Fractional Anisotropy measures for major white-matter tracts.
T1-weighted 3D MRI scans were corrected for smooth intensity variations using N4BiasFieldCorrection from Advanced Normalization Tools (ANTs)35 and pre-registered to the MNI_152_T1_1mm template using the FMRIB Software Library (FSL) flirt function with 12 Degrees of Freedom (DOF)36,37,38,39. Then, recon-all function from Freesurfer36,37 was used to extract cortical thicknesses and subcortical volumes of regions defined according to the Desikan-Killiany atlas. The segmentThalamicNuclei function40 extracted thalamic nuclei subunit volumes, and the segmentHA_T1 function40,41 extracted hippocampus and amygdala subfields volumes.
The pre-processing of DTI data involved initial noise removal using the dwidenoise function, followed by the elimination of Gibbs artifacts using the mrdegibbs function, both tools from the MRtrix3 package42. Subsequently, the dwifslpreproc function (also from MRtrix3) was applied to estimate and correct susceptibility-induced distortions and to remove potential eddy current artifacts. Finally, DTI images were corrected for smooth intensity variations using the dwibiascorrect function and registered to the JHU-ICBM-DWI-2 mm template using flirt with 12 DOF44. Because some original DTI images from SUDMEX_CONN exhibited low quality that could affect the registration to the MNI template, we introduced a manual 9 DOF registration of the DTI to the JHU-ICBM-DWI-2 mm template between the eddy current removal and the proper flirt registration. TRACULA was then used for reconstructing a set of 42 WM pathways from DTI images, allowing the extraction of Fractional Anisotropy and Mean Diffusivity of WM pathways43,44. Image processing pipelines were uniformly applied to both CLIMAMITHE and SUDMEX_CONN datasets, as detailed in Supplemental Information. MRI-derived data underwent quality control (QC) by three expert neuroscientists (AB, SDF, AR), and poor-quality scans or segmentation errors were excluded (DTI data from 2 subjects were discarded). Neuroimaging data were harmonized using the NeuroHarmonize model45, with CLIMAMITHE set as the reference due to the elevated prevalence of cocaine use in SUDMEX_CONN (~ 80%). Given that cocaine abuse among BPD individuals typically ranges from 18 to 34%46,47, and CLIMAMITHE showed a cocaine abuse prevalence of 25%, CLIMAMITHE was selected as more representative of the general BPD population. Moreover, since this feature differed significantly between the two cohorts (p-value < 0.05), we included the correction of this factor in the harmonization process.
Classifier pipeline
The combined dataset included 104 individuals (47 SAs, 57 NAs) that passed QC and 345 candidate features. Performance was estimated with a stratified ten-fold cross-validation, as it offers a favorable balance between bias and variance in performance estimation. To prevent information leakage, preprocessing, feature extraction, and hyperparameter tuning were carried out exclusively within the training folds and then applied to the corresponding test folds. Preprocessing included encoding of categorical variables, K-Nearest-Neighbour imputation, and min–max scaling to the [0,1] range. Feature extraction, essential to reduce overfitting risk, was performed in two stages. First, we identified the most relevant features using Random Forest (RF) ranking48 (number of estimators = 1000, importance measure = Mean Decrease in Impurity), keeping the 50 features with the highest scores. Then, we removed highly correlated variables using the Variance Inflation Factor (VIF)49 (threshold = 5), dropping one variable at a time and recomputing the VIF after each drop. To assess the stability of the process, feature extraction was repeated across folds50; the ten features most consistently selected across folds were retained. As our objective was to preserve feature interpretability, we avoided feature extraction methods that combine original variables into latent components, because they prevent direct attribution of model classifications to specific features; one example is Principal Component Analysis (PCA)51, which only supports partial interpretability. The DRAMA-BPD classifier was implemented by testing several ML models: (1) Support Vector Machine (SVM)52, Random Forest, Naive Bayes (NB)53, XGBoost54, LightGBM55, CatBoost56, and Regularized Logistic Regression57, all tested as individual models; (2) an ensemble of SVM, RF, and NB; and (3) an ensemble of XGBoost, LightGBM, and CatBoost.
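The stability step described above (retaining the features most consistently selected across folds) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `most_stable_features`, the per-fold lists, and the alphabetical tie-breaking rule are all hypothetical, and the `age` entry is a made-up placeholder feature.

```python
from collections import Counter


def most_stable_features(selected_per_fold, k):
    """Return the k features selected in the largest number of folds.

    selected_per_fold: one list of feature names per cross-validation fold.
    Ties are broken alphabetically (an assumption made for determinism).
    """
    # Count each feature at most once per fold.
    counts = Counter(name for fold in selected_per_fold for name in set(fold))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical selections from three folds (the study used ten folds).
folds = [
    ["RH_rACC", "LH_PRS", "SCL-90-R"],
    ["RH_rACC", "WM-hypo", "LH_PRS"],
    ["RH_rACC", "LH_PRS", "age"],
]
stable = most_stable_features(folds, k=2)
```

Counting selection frequency rather than averaging importance scores keeps the criterion independent of the scale of each fold's ranking, which is one reasonable reading of "most consistently selected".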
All these models have been used previously in similar contexts: SVM in58,59, RF in12,17,18, NB in58,60, XGBoost in61, LightGBM in62, CatBoost in63, and regularized logistic regression in64. We then selected the best-performing model, meaning the one with the highest cross-validated performance metrics, which was the ensemble of SVM, RF, and NB. While these models represent established techniques in psychiatric classification tasks, their integration here is specifically calibrated for BPD-suicide classification with interpretability prioritization. This combination addresses the known gap in BPD-specific ML literature, which has been dominated by single-modality approaches. No covariates were applied in the training process of the models. After training, we applied the SHAP toolkit post-hoc to evaluate feature relevance and reciprocal effects in the classification process, with the goal of understanding the model’s decisions and overcoming the current ML black-box approach. SHAP values were computed on held-out test sets and aggregated across splits. Finally, we performed a power analysis by means of Nx Subsampling65 to estimate the sample size required to achieve the target accuracy.
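A leakage-free version of this kind of pipeline can be sketched with scikit-learn by wrapping every fitted step in a `Pipeline`, so that imputation, scaling, and feature ranking are re-fitted on the training folds only. The sketch below is an illustration under assumptions rather than the study's code: it runs on synthetic data, uses 100 trees instead of 1000 to stay fast, assumes soft voting and default hyperparameters, keeps ten features directly from the RF ranking, and omits the VIF filter and hyperparameter tuning.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import KNNImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data: 104 subjects, roughly 55/45 classes.
X, y = make_classification(n_samples=104, n_features=60, n_informative=6,
                           weights=[0.55, 0.45], random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.02] = np.nan  # a few missing values to impute

# Ensemble of SVM, RF, and NB; soft voting is an assumption of this sketch.
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)

pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),
    ("scale", MinMaxScaler()),  # min-max scaling to [0, 1]
    # Keep the top 10 features by RF impurity-based importance.
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        max_features=10, threshold=-np.inf)),
    ("clf", ensemble),
])

# Every pipeline step is fitted on the training folds only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(pipeline, X, y, cv=cv,
                        scoring=["balanced_accuracy", "roc_auc"])
mean_bal_acc = scores["test_balanced_accuracy"].mean()
```

Because the selector sits inside the pipeline, the set of retained features can differ from fold to fold, which is why a separate stability criterion is needed to arrive at one final feature set.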
Statistical analysis
Statistical comparisons between variables were performed via the Kruskal–Wallis test, a non-parametric statistical method suitable for cases where the assumptions of normality and homogeneity of variances are not met. A significance level of 0.05 was adopted for statistical comparisons66,67. The code used in this study was implemented in Python. Specifically, statistical analyses were conducted with statsmodels (0.13.5) and scipy (1.7.3), ML models were implemented with scikit-learn (1.3.0), and SHAP analysis was performed using the shap library (0.41.0).
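For readers unfamiliar with the test, the following standard-library sketch computes the tie-corrected Kruskal–Wallis H statistic for the two-group case (SAs versus NAs). It is an illustration only: the study used scipy, where `scipy.stats.kruskal` performs the equivalent computation, and the function name and the toy samples below are hypothetical. With two groups the statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail equals `erfc(sqrt(H / 2))`.

```python
import math


def kruskal_wallis_two_groups(a, b):
    """Tie-corrected Kruskal-Wallis H and p-value for two independent samples."""
    pooled = sorted((value, group) for group, sample in enumerate((a, b))
                    for value in sample)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sums = [0.0, 0.0]
    for (_, group), rank in zip(pooled, ranks):
        rank_sums[group] += rank

    sizes = (len(a), len(b))
    h = 12 / (n * (n + 1)) * sum(rs ** 2 / m for rs, m in zip(rank_sums, sizes))
    h -= 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)  # correction for tied values
    p_value = math.erfc(math.sqrt(h / 2))  # chi-square upper tail, 1 df
    return h, p_value


# Toy samples, not study data.
h_stat, p_val = kruskal_wallis_two_groups([1, 2, 3], [4, 5, 6])
```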
Results
Sociodemographic and clinical features of the sample
Sociodemographic and clinical features of the dataset are shown in Table 1. Statistically significant differences (p < 0.05) emerged for several variables. SAs reported a higher number of lifetime acute ward admissions, were more likely to have experienced at least one acute psychiatric hospitalization, engaged more frequently in self-harming behaviors, and more commonly endorsed a sense of emptiness, a core BPD criterion.
Classifier pipeline
We initially aimed to reduce the number of features to ten, but results showed that using fewer features led to similar performance. Therefore, we opted for a total of four features, for the sake of simplicity. The most relevant features were derived from T1-weighted MRI and clinical data, while sociodemographic and DTI-derived features did not contribute to the model:
- Thickness of the right hemisphere rostral anterior cingulate (RH_rACC)
- Volume of the left hemisphere presubiculum (LH_PRS)
- Volume of white matter hypointensity (WM-hypo)
- Symptoms checklist 90 revised (SCL-90-R)
Table 3 shows data related to the relevant features of the two groups.
Table 3. Averages and standard deviations of the most relevant features of Suicide Attempters and Non-Attempters involved in this study. Values in brackets indicate the number of participants for whom the data were available. An asterisk marks features with a significant p-value (< 0.05). Acronyms: RH_rACC, Right Hemisphere Rostral Anterior Cingulate; LH_PRS, Left Hemisphere Presubiculum; SCL-90-R, Symptoms Checklist 90 Revised; WM-hypo, White Matter hypointensities.
The best-performing ensemble classifier yielded the following results over ten-fold cross-validation: accuracy = 0.67 (95% CI 0.63–0.71), sensitivity = 0.58 (95% CI 0.53–0.63), specificity = 0.77 (95% CI 0.67–0.86), and Area Under the ROC Curve (AUC) = 0.68 (95% CI 0.63–0.72). The balanced accuracy is 0.68 (95% CI 0.63–0.72). Performance metrics for each class are as follows:
- Class 1 (SA): sensitivity = 0.58 (95% CI 0.53–0.63), specificity = 0.77 (95% CI 0.67–0.86), PPV = 0.69 (95% CI 0.59–0.81), NPV = 0.67 (95% CI 0.62–0.74)
- Class 0 (NA): sensitivity = 0.77 (95% CI 0.67–0.86), specificity = 0.58 (95% CI 0.53–0.63), PPV = 0.67 (95% CI 0.62–0.74), NPV = 0.69 (95% CI 0.59–0.81)
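The per-class figures above mirror each other because swapping the positive class exchanges sensitivity with specificity and PPV with NPV, while balanced accuracy is the mean of sensitivity and specificity: (0.58 + 0.77) / 2 = 0.675, in line with the reported 0.68. The short sketch below makes these relations explicit. The confusion-matrix counts are hypothetical values chosen only to be compatible with a 47/57 split; they are not the study's results, and the function name is illustrative.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard metrics from the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical counts: 47 attempters (27 detected), 57 non-attempters (44 detected).
sa_view = binary_metrics(tp=27, fn=20, tn=44, fp=13)  # SA as the positive class
na_view = binary_metrics(tp=44, fn=13, tn=27, fp=20)  # NA as the positive class

# Balanced accuracy implied by the reported sensitivity and specificity.
reported_balanced_accuracy = (0.58 + 0.77) / 2
```

Note that accuracy and balanced accuracy are unchanged by the choice of positive class, which is why a single value is reported for each.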
Figure 1 shows the ROC curves of every fold and reports the average one. Power analysis indicated that the target accuracy of 0.85, in line with reliable clinical tools68, would be reached with a dataset 8 times larger. Further details can be found in Supplemental Information.
Feature relevance
We analyzed the SHAP summary plot to evaluate feature importance (Fig. 2). Features are ranked by relevance, with RH_rACC thickness identified as the most influential, followed by LH_PRS volume, WM-hypo volume, and SCL-90-R. Each point represents an individual’s feature value, where color denotes feature values (red: high, blue: low) and the x-axis indicates impact on model output. High RH_rACC thickness values cluster on the right, indicating strong association with SA classification. A similar pattern was observed for SCL-90-R and WM-hypo volume, both suggesting increased SA likelihood. In contrast, higher LH_PRS volumes were associated with NA classification.
Figure 3 shows the scaled feature values and corresponding SHAP contributions for a True Negative (TN) classification, i.e. a NA correctly classified as a NA. The DRAMA-BPD base expectation for the output is 0.404. Low LH_PRS volume (scaled value = 0.127, on a range from 0 to 1) has the strongest positive contribution (+0.21), pushing the classification towards the SA class. Conversely, low RH_rACC thickness (scaled value = 0.296) contributes negatively (−0.14), pushing the classification towards the NA class. WM-hypo volume (scaled value = 0.209) and SCL-90-R (scaled value = 0.296) show smaller contributions (−0.04 and +0.01, respectively). The combined effect of these features yields a final value of 0.441, corresponding to a correct NA classification. Figure 4 presents a similar analysis for a True Positive (TP) classification, i.e. a SA correctly classified as a SA. In this case, high RH_rACC thickness (scaled value = 0.794) strongly supports SA classification with a positive contribution (+0.34). SCL-90-R (scaled value = 0.140), LH_PRS volume (scaled value = 0.534), and WM-hypo volume (scaled value = 0.228) push the classification towards the NA class with small contributions (−0.05, −0.03, and −0.02, respectively). The combined effect of these features yields a final value of 0.644, corresponding to a correct SA classification. These examples illustrate how individual features influence classification outcomes and highlight potential biases in the model’s decision-making process.
Figure 3. Contribution of each feature to the correct classification of a NA subject (True Negative). The numbers to the left of the feature names are their scaled values, while values on the red/blue bars are the SHAP weights. Acronyms: RH_rACC, Right Hemisphere Rostral Anterior Cingulate; LH_PRS, Left Hemisphere Presubiculum; WM-hypo, White Matter hypointensities; SCL-90-R, Symptoms Checklist 90 Revised; NA, Non-Attempter.
Figure 4. Contribution of each feature to the correct classification of a SA subject (True Positive). The numbers to the left of the feature names are their scaled values, while values on the red/blue bars are the SHAP weights. Acronyms: RH_rACC, Right Hemisphere Rostral Anterior Cingulate; LH_PRS, Left Hemisphere Presubiculum; WM-hypo, White Matter hypointensities; SCL-90-R, Symptoms Checklist 90 Revised; SA, Suicide Attempter.
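The two worked examples rely on the additivity property of SHAP: the model output for one individual equals the base expectation plus the sum of that individual's feature contributions. The sketch below checks this using the rounded values quoted in the text; the function name is illustrative. For the True Positive the sum reproduces 0.644, while for the True Negative it gives 0.444 against the reported 0.441, a gap that presumably reflects rounding of the displayed contributions to two decimals.

```python
def shap_output(base_value, contributions):
    """Additivity: model output = base value + sum of SHAP contributions."""
    return base_value + sum(contributions.values())


BASE_VALUE = 0.404  # base expectation reported for DRAMA-BPD

# Rounded contributions as quoted in the text for each example.
true_negative = {"LH_PRS": 0.21, "RH_rACC": -0.14, "WM-hypo": -0.04, "SCL-90-R": 0.01}
true_positive = {"RH_rACC": 0.34, "SCL-90-R": -0.05, "LH_PRS": -0.03, "WM-hypo": -0.02}

tn_output = shap_output(BASE_VALUE, true_negative)
tp_output = shap_output(BASE_VALUE, true_positive)
```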
Discussion
The goal of this study was to demonstrate that a multimodal approach, combined with robust data processing and ML techniques, can classify lifetime suicide attempters in people with BPD. To do this, we used sociodemographic, clinical, and MRI data from the CLIMAMITHE and SUDMEX_CONN studies to train the DRAMA-BPD model, an ensemble classifier of lifetime suicide attempters among people with BPD. Feature selection was applied to reduce the number of features and limit overfitting. The process was data-driven, and thus the importance of each feature was calculated by the RF model (see Methods for further details). Interestingly, the present study showed that neuroimaging has a key role in this classification. Indeed, among the 4 most influential variables, 3 are related to neuroimaging. DTI-derived features did not contribute to the model. This could be because DTI-derived features are sensitive to acquisition parameters69,70,71,72 and motion artifacts73, causing a limited classification contribution in this specific setup. SHAP was applied post-hoc to help interpret the results of the classifier.
Right hemisphere rostral anterior cingulate cortex
DRAMA-BPD’s explainability analysis using SHAP identified Right Hemisphere Rostral Anterior Cingulate Cortex (RH_rACC) thickness as the top contributor to model output. The Rostral Anterior Cingulate Cortex (rACC) is a medial frontal subregion implicated in emotion regulation, cognitive control, and decision-making74. Literature suggests a potential link between rACC structural alterations and suicidality in BPD. Soloff et al.75 reported reduced rACC Gray Matter (GM) in suicidal BPD individuals versus Healthy Controls (HCs), but found no rACC difference between BPD SAs and NAs, whereas an increased volume has been associated with aggression in high-lethality attempters76. Duarte et al.77 reported an association between increased rACC volume and suicide attempts in patients with bipolar disorder type 1, suggesting a potential compensatory mechanism. In DRAMA-BPD, higher rACC thickness produced positive SHAP attributions, thereby increasing the probability estimate for the SA class. While this result converges with reports linking rACC morphology to suicidal behaviour, findings remain inconsistent across studies, and further research is required to clarify these discrepancies. Possible explanations for increased GM volumes or thickness are neuroinflammatory processes, inefficient synaptic pruning mechanisms during neurodevelopment, or neuroplasticity-driven adaptations in response to early-life adversity, as discussed in78. The authors suggest that hyperactivity of the ACC and insula may lead to changes in brain plasticity and hypertrophy as compensatory mechanisms79,80,81. It should be noted that studies77 and78 focused on volumetric analysis of the whole rACC, while in our case the feature extraction process identified rACC thickness, limiting the direct comparability of the findings.
Left hemisphere presubiculum
The presubiculum, a subregion of the parahippocampus, is implicated in memory formation82. Several studies have documented hippocampal volume alterations in BPD. Brambilla et al.83 reported reduced hippocampal volumes, particularly in BPD subjects with childhood abuse, while Rossi et al.84 localized significant differences mainly to the CA1 sector (with differences reaching up to 10–20%). These studies used different field strengths (1.5 T) and segmentation protocols than ours (3 T), limiting direct comparison. Bøen et al.85 (3 T) analyzed and identified significant volume reductions in the dentate gyrus and cornu ammonis (CA), though the all-female sample used may limit generalizability. In a similar sample, O’Neill et al.86 observed reductions in the left hippocampal head and body, as well as the bilateral hippocampal tail. Ruocco et al.87 conducted a meta-analysis of 11 MRI studies, comprising 205 individuals with BPD and 222 HCs, revealing an average 11% reduction in hippocampal volume, a difference that was unaffected by psychotropic medication use. Our data revealed a significant difference in Left Hemisphere Presubiculum (LH_PRS) volume, with SAs showing lower average volumes than NAs. SHAP attributions show that lower LH_PRS volumes increase the probability that DRAMA-BPD classifies an individual as an SA. In summary, there is strong evidence to support smaller hippocampal volume as a marker for BPD, although studies investigating its direct relationship with suicidal behavior are lacking.
White-Matter hypointensities
Hypointensities in T1-weighted images, corresponding to hyperintensities in T2-weighted or FLAIR sequences, typically represent WM lesions. These can include various pathologic features or anatomic structures, such as calcifications, hemorrhages, gliosis, and scar tissues88, which are common findings in aging and dementia, but evidence of higher WM hypointensities in BPD compared to controls is lacking. This may denote incidental findings without clinical relevance, explaining why this variable contributes marginally to the model. Nevertheless, Grangeon et al.89 conducted a meta-analysis of MRI studies, and concluded that individuals with deep WM hyperintensities and periventricular hyperintensities have a higher association with suicidal behaviors, even though the underlying mechanisms remain unclear. Pompili et al.90 also showed that periventricular hyperintensities are strongly associated with suicide attempts. In our sample, larger volumes of WM-hypointensities were observed in SAs. SHAP attributions indicate that larger WM-hypo volume modestly increased the DRAMA-BPD probability of SA. In light of this, our findings are partially consistent with literature, since WM-hypointensities are associated with suicide attempts. However, a study that investigates this aspect specifically in the BPD population is lacking. This emphasizes the need for deeper investigation and analysis of how WM-hypo volume influences classification, and the SHAP-based association identified here should be considered hypothesis-generating.
Symptoms Check-list 90 Revised
The SCL-90-R is not specifically designed to assess BPD symptomatology and does not include items specific to suicide. However, it captures general trait-psychopathology across multiple domains, several of which are relevant to suicidal behavior32. These include depression, focusing on symptoms generally associated with an increased suicide risk (hopelessness, sadness, feelings of worthlessness), hostility and aggression (potentially linked to suicide in individuals with impulsivity or emotional regulation difficulties), anxiety, and psychoticism. In this study, we used the total score of the SCL-90-R scale as a feature for our dataset. Nevertheless, to compare our results with literature, we analyzed the subscales for statistical differences. Consistently with Lee et al.91 (Korean SCL-90-R92), we observed higher hostility and paranoia scores among SAs, although these differences did not reach statistical significance in our sample. SCL-90-R contributed modestly to DRAMA-BPD classifications according to SHAP, with higher total scores pushing the model toward a SA classification.
Summary and result interpretation
DRAMA-BPD shows moderate predictive performance, with a balanced accuracy of 0.68 (95% CI 0.63–0.72) and AUC of 0.68 (95% CI 0.63–0.72). While the confidence intervals for sensitivity (0.58, 95% CI 0.53–0.63) and other metrics indicate some overlap and uncertainty, particularly in identifying TPs, the model shows a clear signal above chance, especially for specificity (0.77, 95% CI 0.67–0.86). The moderate predictive performance could be mainly explained by the intrinsic complexity of predicting suicide attempts in BPD and limited sample size. Nevertheless, these results suggest that the classifier captures meaningful patterns in the data, proving that a multimodal signature could identify suicide attempters. These results are preliminary: the sample size is modest (N = 104), cohorts were cross-sectional and merged across sites, and no external prospective validation is available. Therefore, we frame our findings as hypothesis-generating and evidence that may support future works. Interestingly, the feature extraction process identified one clinical feature, while the remaining were all MRI-derived. This represents a potential change of perspective with regards to existing literature, which mostly relied on sociodemographic and clinical data only. However, this pattern should be interpreted cautiously and validated in independent datasets. ML model explainability can support the interpretability and potential clinical usefulness. This may be especially relevant in complex conditions such as BPD, where medical decisions may be influenced by both biological and behavioral data. By using SHAP, we identified the influence of single features on DRAMA-BPD, allowing us to move beyond black-box predictions to a more interpretable approach. This factor is especially important in clinical practice, where understanding a model’s prediction may help the clinician’s decision93,94. 
However, SHAP-based interpretability does not substitute for neurobiological investigations, which require independent validation and targeted multimodal studies. This evidence therefore provides only an initial foundation for XAI in BPD, and larger and more diverse datasets are needed to support clinical use.
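As a consistency check on the figures above, balanced accuracy for a binary classifier is the mean of sensitivity and specificity, and 0.5 is the chance level for both balanced accuracy and AUC. The minimal Python sketch below recomputes balanced accuracy from the reported point estimates and checks the reported lower confidence bounds against chance. The function names are illustrative assumptions and are not taken from the DRAMA-BPD code.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity (true-positive rate) and specificity (true-negative rate)."""
    return (sensitivity + specificity) / 2.0


def exceeds_chance(ci_lower: float, chance_level: float = 0.5) -> bool:
    """True when the lower confidence bound lies strictly above the chance level."""
    return ci_lower > chance_level


# Point estimates reported in the text.
reported_sensitivity = 0.58
reported_specificity = 0.77

# Lower bounds of the reported 95% confidence intervals.
lower_bounds = {"balanced_accuracy": 0.63, "auc": 0.63}

ba = balanced_accuracy(reported_sensitivity, reported_specificity)
print(f"balanced accuracy from sensitivity and specificity: {ba:.3f}")

for metric, lower in lower_bounds.items():
    print(f"{metric}: lower 95% bound above chance = {exceeds_chance(lower)}")
```

The recomputed value agrees, up to rounding, with the reported balanced accuracy of 0.68, and both lower bounds lie above 0.5.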
Limitations
Overall, the robustness of the pipeline and validation strategy we employed reinforces the relevance of the identified brain regions in BPD, but some limitations must be addressed. The first is the small size of the analyzed dataset, although it is comparable with that of similar works in the literature. This is caused by several factors, most notably the difficulty of recruiting people with BPD and the cost of 3 T MRI procedures. Regarding the CLIMAMITHE dataset, the outpatient nature and long duration of the interventions required a high level of adherence and commitment, which implies a certain rate of refusals and dropouts. In addition, the comprehensive clinical and neuroimaging assessments required by the study protocol were time-consuming and demanding, which further reduced the pool of eligible and willing participants. Our experience of slower recruitment aligns with previous findings in the BPD literature. Woo et al.95 report that stigma, external referral pathways, and the high procedural burden are key barriers to recruitment and retention in BPD research. Regarding MRI post-hoc harmonization tools, we acknowledge that our sample size is relatively small for multisite neuroimaging96, potentially limiting the power of harmonization to fully eliminate site-specific variance, and the lack of external validation prevents us from determining whether the identified features generalize to new sites or scanners. Moreover, power analysis showed that to reach a target accuracy of 0.85, in line with reliable clinical tools, the dataset would need to be enlarged to 800 individuals with BPD (balanced between SAs and NAs). This suggests that future work should prioritize expanding the sample size and identifying and validating more informative biomarkers, potentially represented by Affect-Modulated Startle97. From an ML perspective, the lack of validation on an independent dataset may limit the generalizability of the model.
Future studies should incorporate comparisons with external, compatible datasets to assess the broader applicability of DRAMA-BPD. Regarding XAI analysis, SHAP describes associations learned by the model rather than causation. Models including these features appear to achieve higher accuracy, suggesting that the features may capture patterns relevant to the classification, but this does not establish a causal role. These results should therefore be treated as hypothesis-generating; replication in independent, prospective cohorts and targeted multimodal investigations are required before inferring biological mechanisms or clinical utility. An additional limitation is the lack of longitudinal validation for DRAMA-BPD. Indeed, our goal was not to train a suicide predictor; rather, it was to identify a multimodal signature that could potentially pave the way for developing tools that help clinicians forecast future suicidal behaviors. A longitudinal study could fill this gap, turning the present classifier into an actual predictor. Finally, future research could examine the interactions among the four identified features, optimize decision thresholds to reduce their influence on classification outcomes, and explore advanced modeling or feature-engineering strategies. We emphasize that this is a pilot exploratory analysis; our results therefore primarily inform feasibility and estimates for future prospective validation studies rather than immediate clinical deployment.
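To make concrete why SHAP values describe the model rather than the underlying biology, the sketch below computes exact Shapley values for a small hypothetical scoring function. It is neither the SHAP library nor the DRAMA-BPD model: `toy_model`, its coefficients, the input `x`, and the all-zero `baseline` are assumptions chosen only for illustration, and replacing absent features with a single reference value is one of several possible conventions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a single baseline input.

    Features outside a coalition take their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical score: one additive term plus one interaction term."""
    return 0.5 * features[0] + 0.3 * features[1] * features[2]


x = [1.0, 2.0, 1.0]          # hypothetical input
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input

phi = shapley_values(toy_model, x, baseline)
print("attributions:", [round(v, 6) for v in phi])
print("sum of attributions:", round(sum(phi), 6))
print("prediction minus baseline:", round(toy_model(x) - toy_model(baseline), 6))
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split equally between the two features involved. Both properties follow from how the model combines its inputs; they say nothing about whether a feature causes the outcome.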
Conclusions
Suicide is one of the leading causes of death among individuals with BPD, yet accurately identifying those at risk remains a major challenge. Existing ML approaches primarily focus on other psychiatric disorders and often suffer from methodological limitations, including overfitting and limited model interpretability. In this pilot study, we developed DRAMA-BPD, an ensemble classifier combining SVM, RF, and NB, trained on a cross-sectional, multimodal dataset formed by merging CLIMAMITHE and SUDMEX_CONN. To prioritize interpretability, we adopted a feature extraction strategy, with findings consistent with previous studies. SHAP was used to investigate feature contributions, providing an initial foundation for XAI in the classification of lifetime suicide attempts in BPD. Because of the modest sample size, cross-sectional design, and absence of independent prospective validation, we frame these results as hypothesis-generating. DRAMA-BPD illustrates that multimodal, interpretable ML can identify patterns associated with a lifetime history of suicide attempt in BPD, but further work is needed to improve overall performance through larger samples and enriched clinical predictors, and to validate findings in external and prospective cohorts. Until it is externally validated, DRAMA-BPD should be considered an exploratory tool that may help prioritize future research rather than a ready-to-use clinical instrument.
Data availability
The SUDMEX_CONN dataset is publicly available from OpenNeuro at https://openneuro.org/datasets/ds003037/versions/1.0.0. Access and reuse are subject to the terms and conditions posted on the OpenNeuro repository. The CLIMAMITHE dataset used in this study is available upon request through the NewPsy4U platform at https://newpsy4u.eu. Following registration, users may submit an access request via the platform interface, which will be reviewed and approved at the discretion of the PI, Roberta Rossi, last author of this paper.
References
Gunderson, J. G., Herpertz, S. C., Skodol, A. E., Torgersen, S. & Zanarini, M. C. Borderline personality disorder. Nat. Rev. Dis. Primers 4, 18029 (2018).
Bohus, M. et al. Borderline personality disorder. Lancet 398, 1528–1540 (2021).
Yen, S. et al. Association of borderline personality disorder criteria with suicide attempts: Findings from the collaborative longitudinal study of personality disorders over 10 years of follow-up. JAMA Psychiat. 78, 187–194 (2021).
Paris, J. Suicidality in borderline personality disorder. Medicina (Kaunas) 55, 223 (2019).
Guilé, J. M., Boissel, L., Alaux-Cantin, S. & de La Rivière, S. G. Borderline personality disorder in adolescents: Prevalence, diagnosis, and treatment strategies. Adolesc. Health Med. Ther. 9, 199–210 (2018).
Videler, A. C., Hutsebaut, J., Schulkens, J. E. M., Sobczak, S. & van Alphen, S. P. J. A life span perspective on borderline personality disorder. Curr. Psychiatry Rep. 21, 51 (2019).
Soloff, P. H., Lis, J. A., Kelly, T., Cornelius, J. & Ulrich, R. Risk factors for suicidal behavior in borderline personality disorder. Am. J. Psychiatry 151, 1316–1323 (1994).
Black, D. W., Blum, N., Pfohl, B. & Hale, N. Suicidal behavior in borderline personality disorder: Prevalence, risk factors, prediction, and prevention. J. Pers. Disord. 18, 226–239 (2004).
D’Aurizio, G. et al. The role of emotional instability in borderline personality disorder: A systematic review. Ann. Gen. Psychiatry 22, 9 (2023).
Ribeiro, J. D. et al. Self-injurious thoughts and behaviors as risk factors for future suicide ideation, attempts, and death: A meta-analysis of longitudinal studies. Psychol. Med. 46(2), 225–236. https://doi.org/10.1017/S0033291715001804 (2016).
Pigoni, A. et al. Machine learning and the prediction of suicide in psychiatric populations: A systematic review. Transl. Psychiatry 14, 140 (2024).
Fortaner-Uyà, L. et al. A longitudinal prediction of suicide attempts in borderline personality disorder: A machine learning study. J. Clin. Psychol. 81(4), 222–236. https://doi.org/10.1002/jclp.23763 (2025).
Horvath, A., Dras, M., Lai, C. C. & Boag, S. Predicting suicidal behavior without asking about suicidal ideation: Machine learning and the role of borderline personality disorder criteria. Suicide Life Threat Behav. 51, 455–466 (2021).
Berisha, V. et al. Digital medicine and the curse of dimensionality. NPJ Digit. Med. 4, 153 (2021).
Chekroud, A. M. et al. Illusory generalizability of clinical prediction models. Science 383, 164–167 (2024).
Cabitza, F. et al. The importance of being external: Methodological insights for the external validation of machine learning models in medicine. Comput. Methods Programs Biomed. 208, 106288 (2021).
Jeni, L. A., Cohn, J. F. & De La Torre, F. Facing imbalanced data: Recommendations for the use of performance metrics. In 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction 245–251 (IEEE, Geneva, Switzerland, 2013).
Su, R., John, J. R. & Lin, P. I. Machine learning-based prediction for self-harm and suicide attempts in adolescents. Psychiatry Res. 328, 115446 (2023).
Iorfino, F. et al. Predicting self-harm within six months after initial presentation to youth mental health services: A machine learning study. PLoS ONE 15, e0243467 (2020).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. of the 31st International Conference on Neural Information Processing Systems (NIPS’17) 4768–4777 (Curran Associates Inc., Red Hook, NY, 2017).
Rossi, R. et al. Metacognitive interpersonal therapy in borderline personality disorder: Clinical and neuroimaging outcomes from the CLIMAMITHE study—A randomized clinical trial. Personal. Disord. 14, 452–466 (2023).
Magni, L. R. et al. Neurobiological and clinical effect of metacognitive interpersonal therapy versus structured clinical model: Study protocol for a randomized controlled trial. BMC Psychiatry 19, 195 (2019).
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders: DSM-IV (American Psychiatric Association, 1994).
Skodol, A. E., Morey, L. C., Bender, D. S. & Oldham, J. M. The Alternative DSM-5 model for personality disorders: A clinical application. Am. J. Psychiatry 172(7), 606–613. https://doi.org/10.1176/appi.ajp.2015.14101220 (2015).
Garza-Villarreal, E. A. et al. SUDMEX_CONN: The Mexican dataset of cocaine use disorder patients. OpenNeuro https://doi.org/10.1038/s41597-022-01251-3 (2021).
Garza-Villarreal, E. A. et al. The effect of crack cocaine addiction and age on the microstructure and morphology of the human striatum and thalamus using shape analysis and fast diffusion kurtosis imaging. Transl. Psychiatry 7(5), e1122–e1122 (2017).
Rasgado-Toledo, J., Shah, A., Ingalhalikar, M. & Garza-Villarreal, E. A. Neurite orientation dispersion and density imaging in cocaine use disorder. Prog. Neuropsychopharmacol. Biol. Psychiatry 110474 (2021).
Sheehan, D. V. et al. The mini-international neuropsychiatric interview (M.I.N.I.): The development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J. Clin. Psychiatry 59(20), 22–57 (1998).
Giromini, L., Velotti, P., de Campora, G., Bonalume, L. & Cesare Zavattini, G. Cultural adaptation of the difficulties in emotion regulation scale: Reliability and validity of an Italian version. J. Clin. Psychol. 68(9), 989–1007. https://doi.org/10.1002/jclp.21876 (2012).
Chan, A. W. et al. SPIRIT 2013 explanation and elaboration: Guidance for protocols of clinical trials. BMJ 346, e7586 (2013).
Patton, J. H., Stanford, M. S. & Barratt, E. S. Factor structure of the Barratt impulsiveness scale. J. Clin. Psychol. 51(6), 768–774 (1995).
Derogatis, L. R. Symptom Checklist-90-R: Administrative Scoring and Procedures Manual (NCS Pearson, Minneapolis, 1994).
First, M. B., Gibbon, M., Spitzer, R. L., Williams, J. B. W. & Benjamin, L. S. The Structured Clinical Interview for DSM–IV Axis II Personality Disorders (SCID-II) (American Psychiatric Press, 1997).
Desikan, R. S. et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31(3), 968–980 (2006).
Ribaldi, F. et al. Accuracy and reproducibility of automated white matter hyperintensities segmentation with lesion segmentation tool: A European multi-site 3T study. Magn. Reson. Imaging 76, 108–115 (2021).
Fischl, B. et al. Whole brain segmentation: Automated labeling of neuroanatomical structures in the human brain. Neuron 33(3), 341–355 (2002).
Fischl, B. et al. Automatically parcellating the human cerebral cortex. Cereb. Cortex 14(1), 11–22 (2004).
Smith, S. M. et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23(Suppl. 1) (2004).
Woolrich, M. W. et al. Bayesian analysis of neuroimaging data in FSL. Neuroimage 45(1 Suppl.) (2009).
Iglesias, J. E. et al. A probabilistic atlas of the human thalamic nuclei combining ex vivo MRI and histology. Neuroimage 183, 314–326 (2018).
Saygin, Z. M. et al. High-resolution magnetic resonance imaging reveals nuclei of the human amygdala: Manual segmentation to automatic atlas. Neuroimage 155, 370–382 (2017).
Tournier, J. D. et al. MRtrix3: A fast, flexible and open software framework for medical image processing and visualisation. Neuroimage 202, 116137 (2019).
Maffei, C. et al. Using diffusion MRI data acquired with ultra-high gradient strength to improve tractography in routine-quality data. Neuroimage 245 (2021).
Yendiki, A. et al. Automated probabilistic reconstruction of white-matter pathways in health and disease using an atlas of the underlying anatomy. Front. Neuroinform. 5, 23 (2011).
Pomponio, R. et al. Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan. Neuroimage 208, 116450 (2019).
Kleinman, P. H. et al. Psychopathology among cocaine abusers entering treatment. J. Nerv. Ment. Dis. 178, 442–447 (1990).
Marlowe, D. B. et al. Impact of comorbid personality disorders and personality disorder symptoms on outcomes of behavioral treatment for cocaine dependence. J. Nerv. Ment. Dis. 185, 483–490 (1997).
Breiman, L. Random Forests. Mach Learn 45, 5–32 (2001).
Craney, T. A. & Surles, J. G. Model-dependent variance inflation factor cutoff values. Qual. Eng. 14, 391–403. https://doi.org/10.1081/QEN-120001878 (2002).
Parvandeh, S., Yeh, H. W., Paulus, M. P. & McKinney, B. A. Consensus features nested cross-validation. Bioinformatics (Oxford) 36(10), 3093–3098. https://doi.org/10.1093/bioinformatics/btaa046 (2020).
Maćkiewicz, A. & Ratajczak, W. Principal components analysis (PCA). Comput. Geosci. 19(3), 303–342 (1993).
Cortes, C. & Vapnik, V. Support-vector networks. Mach Learn 20, 273–297 (1995).
John, G. H. & Langley, P. Estimating continuous distributions in Bayesian classifiers. In Proc. of the 11th Conference on Uncertainty in Artificial Intelligence (UAI-95) 338–345 (1995).
Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, 2016).
Ke, G. et al. LightGBM: A highly efficient gradient boosting decision tree. In Proc. of the 31st International Conference on Neural Information Processing Systems 3149–3157 (Curran Associates Inc., 2017).
Dorogush, A. V., Ershov, V. & Gulin, A. CatBoost: Gradient boosting with categorical features support (2018).
Lee, S. I., Lee, H., Abbeel, P. & Ng, A. Efficient L1 regularized logistic regression. In Proc. of the 21st National Conference on Artificial Intelligence Vol. 1 401–408 (AAAI Press, 2006).
Shuvo, T. A. Machine learning-based prediction of suicidal ideation, plans, and attempts among school-going adolescents. Inf. Health 2(2), 143–157 (2025).
Zhu, R. et al. Discriminating suicide attempters and predicting suicide risk using altered frontolimbic resting-state functional connectivity in patients with bipolar II disorder. Front. Psych. 11, 597770. https://doi.org/10.3389/fpsyt.2020.597770 (2020).
Bunnell, B. E. et al. Automated detection and prediction of suicidal behavior from clinical notes using deep learning. PLoS ONE 20(9), e0331459. https://doi.org/10.1371/journal.pone.0331459 (2025).
Torlay, L., Perrone-Bertolotti, M., Thomas, E. & Baciu, M. Machine learning-XGBoost analysis of language networks to classify patients with epilepsy. Brain Inf. 4(3), 159–169. https://doi.org/10.1007/s40708-017-0065-7 (2017).
Qi, X., Xu, W. & Li, G. Neuroimaging study of brain functional differences in generalized anxiety disorder and depressive disorder. Brain Sci. 13(9), 1282. https://doi.org/10.3390/brainsci13091282 (2023).
Luo, X. et al. Integrating EEG and ensemble learning for accurate grading and quantification of generalized anxiety disorder: A novel diagnostic approach. Diagnostics (Basel) 14(11), 1122. https://doi.org/10.3390/diagnostics14111122 (2024).
Wu, Y. et al. Detection of functional and structural brain alterations in female schizophrenia using elastic net logistic regression. Brain Imaging Behav. 16(1), 281–290. https://doi.org/10.1007/s11682-021-00501-z (2022).
Balki, I. et al. Sample-size determination methodologies for machine learning in medical imaging research: A systematic review. Can Assoc. Radiol. J. 70(4), 344–353 (2019).
Rezaei, Z. et al. Assessment of risk factors for suicidal behavior: results from the Tehran university of medical sciences employees’ cohort study. Front. Public Health 11, 1180250. https://doi.org/10.3389/fpubh.2023.1180250 (2023).
Ribeiro, J. D., Huang, X., Fox, K. R. & Franklin, J. C. Depression and hopelessness as risk factors for suicide ideation, attempts and death: Meta-analysis of longitudinal studies. Br. J. Psychiatry J. Ment. Sci. 212(5), 279–286. https://doi.org/10.1192/bjp.2018.27 (2018).
Hicks, S. A. et al. On evaluation metrics for medical applications of artificial intelligence. Sci. Rep. 12(1), 5979. https://doi.org/10.1038/s41598-022-09954-8 (2022).
Kim, S. J. et al. Effects of MR parameter changes on the quantification of diffusion anisotropy and apparent diffusion coefficient in diffusion tensor imaging: Evaluation using a diffusional anisotropic phantom. Korean J. Radiol. 16(2), 297–303. https://doi.org/10.3348/kjr.2015.16.2.297 (2015).
Takemura, H., Kruper, J. A., Miyata, T. & Rokem, A. Tractometry of human visual white matter pathways in health and disease. Magn. Reson. Med. Sci. MRMS Off. J. Jpn. Soc. Magn. Reson. Med. 23(3), 316–340. https://doi.org/10.2463/mrms.rev.2024-0007 (2024).
Shahim, P., Holleran, L., Kim, J. H. & Brody, D. L. Test-retest reliability of high spatial resolution diffusion tensor and diffusion kurtosis imaging. Sci. Rep. 7(1), 11141. https://doi.org/10.1038/s41598-017-11747-3 (2017).
Yao, X. et al. Effect of increasing diffusion gradient direction number on diffusion tensor imaging fiber tracking in the human brain. Korean J. Radiol. 16(2), 410–418. https://doi.org/10.3348/kjr.2015.16.2.410 (2015).
Baum, G. L. et al. The impact of in-scanner head motion on structural connectivity derived from diffusion MRI. Neuroimage 173, 275–286. https://doi.org/10.1016/j.neuroimage.2018.02.041 (2018).
Tang, W. et al. A connectional hub in the rostral anterior cingulate cortex links areas of emotion and cognitive control. Elife 8, e43761 (2019).
Soloff, P. H. et al. Structural brain abnormalities and suicidal behavior in borderline personality disorder. J. Psychiatr. Res. 46(4), 516–525 (2012).
Soloff, P., White, R. & Diwadkar, V. A. Impulsivity, aggression and brain structure in high and low lethality suicide attempters with borderline personality disorder. Psychiatry Res. 222(3), 131–139 (2014).
Duarte, D. G. G. et al. Structural brain abnormalities in patients with type I bipolar disorder and suicidal behavior. Psychiatry Res. Neuroimaging 265, 9–17 (2017).
Chiang, J. J., Taylor, S. E. & Bower, J. E. Early adversity, neural development, and inflammation. Dev. Psychobiol. 57(8), 887–907. https://doi.org/10.1002/dev.21329 (2015).
Fears, S. C. et al. Brain structure–function associations in multi-generational families genetically enriched for bipolar disorder. Brain 138(7), 2087–2102 (2015).
Javadapour, A. et al. Increased anterior cingulate cortex volume in bipolar I disorder. Aust N Z J. Psychiatry 41(11), 910–916. https://doi.org/10.1080/00048670701634978 (2007).
Lisy, M. E. et al. Progressive neurostructural changes in adolescent and adult patients with bipolar disorder. Bipolar Disord. 13(4), 396–405 (2011).
Simonnet, J. & Fricker, D. Cellular components and circuitry of the presubiculum and its functional role in the head direction system. Cell Tissue Res. 373(3), 541–556 (2018).
Brambilla, P. et al. Anatomical MRI study of borderline personality disorder patients. Psychiatry Res. 131(2), 125–133 (2004).
Rossi, R. et al. Volumetric and topographic differences in hippocampal subdivisions in borderline personality and bipolar disorders. Psychiatry Res. 203(2–3), 132–138 (2012).
Bøen, E. et al. Smaller stress-sensitive hippocampal subfields in women with borderline personality disorder without posttraumatic stress disorder. J. Psychiatry Neurosci. 39(2), 127–134 (2014).
O’Neill, A. et al. Magnetic resonance imaging in patients with borderline personality disorder: A study of volumetric abnormalities. Psychiatry Res. 213(1), 1–10 (2013).
Ruocco, A. C., Amirthavasagam, S. & Zakzanis, K. K. Amygdala and hippocampal volume reductions as candidate endophenotypes for borderline personality disorder: A meta-analysis of magnetic resonance imaging studies. Psychiatry Res. 201(3), 245–252 (2012).
Wei, K. et al. White matter hypointensities and hyperintensities have equivalent correlations with age and CSF β-amyloid in the nondemented elderly. Brain Behav. 9(12), e01457 (2019).
Grangeon, M. C. et al. White matter hyperintensities and their association with suicidality in major affective disorders: A meta-analysis of magnetic resonance imaging studies. CNS Spectr. 15(6), 375–381 (2010).
Pompili, M. et al. Periventricular white matter hyperintensities as predictors of suicide attempts in bipolar disorders and unipolar depression. Prog. Neuropsychopharmacol. Biol. Psychiatry 32(6), 1501–1507. https://doi.org/10.1016/j.pnpbp.2008.05.009 (2008).
Lee, Y. J. et al. Defense mechanisms and psychological characteristics according to suicide attempts in patients with borderline personality disorder. Psychiatry Investig. 17(8), 840–849. https://doi.org/10.30773/pi.2020.0102 (2020).
Kim, K. I., Kim, J. H. & Won, H. T. Korean Version of Symptom Checklist-90-Revised (SCL-90-R) Professional Manual (ChoongAng Aptitude Publishing, 1984).
Tun, H. M., Rahman, H. A., Naing, L. & Malik, O. A. Trust in artificial intelligence-based clinical decision support systems among health care workers: systematic review. J. Med. Internet Res. 27, e69678. https://doi.org/10.2196/69678 (2025).
Frasca, M. et al. Explainable and interpretable artificial intelligence in medicine: A systematic bibliometric review. Discov. Artif. Intell. 4, 15. https://doi.org/10.1007/s44163-024-00114-7 (2024).
Woo, J. et al. Factors affecting participant recruitment and retention in borderline personality disorder research: A feasibility study. Pilot Feasibility Stud. 7(1), 178. https://doi.org/10.1186/s40814-021-00915-y (2021).
Parekh, P., Vivek Bhalerao, G., ADBS consortium, John, J. P. & Venkatasubramanian, G. Sample size requirement for achieving multisite harmonization using structural brain MRI features. Neuroimage 264, 119768. https://doi.org/10.1016/j.neuroimage.2022.119768 (2022).
Hazlett, E. A. et al. Hyperreactivity and impaired habituation of startle amplitude during unpleasant pictures in borderline but not schizotypal personality disorder: Quantifying emotion dysregulation. Biol. Psychiatry 92(7), 573–582 (2022).
Funding
This work was supported by Ricerca Corrente of the Italian Ministry of Health for Research of Centro San Giovanni di Dio Fatebenefratelli Institute as part of the project entitled “Utilizzo di strumenti di Intelligenza Artificiale (AI) per l’analisi dei disturbi psichici”.
Author information
Contributions
Claudio Crema: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Writing—original draft, Writing—review & editing.—Alberto Boccali: Conceptualization, Data curation, Methodology, Software, Writing—original draft, Writing—review & editing.—Alessandra Martinelli: Methodology, Data curation, Writing—original draft, Writing—review & editing.—Silvia De Francesco: Conceptualization, Data curation, Methodology, Writing—review & editing.—Serena Meloni: Methodology, Data curation.—Cesare M. Baronio: Conceptualization, Methodology, Writing—review & editing.—Roberto Gasparotti: Methodology, Data curation.—Laura Pedrini: Methodology, Data curation.—Mariangela Lanfredi: Methodology, Data curation.—Michela Pievani: Methodology, Data curation , Writing—review & editing.—Antonino Carcione: Methodology, Data curation.—Giuseppe Nicolò: Methodology, Data curation.—Antonino Semerari: Methodology, Data curation.—Damiano Archetti: Conceptualization, Methodology, Writing—review & editing.—Alberto Redolfi: Conceptualization, Methodology, Validation, Supervision, Software, Writing—review & editing.—Roberta Rossi: Conceptualization, Data curation, Funding acquisition, Methodology, Supervision, Validation, Writing—review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Crema, C., Boccali, A., Martinelli, A. et al. An explainable multimodal artificial intelligence model for classifying suicide attempters with borderline personality disorder: a pilot study. Sci Rep 16, 1902 (2026). https://doi.org/10.1038/s41598-025-31550-9
DOI: https://doi.org/10.1038/s41598-025-31550-9






