Predicting mental health disparities using machine learning for African Americans in Southeastern Virginia

Moudden, Ismail El; Bittner, Michael C.; Karpov, Matvey V.; Osunmakinde, Isaac O.; Acheamponmaa, Akosua; Nevels, Breshell J.; Mbaye, Mamadou T.; Fields, Tonya L.; Jordan, Karthiga; Bahoura, Messaoud

doi:10.1038/s41598-025-89579-9

Article
Open access
Published: 18 February 2025

Ismail El Moudden¹,
Michael C. Bittner¹,
Matvey V. Karpov¹,
Isaac O. Osunmakinde²,
Akosua Acheamponmaa³,
Breshell J. Nevels⁴,
Mamadou T. Mbaye⁵,
Tonya L. Fields²,
Karthiga Jordan⁵ &
Messaoud Bahoura⁵

Scientific Reports volume 15, Article number: 5900 (2025)

Abstract

This study examined mental health disparities among African Americans using AI and machine learning for outcome prediction. Analyzing data from African American adults (18–85) in Southeastern Virginia (2016–2020), we found Mood Affective Disorders were most prevalent (41.66%), followed by Schizophrenia Spectrum and Other Psychotic Disorders. Females predominantly experienced mood disorders, with patient ages typically ranging from late thirties to mid-forties. Medicare coverage was notably high among schizophrenia patients, while emergency admissions and comorbidities significantly impacted total healthcare charges. Machine learning models, including gradient boosting, random forest, neural networks, logistic regression, and Naive Bayes, were validated through 100 repeated 5-fold cross-validations. Gradient boosting demonstrated superior predictive performance among all models. Nomograms were developed to visualize risk factors, with gender, age, comorbidities, and insurance type emerging as key predictors. The study revealed higher mental health disorder prevalence compared to national averages, suggesting a potentially greater mental health burden in this population. Despite the limitations of its retrospective design and regional focus, this research provides valuable insights into mental health disparities among African Americans in Southeastern Virginia, particularly regarding demographic and clinical risk factors.

Introduction

Mental health disorders represent a significant public health concern, affecting approximately 19.86% of adults in the United States (U.S.) annually, which translates to nearly 50 million Americans. Of these, 4.91% experience severe mental illness^1,2,3. While mental health disorders impact individuals across diverse racial, ethnic, and gender demographics, certain groups face disproportionate burdens in both prevalence and impact^4,5,6,7.

African Americans, comprising 13.6% of the U.S. population, experience unique challenges in mental health care. Socioeconomic factors exacerbate these disparities, with 20.1% of African Americans living in poverty⁸ and 10.8% lacking health insurance⁹. While African Americans experience mental illness at rates similar to the general population, they face significant barriers to accessing quality mental health care. Only one in three African Americans in need receives mental health care, with lower rates of service use compared to non-Hispanic whites¹⁰. These disparities stem from various factors, including racial and ethnic biases¹¹, stigma¹², limited access to care due to financial and geographic constraints, historical trauma¹³, distrust of the healthcare system¹⁴, poorer quality of care, and lack of culturally competent services¹⁵. Moreover, African Americans are less likely to receive guideline-consistent care, are underrepresented in research, and are more likely to use emergency rooms or primary care for mental health needs^16,17,18,19.

Diagnostic disparities further complicate the landscape, with African Americans more frequently diagnosed with schizophrenia and less frequently with mood disorders compared to whites presenting with similar symptoms²⁰. Additionally, African Americans with mental health conditions, particularly schizophrenia, bipolar disorders, and other psychoses, face higher rates of incarceration than individuals of different races^21,22. Factors such as gender, age, complications, comorbidities, insurance type, and admission source shape mental health outcomes within this group^18,23,24,25. African Americans exhibit higher rates of mental health disorders due to psychosocial stressors such as marital problems, involvement with the justice system, abuse, and financial crises^{26,27,28,29,30}. Challenges such as inadequate assessment tools and biases in clinical decision-making impede accurate reporting of mental health symptoms among African Americans^{29,31,32,33,34,35}.

Despite the growing body of research on mental health disparities, there remains a significant gap in our understanding of the specific patterns, predictors, and outcomes of mental health disorders among African Americans in regionally defined areas. This study aims to address this gap by leveraging a comprehensive dataset from Southeastern Virginia, employing advanced analytical techniques to identify specific trends that can inform targeted interventions and policy decisions. The primary objective was to employ artificial intelligence (AI) and machine learning (ML) methodologies to analyze patterns and predictors of mental health outcomes among underserved populations in the Southeastern Virginia region, with an emphasis on African American communities. Specifically, the study aimed to (a) examine the impact of various factors (including gender, age, complications, comorbidities, insurance type, and admission source) on mental health outcomes in the Southeastern Virginia area and (b) develop comprehensive prediction models for mental health outcomes using advanced machine learning techniques.

By leveraging a large, comprehensive dataset from the Virginia Health Information (VHI) system, this study seeks to address critical gaps in the literature and provide valuable insights into the unique mental health challenges faced by African Americans. The findings aim to inform targeted interventions and health policies to reduce mental health disparities and improve outcomes for this underserved population, contributing to a more equitable and effective mental health care system.

Materials and methods

This study employs a quantitative, cross-sectional design using retrospective data from 2016 to 2020. The analysis incorporates traditional statistical methods and innovative machine-learning techniques to examine patterns and predictors of mental health outcomes for African Americans in Southeastern Virginia.

Ethics

The study was approved by the Eastern Virginia Medical School (EVMS) Institutional Review Board and Human Subjects’ Protection (IRB #23-07-NH-0174), which determined that it did not involve human subjects research and was therefore exempt from IRB review. Due to the retrospective nature of the study, a waiver of informed consent was granted by the same board, and all patient data were deidentified to maintain confidentiality. All research methods followed the guidelines and regulations set forth by the EVMS IRB and Human Subjects’ Protection committee. The research team extracted demographic, administrative, clinical, and financial data from the VHI database, including data on comorbidities. During the collection phase, deidentified data were transferred via a secure File Transfer Protocol behind the EVMS firewall. After collection, data were stored on password-protected devices with access restricted to authorized research team members, and regular backups were performed.

Data collection

This study used aggregated hospital discharge data from VHI, a comprehensive healthcare information repository. The dataset provided by the Virginia Department of Health focuses on mental health among underserved populations in Southeastern Virginia. VHI consolidates data from various sources, ensuring accuracy and objectivity. It includes medical and pharmacy claims for around five million Virginia residents covered by commercial, Medicaid, and Medicare plans. The database covers patient demographics, care locations, provider details, diagnoses, and service costs. VHI integrates data from both commercial and public insurance carriers, representing most of Virginia’s insured population, and collaborates with the Department of Medical Assistance Services and nine commercial carriers to ensure data quality and compliance³⁶.

Study population

The target population includes African American adults aged 18 to 85 years residing in the Southeastern Virginia region who sought mental healthcare services between 2016 and 2020. The extracted data comprised demographic information, comorbidities, clinical characteristics, and hospital details. Each discharge record contained one primary diagnosis code, often accompanied by multiple additional codes reflecting the patient’s mental health status. Diagnoses in the VHI system are based on ICD-10 codes assigned by healthcare providers during patient encounters. To ensure diagnostic accuracy and consistency, the study utilized the ICD-10 Classification of Mental and Behavioural Disorders: Clinical Descriptions and Diagnostic Guidelines³⁷ as the reference for diagnostic criteria. This standardized classification system ensures consistency in coding across the dataset and aligns with international diagnostic standards. Table 1 summarizes the ICD-10 codes used in this study, categorized by significant mental health disorders:

Table 1 ICD-10 codes for mental health disorders used in the study.

This comprehensive categorization of mental health disorders using standardized ICD-10 codes enables a detailed and reliable analysis of mental health patterns and trends within the study population. By adhering to these international diagnostic standards, the study ensures comparability with other research and enhances the validity of its findings.

Statistical analysis plan

All statistical analyses were conducted in collaboration with Research and Infrastructure Service Enterprise at EVMS. Data analysis was conducted using a combination of R, Python, and SAS to capitalize on the unique strengths of each software. R (tidyverse package) was employed for data cleaning and initial exploratory analyses, enabling efficient data preprocessing and visualization. Python (pandas, numpy, scipy.stats, scikit-learn and statsmodels libraries) was utilized for implementing and evaluating various machine learning models, leveraging its extensive libraries and frameworks for predictive modeling. SAS was used to conduct complex statistical procedures.

Data cleaning involved checking for and addressing missing or inconsistent data. Standardized procedures were applied to handle missing data, including imputation methods where appropriate. The study employed both traditional statistical analysis and machine learning to extract meaningful insights from the patient dataset. Descriptive statistics were computed for several parameters to identify relevant factors for the research question. Demographic factors such as sex and age, clinical factors including complications or comorbidities, and administrative aspects like admission status and length of stay (LOS) were explored. Frequencies were run for all categorical parameters, while means, standard deviations, and medians were calculated for all numeric parameters. Chi-square testing was conducted for the analysis of categorical variables. Given the nonparametric nature of the data, Kruskal-Wallis and Wilcoxon tests were used when analyzing numeric data. These tests evaluated differences in demographic, clinical, and administrative factors across mental and behavioral diseases and disorders (MBDD) groups, namely mood affective disorders (MAD); schizophrenia, schizotypal, and delusional disorders (SSDD); mental and behavioral disorders due to psychoactive substance use (MBD); and neurotic, stress-related, and somatoform disorders (NSRS). Significance levels were defined as follows: ***: p-value < 0.0001; **: 0.0001 ≤ p-value < 0.01; *: 0.01 ≤ p-value < 0.05.
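The tests and star notation described above can be illustrated with a minimal sketch using scipy.stats, one of the libraries named in this section. The contingency table and LOS samples below are synthetic and purely hypothetical, and the sketch assumes that "Wilcoxon" refers to the two-sample rank-sum test (Mann-Whitney U); it is not the authors' analysis code.

```python
import numpy as np
from scipy import stats


def significance_stars(p):
    """Map a p-value to the star notation defined in the text."""
    if p < 0.0001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


rng = np.random.default_rng(0)

# Hypothetical 2x4 contingency table: sex (rows) by disorder group (columns).
sex_by_group = np.array([[520, 310, 140, 60],
                         [430, 480, 260, 45]])
chi2, p_chi2, dof, _expected = stats.chi2_contingency(sex_by_group)

# Hypothetical right-skewed length-of-stay samples (days) for each group.
scales = {"MAD": 3.0, "SSDD": 4.2, "MBD": 2.5, "NSRS": 2.8}
los = {name: rng.gamma(shape=2.0, scale=s, size=200) for name, s in scales.items()}

# Kruskal-Wallis compares all groups at once; the rank-sum test compares two.
h_stat, p_kw = stats.kruskal(*los.values())
u_stat, p_mw = stats.mannwhitneyu(los["MAD"], los["SSDD"], alternative="two-sided")

for label, p in [("chi-square", p_chi2), ("Kruskal-Wallis", p_kw), ("rank-sum", p_mw)]:
    print(f"{label}: p = {p:.4g} {significance_stars(p)}")
```

The boundary cases follow the definitions in the text: a p-value of exactly 0.0001 maps to ** and a p-value of exactly 0.01 maps to *.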

Machine learning techniques

To develop robust mental health outcome prediction models for MBDD, machine learning techniques were implemented using Python’s scikit-learn library³⁸. The ML models included gradient boosting (GB), random forest (RF), artificial neural network (ANN), logistic regression (LR), and Naive Bayes (NB). The selection of ML models was based on their diverse strengths and suitability for the study’s objectives. GB and RF, as ensemble methods, can effectively handle complex interactions and nonlinearities in the data. ANN is powerful in capturing intricate patterns and dependencies. LR, as a probabilistic classifier, provides interpretable results and is widely used in healthcare settings. NB, despite its simplicity, can serve as a robust baseline. This combination of models allows for a comprehensive evaluation of predictive performance and insights into the underlying data structure.

Hyperparameter tuning was performed using grid search with 5-fold cross-validation. The optimal hyperparameters for each model were:

GB: learning_rate = 0.1, n_estimators = 100, max_depth = 3.

RF: n_estimators = 100, max_depth = None, min_samples_split = 2.

ANN: hidden_layer_sizes = (100,), solver = 'adam', alpha = 0.0001.

LR: penalty = 'l2', C = 1.0, solver = 'lbfgs'.

NB: default hyperparameters.
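As a rough illustration, the reported settings map onto scikit-learn estimators as sketched below. Several details are assumptions rather than facts from the study: the Naive Bayes variant (GaussianNB here), the random seeds, max_iter for logistic regression, the synthetic data, and the candidate values in the search grid, which the text does not list.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

# Classifiers configured with the hyperparameters reported above.
models = {
    "GB": GradientBoostingClassifier(
        learning_rate=0.1, n_estimators=100, max_depth=3, random_state=0
    ),
    "RF": RandomForestClassifier(
        n_estimators=100, max_depth=None, min_samples_split=2, random_state=0
    ),
    "ANN": MLPClassifier(
        hidden_layer_sizes=(100,), solver="adam", alpha=0.0001, random_state=0
    ),
    # L2 is scikit-learn's default penalty for the lbfgs solver;
    # max_iter is an assumption added so the solver converges.
    "LR": LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000),
    # Assumption: the text does not name the Naive Bayes variant.
    "NB": GaussianNB(),
}

# Synthetic stand-in for the discharge data (hypothetical).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Grid search with 5-fold cross-validation; the candidate values are hypothetical.
param_grid = {"learning_rate": [0.05, 0.1], "max_depth": [2, 3]}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=100, random_state=0),
    param_grid,
    cv=5,
    scoring="roc_auc",
)
search.fit(X, y)
print(search.best_params_)
```

Scoring the search by AUC is a design choice made for this sketch; the text does not state which metric guided the tuning.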

The performance of these models was assessed using area under the curve (AUC), classification accuracy (CA), F1 score (F1), precision (Prec), and recall (sensitivity, or the true positive rate). Models were validated through 100 repetitions of 5-fold cross-validation to assess how reliably and accurately they distinguish between classes and predict outcomes.
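One way to realize repeated 5-fold cross-validation with these five metrics in scikit-learn is sketched below. It assumes stratified folds and a binary (one disorder versus the rest) target, neither of which is stated in the text, and it uses synthetic data. The helper name repeated_cv_scores is hypothetical, and the demonstration runs 3 repetitions instead of 100 to keep it fast.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate

# Metric labels from the text mapped to scikit-learn scorer names.
SCORING = {
    "AUC": "roc_auc",
    "CA": "accuracy",
    "F1": "f1",
    "Prec": "precision",
    "Recall": "recall",
}


def repeated_cv_scores(model, X, y, n_repeats=100, seed=0):
    """Return the mean of each metric over n_repeats runs of 5-fold CV."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=seed)
    results = cross_validate(model, X, y, cv=cv, scoring=SCORING)
    return {name: float(np.mean(results[f"test_{name}"])) for name in SCORING}


# Synthetic, imbalanced binary target standing in for one disorder versus the rest.
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.85, 0.15], random_state=0
)

# n_repeats=3 keeps the demonstration fast; the study reports 100 repetitions.
summary = repeated_cv_scores(LogisticRegression(max_iter=1000), X, y, n_repeats=3)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Averaging over repetitions reduces the dependence of the estimates on any single random partition of the data.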

Predictive nomograms

Predictive nomograms were developed using the Logistic Regression classifier to integrate demographic, clinical, and administrative predictors for MAD, MBD, and SSDD. These nomograms provided a visual representation of the risk factors and their respective contributions to the probability of each disorder. The top ten predictors for each disorder were identified, and their respective weights were calculated to aid in clinical decision-making. The nomograms serve as quantitative tools, enabling clinicians to assess the probability of specific mental disorders based on a comprehensive profile of individual risk factors.
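The conversion from logistic regression coefficients to nomogram points can be sketched as follows. This uses a common convention in which the predictor with the largest possible log-odds contribution spans 100 points; the text does not describe the authors' scaling, and every coefficient, range, and predictor name below is hypothetical.

```python
import math


def nomogram_points(coefs, ranges):
    """Scale each predictor's maximum log-odds contribution to a 0-100 point axis."""
    max_effect = {
        name: abs(coefs[name]) * (ranges[name][1] - ranges[name][0]) for name in coefs
    }
    top = max(max_effect.values())
    return {name: 100.0 * effect / top for name, effect in max_effect.items()}


def predicted_probability(intercept, coefs, values):
    """Logistic model: probability implied by the linear predictor."""
    z = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (log-odds per unit) and observed predictor ranges.
coefs = {"female": 0.4, "age_decades": -0.1, "n_comorbidities": 0.25, "medicare": -0.6}
ranges = {
    "female": (0, 1),
    "age_decades": (1.8, 8.5),
    "n_comorbidities": (0, 5),
    "medicare": (0, 1),
}

points = nomogram_points(coefs, ranges)

# Rank predictors by their point span, as when selecting the top predictors.
ranked = sorted(points, key=points.get, reverse=True)

patient = {"female": 1, "age_decades": 4.2, "n_comorbidities": 2, "medicare": 0}
prob = predicted_probability(-0.5, coefs, patient)
print(ranked, round(prob, 3))
```

Under this convention, the length of each predictor's axis reflects how much it can shift the predicted probability across its observed range.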

Results

Prevalence rates

Table 2 Mental and behavioral diseases and disorders.

Table 2 provides a breakdown of the prevalence of various MBDD among discharged patients within the Southeastern Virginia area. The total number of readmissions recorded was 22,254. MAD was the most common, constituting approximately 41.66% of the cases, followed closely by SSDD, which represented about 39.57%. MBD accounted for 14.30% of the readmissions, while NSRS comprised 4.46% of the total.

Demographic, administrative, clinical, and comorbidity characteristics

Table 3 Demographic, administrative, clinical, and comorbidity characteristics of mental and behavioral disorders.

Table 3 details the demographic, administrative, clinical, and comorbidity characteristics of patients diagnosed with various MBDDs. Females predominantly constitute the patient population for MAD and NSRS, with percentages of 54.54% and 56.50%, respectively. In contrast, SSDD and MBD are less prevalent among females.

The mean age of patients across disorders ranges from the late thirties to the mid-forties. Emergency admissions are the most common admission type across all MBDD groups and are particularly pronounced in the MBD group at 71.28%. When examining insurance types, a significant proportion of SSDD patients are covered by Medicare (34.21%), whereas a higher percentage of MBD patients utilize Medicaid (26.08%). Regarding comorbidity profiles, SSDD patients tend to have fewer comorbidities, with 28.24% having none, while MBD patients show a higher prevalence of multiple comorbidities. Specifically, 10.37% of MBD patients present five or more comorbidities. The table also reports length of stay (LOS), with SSDD patients experiencing the longest stays, averaging 8.54 days. The geographical distribution indicates that Norfolk and Virginia Beach are prominent locations for these patients, suggesting regional variations in the prevalence or treatment availability of mental health conditions.

Table 4 Comparative analysis of demographic, clinical, and administrative differences in mental and behavioral disorders.

Table 4 presents a comparative analysis of demographic, clinical, and administrative characteristics across MBDD groups. The analysis highlights statistically significant gender differences, with SSDD showing the largest disparity (24.1%, p < 0.0001) between males and females. Age differences vary across groups, with statistically significant differences noted in MAD (2.9 ± 0.2, p < 0.0001) and MBD (4.4 ± 1, p < 0.0001). Medicare coverage differs significantly across groups, with notable differences in SSDD (31.6%, p < 0.0001) and MBD (57.3%, p < 0.0001), indicating distinct patterns in insurance utilization. The comparison of patients with and without primary procedures reveals significant findings in all groups, particularly MAD (57.5%, p < 0.0001), SSDD (60.6%, p < 0.0001), and MBD (7.1%, p < 0.0001). Complication rates are consistently high across all groups, but the differences do not reach statistical significance, suggesting a general trend of high complication rates irrespective of specific disorders. Emergency admission types show significant differences, especially in MBD (42.6%, p < 0.0001), emphasizing the urgency of admissions for this group. LOS analysis further emphasizes gender differences, particularly in SSDD, where males exhibit a significantly longer LOS than females (2.1 ± 2, p < 0.0001). This indicates a more complex clinical pathway for males in this group. Post-operative LOS also reflects significant gender differences in NSRS (0.9 ± 2, p < 0.05) and SSDD (2.1 ± 2, p < 0.0001), suggesting differential recovery times based on gender (Table 4).

Total charge differences in patients

Table 5 Total charge groups differences (%) with procedure.

Table 5 examines the impact of various factors on total charge differences for patients with MBDD who underwent procedures. In the gender category, a notable increase in charges is observed for SSDD in male patients compared to female patients (5.8%, p < 0.0001). Medicare recipients generally see higher charges, with significant increases noted in the MAD (7.5%, p < 0.0001) and SSDD (16.8%, p < 0.0001) groups. The presence of complications corresponds to an increase in total charges, with a substantial effect seen in MBD and SSDD, although it did not reach statistical significance. Admission types also show significant differences in charges, with emergency admissions generally resulting in higher costs than urgent and elective admissions, especially in NSRS (43.0%, p < 0.0001) and MBD (33.8%, p < 0.0001). The number of comorbidities correlates with charge differences, where more comorbidities typically lead to higher charges, notably in MBD, with a 25.5% increase when moving from 4 to 5+ comorbidities (p < 0.0001) (Table 5).

Table 6 Total charge groups differences (%) without procedure.

Table 6 explores total charge differences for patients without procedures across MBDD. Gender differences are particularly stark in NSRS, with females incurring 33.4% higher charges than males (p < 0.05). Medicare recipients incur significantly higher charges in all groups, especially in SSDD (48.2%, p < 0.0001). The absence of complications, particularly in NSRS, dramatically lowers charges, highlighting the cost impact of managing complications in mental health care. Differences in admission types are less pronounced here than in Table 5 but still significant, with emergency versus elective admissions showing large disparities, particularly in MBD (30.0%, p < 0.0001). As in Table 5, an increase in comorbidities consistently correlates with higher charges, especially in NSRS, where moving from 4 to 5+ comorbidities corresponds to a 77.3% increase (p < 0.01).

AI and ML models performance

Table 7 Performance metrics of machine learning models for predicting mental and behavioral disorders.

Table 7 presents the performance metrics for various AI and ML models that predict outcomes for MAD, MBD, and SSDD. The models evaluated include GB, LR, ANN, and RF. These models were validated using 100 repetitions of 5-fold cross-validation, and their performance was assessed based on area under the curve (AUC), classification accuracy (CA), F1 score, precision (Prec), and recall.

For MBD, the Gradient Boosting model demonstrated the highest performance with an AUC of 0.955 and a CA of 0.929, along with robust F1 (0.747), precision (0.79), and recall (0.709) scores, indicating its superior predictive capability. LR and ANN also showed strong performance with AUCs of 0.937 and 0.936, respectively, and similar CA and precision metrics. In the MAD category, the GB model again led in performance with an AUC of 0.832 and a balanced F1 score of 0.719, reflecting its reliability in prediction. However, the overall performance metrics for MAD were lower compared to MBD, suggesting potential complexity in modeling MAD outcomes. For SSDD, the GB model achieved the highest AUC at 0.832 and an F1 score of 0.709, highlighting its effectiveness. The LR and ANN models also performed well but exhibited slightly lower metrics across all evaluation criteria. Overall, Gradient Boosting consistently outperformed other models across all disorder categories, particularly excelling in MBD predictions (Table 7).
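As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so the reported MBD values for Gradient Boosting can be verified directly. The function name below is hypothetical; the precision and recall values are those quoted above.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Gradient Boosting on MBD, using the precision and recall reported in the text.
f1_mbd = f1_from_precision_recall(0.79, 0.709)
print(round(f1_mbd, 3))  # 0.747, matching the reported F1
```

Because the harmonic mean is pulled toward the smaller of the two values, F1 sits closer to the recall of 0.709 than the arithmetic mean of 0.7495 would.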

Predictive nomograms results

The study developed predictive nomograms for three of the most prevalent MBDDs: MAD, MBD, and SSDD, depicted in Figs. 1, 2, and 3, respectively. These nomograms were developed using Logistic Regression classifiers to integrate demographic, clinical, and administrative predictors, estimating the probability of each disorder.

For MAD (Fig. 1), the most robust predictors include alcohol and drug use, with significant regional differences, notably residents from Poquoson City contributing the highest points. For MBD (Fig. 2), psychological factors and age played pivotal roles, with older individuals showing a higher likelihood of MBD, and clinical factors such as complications and liver function underscored the interplay between physical and mental health.

The SSDD nomogram (Fig. 3) identified psychiatric symptoms and drug use as the top predictors, along with impactful demographic factors like insurance type and county of residence. Specific insurance types like Medicare and regions like Suffolk City were associated with higher probabilities of SSDD.

Discussion

Our study provides significant insights into the patterns and predictors of mental health outcomes among underserved African American adults in Southeastern Virginia, contributing to the broader field of mental health disparities research. The findings align with and expand upon existing literature, offering a comprehensive understanding of the complex interplay between demographic, clinical, and socioeconomic factors in shaping mental health outcomes in this population.

Our results indicate that MAD is the most prevalent (41.66%), followed by SSDD and MBD. This prevalence pattern is consistent with national data^39,40,41,42, though our higher rates suggest a potentially more significant mental health burden in our study population. This finding underscores the critical need for targeted interventions in this region.

Key predictors of mental health outcomes identified in our study include gender, age, comorbidities, and insurance type. Females predominantly constituted the patient population for MAD and NSRS, aligning with previous research showing higher rates of mood and anxiety disorders among women^{43,44,45,46,47}. The mean age of patients across disorders was in the late thirties to mid-forties, with a trend of decreasing mental disorders with age, consistent with earlier findings^48,49. The significant proportion of SSDD patients covered by Medicare highlights the role of insurance type in mental health outcomes, echoing earlier findings^50,51,52.

Our study revealed significant differences in total charges based on demographic and clinical factors, particularly for patients with comorbidities and emergency admissions. This finding emphasizes the economic impact of these variables on mental and behavioral disorder care costs, aligning with prior research^53,54,55. These insights can guide healthcare policy and clinical practice in optimizing care delivery and managing healthcare costs for underserved populations.

ML algorithms have revolutionized mental health diagnostics, offering diverse methodological approaches with varying degrees of effectiveness^{56,57,58,59,60}. These computational tools enable personalized prediction of mental health outcomes, facilitating targeted interventions across diverse populations.

Our study’s principal contribution lies in applying sophisticated ML techniques to an extensive regional dataset. The implementation of GB, RF, and LR models, complemented by predictive nomograms, provides a robust empirical framework for understanding mental health trajectories.

In the broader literature, Support Vector Machines (SVM) and Random Forests have emerged as leading classification methods⁶¹, while Convolutional Neural Networks (CNNs) have achieved superior accuracy in bipolar disorder diagnosis^62,63,64. Gradient Boosting algorithms have demonstrated enhanced predictive capabilities through their iterative error-learning mechanisms^65,66.

The field faces persistent methodological challenges, particularly concerning data quality and diagnostic heterogeneity, resulting in variable model performance across research groups⁶². Model performance varies significantly based on the specific mental health condition, data modality (clinical documentation, patient-reported outcomes, neuroimaging), and algorithmic selection^{62,65,67,68,69}. Despite these constraints, ML approaches consistently demonstrate improved diagnostic and predictive accuracy compared to conventional methodologies, particularly in analyzing complex, large-scale datasets.

While our study focused on African Americans in Southeastern Virginia, the findings have broader implications. The higher prevalence of mental health disorders in our study population compared to national averages highlights potential disparities that may exist in other underserved communities. This emphasizes the importance of region-specific research and tailored interventions to address mental health disparities effectively.

Our study lays the groundwork for future research in several key areas. First, validating these findings in broader populations could provide insights into the generalizability of our results. Second, exploring the effectiveness of interventions tailored to the specific needs of underserved communities, as identified by our predictive models, could lead to more effective mental health care strategies. Finally, further investigation into the economic implications of mental health disparities could inform policy decisions and resource allocation.

In conclusion, our study contributes novel insights to mental health disparities research through its comprehensive analysis of mental health patterns, application of advanced machine learning techniques, and focus on an underserved population. These findings have the potential to inform targeted interventions and personalized care strategies, representing a significant step forward in addressing mental health disparities among African Americans in Southeastern Virginia and potentially in other underserved communities.

Policy implications and recommendations

Our findings have several important implications for mental health policy and practice in Southeastern Virginia:

(1) Targeted screening and intervention programs should be developed, particularly for Mood Affective Disorders and Schizophrenia, Schizotypal, and Delusional Disorders, which were found to be most prevalent.
(2) Healthcare providers should be trained to recognize and address the unique mental health needs of African American patients, considering the gender and age-related patterns identified in our study.
(3) Efforts should be made to improve insurance coverage and access to mental health services, given the significant impact of insurance type on mental health outcomes and healthcare utilization.
(4) Community-based mental health programs should be strengthened, particularly in areas identified as having higher risk factors.
(5) Future mental health initiatives should adopt data-driven approaches, utilizing predictive models to identify at-risk individuals and tailor interventions accordingly.

Strengths

Our in-depth examination of 28 comorbid illnesses offers a nuanced view of the complex healthcare needs within this population. This comprehensive approach provides a more holistic understanding of the interplay between mental health and other medical conditions, informing more integrated care strategies. The application of machine learning models, including GB, RF, and LR, represents a methodological advancement in mental health research. These techniques allowed us to identify subtle patterns and predictors that might be overlooked by traditional statistical methods, offering new perspectives on risk factors and potential intervention points.

By exploring gender and age differences in mental health prevalence and readmission rates, our study highlights subgroups that may require tailored interventions. This granular analysis contributes to a more personalized approach to mental health care and policy development. Our findings have substantial implications for health policy, particularly in addressing healthcare disparities and improving mental health outcomes. The study provides evidence-based recommendations for targeted interventions, potentially influencing policy decisions to enhance healthcare access and equity for African Americans.

The development of predictive nomograms represents a significant contribution to clinical practice. These tools can assist healthcare providers in identifying high-risk individuals and tailoring interventions, potentially improving patient outcomes and resource allocation. By focusing on Southeastern Virginia, our study provides locally relevant insights that can inform targeted interventions and policy decisions specific to this region while also offering a model for similar region-specific analyses elsewhere.

While we acknowledge the limitations inherent in our cross-sectional design, the strengths of our study significantly outweigh these constraints. The comprehensive nature of our dataset, coupled with advanced analytical techniques and a focus on an underserved population, positions our research as a valuable contribution to the field of mental health disparities. Our findings not only enhance the current understanding of mental health challenges among African Americans but also pave the way for future longitudinal studies that can build upon this foundation to examine causal relationships and long-term trends.

Limitations

Our study, while providing valuable insights into mental health disparities among African Americans in Southeastern Virginia, is subject to several limitations that warrant consideration when interpreting the results.

Firstly, the implementation of machine learning algorithms in mental health diagnostics presents distinct methodological challenges and limitations across different model architectures⁷⁰. GB, while adept at handling complex interactions, requires meticulous tuning of critical hyperparameters, including learning rate, tree depth, and boosting iterations, to mitigate overfitting risks and optimize the bias-variance trade-off⁷¹. RF exhibits robustness against overfitting but demonstrates sensitivity to hyperparameters such as tree count and feature split threshold, potentially compromising computational efficiency and model interpretability in high-dimensional datasets⁷². ANNs present unique challenges in hyperparameter optimization, particularly in designing optimal layer structures and selecting appropriate activation functions. Their performance depends significantly on training data quality and requires careful tuning of layer composition, neuron count, learning rate, and regularization parameters⁷³. LR, while offering superior interpretability, faces limitations with non-linear relationships and necessitates precise feature selection and regularization parameter optimization to address multicollinearity concerns⁷⁴,⁷⁵. Naive Bayes classifiers, despite their computational efficiency, are constrained by their fundamental assumption of feature independence, which rarely holds in clinical settings. While requiring less intensive hyperparameter tuning, these models demand careful consideration of prior selection and zero-frequency handling⁷⁶.

These algorithmic limitations are particularly pronounced in mental health applications due to the inherent complexity and heterogeneity of psychiatric data, emphasizing the necessity for robust cross-validation and systematic hyperparameter optimization protocols.
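As an illustration of the kind of protocol referred to above, the following is a minimal sketch of a grid search over the three GB hyperparameters named in this section (learning rate, tree depth, and boosting iterations), scored with repeated stratified 5-fold cross-validation in scikit-learn. It is not the study's actual pipeline: the data are synthetic, the grid values are hypothetical, and the number of repetitions is reduced from the 100 reported in the study so that the example runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold

# Synthetic stand-in data (assumption): the study's VHI records are not public.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Hypothetical grid over the three GB hyperparameters named in the text.
param_grid = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
    "n_estimators": [50, 100],
}

# 5-fold CV repeated twice to keep the sketch fast;
# the study reports 100 repetitions.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # AUC, one of the metrics listed in the abbreviations
    cv=cv,
)
search.fit(X, y)

best_params = search.best_params_
best_auc = search.best_score_
print(best_params, round(best_auc, 3))
```

Because the same folds are used both to select the hyperparameters and to report the score, `best_score_` tends to be optimistic. A nested cross-validation, or a held-out test set, would give a less biased performance estimate.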

Secondly, our reliance on retrospective, pre-existing hospital discharge data introduces potential biases related to data collection and documentation practices. This approach may lead to misclassification or underreporting of conditions, as we were unable to control the data collection process. Consequently, specific nuances and contextual factors that could influence mental health outcomes may have been overlooked. The retrospective nature of the data also limits our ability to establish causal relationships between the identified factors and mental health outcomes.

Thirdly, while our use of ICD-10 codes for diagnosis enhances the validity and reliability of our findings by providing a standardized framework, it may not capture the full complexity of mental health conditions. Diagnostic accuracy can be influenced by factors such as clinician expertise, cultural competence, and the specific manifestation of symptoms in different populations. Future studies could benefit from incorporating structured diagnostic interviews or additional clinical assessments to further validate these ICD-10-based diagnoses.

Fourthly, the geographic specificity of our dataset to Southeastern Virginia, while providing valuable local insights, limits the generalizability of our findings to other U.S. regions. Mental health disparities and their underlying factors may vary across different geographic and cultural contexts, necessitating caution when extrapolating our results to other populations.

Lastly, our approach to handling missing data by excluding instances with less than 0.001% missing values, while pragmatic, may have overlooked subtle patterns or biases. This could potentially affect the accuracy and completeness of our analysis, particularly if the missing data were not randomly distributed across the dataset.
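One way to probe the concern about non-random missingness is to test whether the missing-value indicator is associated with an observed variable before dropping incomplete records. The sketch below is an illustration only, not an analysis performed in the study: the data are synthetic, the grouping variable is hypothetical, and the missingness rates are deliberately exaggerated so that the check has something to detect.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): a binary group label and a variable with gaps.
n = 10_000
group = rng.integers(0, 2, size=n)  # hypothetical grouping, e.g. two payer types
value = rng.normal(size=n)

# Make values missing more often in group 1 so missingness is not random.
missing_prob = np.where(group == 1, 0.03, 0.01)
value[rng.random(n) < missing_prob] = np.nan

is_missing = np.isnan(value)

# 2x2 contingency table: rows = group, columns = (observed, missing).
table = np.array([
    [np.sum((group == g) & ~is_missing), np.sum((group == g) & is_missing)]
    for g in (0, 1)
])

# Chi-square test of independence between group and missingness.
chi2, p_value, dof, expected = chi2_contingency(table)
missing_rate = is_missing.mean()
print(f"missing rate: {missing_rate:.4%}, chi2={chi2:.2f}, p={p_value:.4g}")

# Listwise deletion, as described in the text, keeps only complete rows.
complete_value = value[~is_missing]
complete_group = group[~is_missing]
```

A small p-value suggests that missingness depends on the grouping variable, in which case listwise deletion may bias the results. When only a handful of records are incomplete, as in this study, the expected cell counts become very small and an exact test may be more appropriate than the chi-square approximation.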

Despite these limitations, our study makes significant contributions to the understanding of mental health disparities among African Americans. The large, comprehensive dataset and advanced analytical techniques employed provide robust insights into patterns and predictors of mental health outcomes in this underserved population. These findings have important implications for health policy and clinical practice, informing targeted interventions aimed at improving mental health equity and outcomes for African Americans. Furthermore, our study lays a foundation for future research, highlighting areas where more in-depth, longitudinal studies could further elucidate the complex factors influencing mental health disparities.

Conclusions

Our study provides significant insights into the patterns and predictors of mental health outcomes among African American adults in Southeastern Virginia, leveraging an extensive and comprehensive dataset from the VHI system. Through robust statistical analyses and advanced predictive modeling, we have uncovered critical findings contributing to understanding mental health disparities in this underserved population. The high prevalence of MAD and SSDD within our study population aligns with global mental health patterns, underscoring the urgent need for targeted interventions. Our research has identified key demographic and clinical predictors, including gender, age, comorbidities, and insurance type, which significantly influence mental health outcomes. These findings reaffirm the importance of these factors in addressing mental health disparities and provide a foundation for developing more effective, personalized interventions. A significant strength of our study lies in the application of advanced machine learning techniques, particularly Gradient Boosting, which demonstrated high accuracy and reliability in predicting mental health outcomes. This approach not only enhances the precision of our findings but also showcases the potential of these analytical techniques to revolutionize mental health research and clinical practice. The development of predictive nomograms further translates our research into practical tools for clinicians, enabling more accurate assessment of individual risk profiles for specific mental disorders. Our study’s focus on an underrepresented population, coupled with the use of a comprehensive dataset and detailed comorbidity analysis, provides a nuanced understanding of mental health disparities among African Americans. These insights have significant implications for health policy and the development of targeted interventions. 
While we acknowledge limitations such as the retrospective design, regional specificity, and potential biases in handling missing data, the value of our contributions to the field of mental health research remains substantial. In conclusion, this study offers a detailed examination of mental health outcomes among African Americans in Southeastern Virginia, identifying key predictors and demonstrating the power of machine learning in predictive modeling for mental health. Our findings support the development of targeted health policies and interventions to reduce mental health disparities and improve outcomes for underserved populations. Moving forward, we recommend validating these findings in broader and more diverse populations to enhance the generalizability and impact of our conclusions. This research not only advances our understanding of mental health disparities but also paves the way for more equitable and effective mental health care strategies for African American communities and potentially other underserved populations.

Data availability

The datasets used during the current study are not publicly available due to privacy or ethical restrictions but are available from the corresponding author on reasonable request. Further information about the dataset and conditions for access can be provided by contacting Dr. Ismail El Moudden at elmoudi@odu.edu. The data are held under the terms stipulated by the Virginia Health Information (VHI) database, which prohibits public sharing of the data to protect patient confidentiality and comply with legal restrictions.

Abbreviations

AI: Artificial intelligence
ML: Machine learning
VHI: Virginia Health Information
VDH: Virginia Department of Health
EVMS: Eastern Virginia Medical School
MBDD: Mental and behavioral diseases and disorders
MAD: Mood affective disorder
SSDD: Schizophrenia, schizotypal and delusional disorders
MBD: Mental and behavioral disorders due to psychoactive substance use
NSRS: Neurotic, stress-related, and somatoform disorders
GB: Gradient boosting
RF: Random forest
ANN: Artificial neural network
LR: Logistic regression
AUC: Area under the curve
CA: Correct classification
F1: F-measure or F-score
Prec: Precision
Recall: Sensitivity or the true positive rate
LOS: Length of stay
Op: Operative

References

Reinert, M., Fritze, D. & Nguyen, T. The State of Mental Health in America 2022. (2021).
The 2021 National Survey on Drug Use and Health (NSDUH) Report. (2021).
Mental Health By the Numbers. (2024).
McKnight-Eily, L. R. et al. Racial and ethnic disparities in the prevalence of stress and worry, Mental Health conditions, and increased substance use among adults during the COVID-19 pandemic — United States, April and May 2020. MMWR Morb Mortal. Wkly. Rep. 70. (2021).
Wen, M. et al. Racial-ethnic disparities in psychological distress during the COVID-19 pandemic in the United States: the role of experienced discrimination and perceived racial bias. BMC Public Health 23. (2023).
Kammer-Kerwick, M., Cox, K., Purohit, I. & Watkins, S. C. The role of social determinants of health in mental health: an examination of the moderating effects of race, ethnicity, and gender on depression through the all of us research program dataset. PLOS Mental Health. 1, e0000015 (2024).
Rivera, K. J., Zhang, J. Y., Mohr, D. C., Wescott, A. B. & Bamgbose Pederson, A. A narrative review of Mental Illness Stigma Reduction interventions among African Americans in the United States. J. Ment Health Clin. Psychol. 5, (2021).
Shrider, E. A. & Creamer, J. Current Population Reports: Poverty in the United States: 2022. (2022).
Health Insurance Coverage and Access to Care Among Black Americans: Recent Trends and Key Challenges. (2024).
Dalencour, M. et al. The role of faith-based organizations in the Depression Care of African Americans and Hispanics in Los Angeles. Psychiatr. Serv. 68, 368–374 (2017).
Olfson, M. et al. Racial-Ethnic Disparities in Outpatient Mental Health Care in the United States. Psychiatr. Serv. 74. (2023).
DeFreitas, S. C., Crone, T., DeLeon, M. & Ajayi, A. Perceived and Personal Mental Health Stigma in Latino and African American College Students. Front. Public. Health 6, (2018).
Williams-Washington, K. N. & Mills, C. P. African American historical trauma: creating an inclusive measure. J. Multicult Couns. Devel 46, (2018).
Cox, K., Edwards, K. Black Americans Have a Clear Vision for Reducing Racism but Little Hope It Will Happen. (2022).
Rostain, A. L., Ramsay, R. & Waite, R. J. Cultural background and barriers to mental health care for African American Adults. J. Clin. Psychiatry 76 https://doi.org/10.4088/JCP.13008co5c (2015).
Mental Health: Culture, Race, and Ethnicity: A Supplement to Mental Health: A Report of the Surgeon General. (2001).
Alegría, M. et al. Disparity in depression treatment among racial and ethnic minority populations in the United States. Psychiatr. Serv. 59, (2008).
Cook, B. L., Trinh, N. H., Li, Z., Hou, S. S. Y. & Progovac, A. M. Trends in racial-ethnic disparities in access to mental health care, 2004–2012. Psychiatr. Serv. 68 https://doi.org/10.1176/appi.ps.201500453 (2017).
Snowden, L. R., Catalano, R. & Shumway, M. Disproportionate use of psychiatric emergency services by African Americans. Psychiatr. Serv. 60 https://doi.org/10.1176/ps.2009.60.12.1664 (2009).
Bell, C. C., Jackson, W. M. & Bell, B. H. Misdiagnosis of African-Americans with psychiatric issues - part II. J. Natl. Med. Assoc. 107, (2015).
Ann Carson, E. & Anderson, E. Prisoners in 2015. https://www.bjs.gov/content/pub/pdf/p15.pdf (2016).
Hawthorne, W. B. et al. Incarceration among adults who are in the public mental health system: Rates, risk factors, and short-term outcomes. Psychiatr. Serv. 63, (2012).
Jones, S. C. T. & Neblett, E. W. Jr. The impact of racism on the mental health of people of color. in Eliminating race-based mental health disparities: Promoting equity and culturally responsive care across settings. (2019).
Assari, S. & Caldwell, C. H. Mental health service utilization among black youth; psychosocial determinants in a national sample. Children 4, (2017).
Yelton, B. et al. Social Determinants of Health and Depression among African American Adults: A Scoping Review of Current Research. Int. J. Environ. Res. Public Health 19 https://doi.org/10.3390/ijerph19031498 (2022).
Muntaner, C., Ng, E., Vanroelen, C., Christ, S. & Eaton, W. W. Social Stratification, Social Closure, and Social Class as determinants of Mental Health disparities. Handbooks Sociol. Social Res. https://doi.org/10.1007/978-94-007-4276-5_11 (2013).
Sellers, S. L., Bonham, V., Neighbors, H. W. & Amell, J. W. Effects of racial discrimination and health behaviors on mental and physical health of middle-class African American men. Health Educ. Behav. 36. https://doi.org/10.1177/1090198106293526 (2009).
Williams, D. R. The Health of Men: Structured Inequalities and Opportunities. Am. J. Public Health 93 https://doi.org/10.2105/AJPH.93.5.724 (2003).
Watkins, D. C. & Johnson, N. C. Age and gender differences in psychological distress among African Americans and Whites: findings from the 2016 national health interview survey. Healthcare 6. (2018).
Taylor, I. COVID-19 and Mental Health Disparities in the Black American Population. HCA Healthc. J. Med. 3, (2022).
Neighbors, H. W., Trierweiler, S. J., Ford, B. C. & Muroff, J. R. Racial differences in DSM diagnosis using a semi-structured instrument: the importance of clinical judgment in the diagnosis of African Americans. J. Health Soc. Behav. 44, (2003).
Snowden, L. R. Bias in mental health assessment and intervention: Theory and evidence. Am. J. Public Health 93 https://doi.org/10.2105/AJPH.93.2.239 (2003).
Jimenez, D. E. et al. Centering Culture in Mental Health: Differences in Diagnosis, Treatment, and Access to Care Among Older People of Color. Am. J. Geriatr. Psychiatry 30 https://doi.org/10.1016/j.jagp.2022.07.001 (2022).
Sun, M., Oliwa, T., Peek, M. E. & Tung, E. L. Negative patient descriptors: documenting racial Bias in the Electronic Health Record. Health Aff. 41. (2022).
Gopal, D. P., Chetty, U., O’Donnell, P., Gajria, C. & Blackadder-Weinstein, J. Implicit bias in healthcare: clinical practice, research and decision making. Future Healthc. J. 8. (2021).
Virginia Health Information (VHI). Patient Level Data. Va. Health Inform. (VHI) https://www.vhi.org/Products/patientleveldata.asp (2021).
World Health Organization. The ICD-10 Classification of Mental and Behavioural Disorders: Clinical Descriptions and Diagnostic Guidelines. Geneva (1992).
https://scikit-learn.org/stable/
Lori, A. et al. Genetic risk for hospitalization of African American patients with severe mental illness reveals HLA loci. Front. Psychiatry 15, (2024).
Moran, M. Overdiagnosis of Schizophrenia Said to be persistent among black patients. Psychiatr. News 50, (2015).
Breslau, J. et al. Specifying race-ethnic differences in risk for psychiatric disorder in a USA national sample. Psychol. Med. 36, (2006).
Bantjes, J. et al. Prevalence and sociodemographic correlates of common mental disorders among first-year university students in post-apartheid South Africa: implications for a public mental health approach to student wellness. BMC Public Health 19. (2019).
Asher, M., Asnaani, A. & Aderka, I. M. Gender differences in social anxiety disorder: A review. Clin. Psychol. Rev. 56 https://doi.org/10.1016/j.cpr.2017.05.004 (2017).
Christiansen, D. M. Examining Sex and Gender Differences in Anxiety Disorders. in A Fresh Look at Anxiety Disorders. https://doi.org/10.5772/60662 (2015).
Farhane-Medina, N. Z., Luque, B., Tabernero, C. & Castillo-Mayén, R. Factors associated with gender and sex differences in anxiety prevalence and comorbidity: A systematic review. Sci. Progr. 105 https://doi.org/10.1177/00368504221135469 (2022).
Kessler, R. C. et al. Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the national comorbidity survey replication. Arch.Gen. Psychiatry 62 https://doi.org/10.1001/archpsyc.62.6.593 (2005).
Park, H. Y. Sex/Gender differences in depression and anxiety disorders. in Sex/Gender-Specific Medicine in Clinical Areas. Springer, Singapore, 369–379. (2024).
Belayneh, Z., Mekuriaw, B., Mehare, T., Shumy, S. & Tsehay, M. Magnitude and predictors of common mental disorder among people with HIV/AIDS in Ethiopia: a systematic review and meta-analysis. BMC Public Health 20, (2020).
Skapinakis, P., Weich, S., Lewis, G., Singleton, N. & Araya, R. Socioeconomic position and common mental disorders: longitudinal study in the general population in the UK. Br. J. Psychiatry 189, (2006).
Wang, P. S., Berglund, P. & Kessler, R. C. Recent care of common mental disorders in the United States: prevalence and conformance with evidence-based recommendations. J. Gen. Intern. Med. 15, 284–292 (2000).
Walker, E. R., Cummings, J. R., Hockenberry, J. M. & Druss, B. G. Insurance status, use of mental health services, and unmet need for mental health care in the United States. Psychiatr. Serv. 66. (2015).
Khaykin, E., Eaton, W., Ford, D., Anthony, C. & Daumit, G. Health Insurance Coverage among persons with Schizophrenia in the United States. Psychiatr. Serv. 61. (2010).
Shim, R. S. et al. Emergency department utilization among Medicaid beneficiaries with schizophrenia and diabetes: the consequences of increasing medical complexity. Schizophr. Res. 152. (2014).
Niedzwiecki, M. J., Sharma, P. J., Kanzaria, H. K., McConville, S. & Hsia, R. Y. Factors Associated with Emergency Department Use by patients with and without Mental Health diagnoses. JAMA Netw. Open. 1, (2018).
Osman, W., Ncube, F., Shaaban, K. & Dafallah, A. Prevalence, predictors, and economic burden of mental health disorders among asylum seekers, refugees and migrants from African countries: a scoping review. PLoS One. 19, 1–27 (2024).
Abdul Rahman, H. et al. Machine Learning-Based Prediction of Mental Well-Being Using Health Behavior Data from University Students. Bioengineering 10. (2023).
Razavi, M., Ziyadidegan, S. & Sasangohar, F. Machine Learning Techniques for Prediction of Stress-Related Mental Disorders: A Scoping Review. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 66, (2022).
Shatte, A. B. R., Hutchinson, D. M. & Teague, S. J. Machine learning in mental health: A scoping review of methods and applications. Psychol. Med. 49 https://doi.org/10.1017/S0033291719000151 (2019).
Thieme, A., Belgrave, D. & Doherty, G. Machine Learning in Mental Health: A systematic review of the HCI literature to support the development of effective and implementable ML Systems. ACM Trans.Comput. Hum. Interaction 27 https://doi.org/10.1145/3398069 (2020).
Sau, A. & Bhakta, I. Predicting anxiety and depression in elderly patients using machine learning technology. Healthc. Technol. Lett. 4, 238–243 (2017).
Iyortsuun, N. K., Kim, S. H., Jhon, M., Yang, H. J. & Pant, S. A review of Machine Learning and Deep Learning approaches on Mental Health diagnosis. Healthcare 11, 285 (2023).
Madububambachu, U., Ukpebor, A. & Ihezue, U. Machine Learning Techniques to Predict Mental Health Diagnoses: a systematic literature review. Clin. Pract. Epidemiol. Mental Health 20. (2024).
Garriga, R. et al. Machine learning model to predict mental health crises from electronic health records. Nat. Med. 28, 1240–1248 (2022).
Abd-alrazaq, A. et al. The performance of artificial intelligence-driven technologies in diagnosing mental disorders: an umbrella review. NPJ Digit. Med. 5, 87 (2022).
Bieliński, A., Rojek, I. & Mikołajewski, D. Comparison of Selected Machine Learning Algorithms in the analysis of Mental Health indicators. Electronics. 12, 4407 (2023).
Dhariwal, N. et al. A pilot study on AI-driven approaches for classification of mental health disorders. Front. Hum. Neurosci. 18. (2024).
Park, K. K., Saleem, M., Al-Garadi, M. A. & Ahmed, A. Machine learning applications in studying mental health among immigrants and racial and ethnic minorities: an exploratory scoping review. BMC Med. Inf. Decis. Mak. 24, 298 (2024).
Lee, E. E. et al. Artificial Intelligence for Mental Health Care: clinical applications, barriers, facilitators, and Artificial Wisdom. Biol. Psychiatry Cogn. Neurosci. Neuroimaging. 6, 856–864 (2021).
Chung, J. & Teo, J. Mental Health Prediction Using Machine Learning: Taxonomy, Applications, and Challenges. Appl. Comput. Intell. Soft Comput. 1–19. (2022).
Putra, D. P. et al. Accuracy of Comparison Random Forest, Gradient Boosting Tree, Decision Tree, and Naïve Bayes Algorithms in Predicting the Size of Companies Where Data Scientist Works. In 5th International Conference on Cybernetics and Intelligent System (ICORIS) IEEE, 1–7. (2023). https://doi.org/10.1109/ICORIS60118.2023.10352260
Singh, U., Rizwan, M., Alaraj, M. & Alsaidan, I. A. Machine learning-based gradient boosting Regression Approach for wind power production forecasting: a step towards Smart Grid environments. Energies. 14, 5196 (2021).
Wang, S. et al. Application of bayesian hyperparameter optimized Random Forest and XGBoost Model for Landslide susceptibility mapping. Front. Earth Sci. 9. (2021).
Sarker, I. H. Machine learning: algorithms, real-world applications and research directions. SN Comput. Sci. 2. 160 (2021).
Vatcheva, P. & Lee, M. K. Multicollinearity in regression analyses conducted in epidemiologic studies. Epidemiol. Open Access. 06. (2016).
Sztepanacz, J. L. & Houle, D. Regularized regression can improve estimates of multivariate selection in the face of multicollinearity and limited data. Evol. Lett. 8, 361–373 (2024).
Gohari, K. et al. A bayesian latent class extension of naive Bayesian classifier and its application to the classification of gastric cancer patients. BMC Med. Res. Methodol. 23, 190 (2023).

Acknowledgements

This research was generously funded by the Southeastern Virginia Biomedical Research Consortium (HRBRC), Project Number 958830-008, 2023.

Funding

This research was generously funded by the Southeastern Virginia Biomedical Research Consortium (HRBRC), Project Number 958830-008, 2023.

Author information

Authors and Affiliations

Eastern Virginia Medical School (EVMS), Norfolk State University, Norfolk, VA, USA
Ismail El Moudden, Michael C. Bittner & Matvey V. Karpov
Computer Science Department, Norfolk State University, Norfolk, VA, USA
Isaac O. Osunmakinde & Tonya L. Fields
Business Department, Norfolk State University, Norfolk, VA, USA
Akosua Acheamponmaa
Ethelyn R. Strong School of Social Work, Norfolk State University, Norfolk, VA, USA
Breshell J. Nevels
Engineering Department and the Center for Materials Research, Norfolk State University, Norfolk, VA, 23504, USA
Mamadou T. Mbaye, Karthiga Jordan & Messaoud Bahoura

Authors

Ismail El Moudden
Michael C. Bittner
Matvey V. Karpov
Isaac O. Osunmakinde
Akosua Acheamponmaa
Breshell J. Nevels
Mamadou T. Mbaye
Tonya L. Fields
Karthiga Jordan
Messaoud Bahoura

Contributions

Writing – original draft: I.E.M., M.B.; Project administration: I.E.M., I.O., A.A., M.B.; Methodology: I.E.M., M.C.B., M.V.K., I.O., A.A., M.B.; Investigation: I.E.M., M.C.B., M.V.K., I.O., A.A., B.N., M.M., T.F., M.B.; Formal analysis: I.E.M., M.C.B., M.B.; Data curation: I.E.M., M.C.B.; Conceptualization: I.E.M., M.B.; Software: I.E.M., M.C.B.; Supervision: I.E.M., I.O., A.A., M.B.; Resources: I.E.M., M.C.B., M.V.K., I.O., A.A., M.B.; Funding acquisition: I.E.M., M.C.B., I.O., A.A., B.N., M.M., T.F., K.J., M.B.; Validation: I.E.M., M.B.; Visualization: I.E.M., M.V.K., M.B.; Writing – review & editing: I.E.M., M.C.B., M.V.K., I.O., A.A., B.N., M.M., T.F., K.J., M.B.

Corresponding author

Correspondence to Messaoud Bahoura.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Moudden, I.E., Bittner, M.C., Karpov, M.V. et al. Predicting mental health disparities using machine learning for African Americans in Southeastern Virginia. Sci Rep 15, 5900 (2025). https://doi.org/10.1038/s41598-025-89579-9

Received: 30 October 2024
Accepted: 06 February 2025
Published: 18 February 2025
DOI: https://doi.org/10.1038/s41598-025-89579-9
