Abstract
Background
Equitable deployment of clinical artificial intelligence systems requires consistent performance across diverse patient populations. However, race information in electronic health records is often missing or inconsistently documented, limiting the ability to construct representative cohorts or assess algorithmic bias. This study evaluates model performance and fairness in predicting race from clinical text.
Methods
We compared four transformer-based deep learning models with a hierarchical convolutional neural network designed to capture the multilevel structure of clinical narratives. A two-phase active learning framework guided annotation of a primary care database. A fairness-aware loss function was applied to mitigate disparities across racial groups. Each model was trained with and without fairness-aware optimization. Performance and equity were evaluated using 10-fold cross-validation and subgroup audits across race, sex, age, and their intersections.
Results
Here we show that the hierarchical convolutional neural network achieves higher accuracy and performance equity than transformer models (macro F1 = 98.4%). Fairness constraints enhance parity across most transformer architectures but degrade hierarchical model performance and cause one clinical model to collapse toward majority predictions, demonstrating that fairness interventions are highly model-dependent. Persistent disparities across race, sex, and age indicate that inequities reflect architectural limitations and systemic biases.
Conclusions
This study demonstrates that fairness can be integrated into clinical language models, though effects vary by model type. Architectures aligned with clinical text structure inherently promote fairness, yet the mixed outcomes of fairness constraints highlight the need for tailored interventions. Persistent demographic disparities show that algorithmic bias often reflects upstream documentation inequities. This framework offers a scalable path toward equitable NLP for clinical artificial intelligence.
Plain Language Summary
Medical records often lack information about patients’ race, making it hard to identify potential race-associated health inequalities. We developed computer programs to find race information in doctors’ notes. We tested different types of artificial intelligence models and added special rules to make them work fairly for all racial groups. We found that a model designed to read notes the way doctors write them worked best. Adding additional fairness rules helped some models but hurt others, showing there is no one-size-fits-all solution. Many differences we saw came from how doctors write their notes differently for different patient groups. This research shows we can build fairer medical artificial intelligence, but fixing computer programs alone is not enough. We also need to improve how health information is recorded.
Data availability
The data used in this study are individual-level, de-identified electronic health record data. Policies, procedures, and Research Ethics Board (REB) regulations governing the source data prohibit public release of individual-level data; only aggregate data may be disclosed, and the data used in this project cannot be aggregated in a form suitable for public release. The dataset was derived from the University of Toronto Practice-Based Research Network (UTOPIAN) Data Safe Haven, a large primary care EHR repository encompassing over 400 clinics and 400,000 patients in Ontario, Canada. The parent database has been archived and is not currently accessible. Access to the dataset may be considered in the future upon request and approval by the University of Toronto Health Sciences REB. Requests for data access should be directed to the Human Research Ethics Unit at ethics.review@utoronto.ca or to the research ethics coordinator, Mariya Gancheva (m.gancheva@utoronto.ca). Requests will be reviewed within approximately four weeks and are subject to applicable institutional data use agreements. All data are stored securely on encrypted institutional servers within the University of Toronto Data Safe Haven environment. All aggregate numerical source data underlying the main and Supplementary Figs. are provided in Supplementary Data 1 (Excel), which is sufficient to reproduce the analyses and visualizations presented in this paper. Numerical data underlying Figure 7 (provider-level proportions) are not publicly shared due to potential re-identification risk under UTOPIAN Data Safe Haven REB policy.
Code availability
All models were implemented using the PyTorch framework (version 2.3.1+cu121)86, with transformer-based architectures developed using the HuggingFace Transformers library (version 4.37.1)87. Model development and analysis were conducted in Python 3.10.12 using NumPy 1.26.4, pandas 2.1.1, scikit-learn 1.4.dev0, Matplotlib 3.8.1, Seaborn 0.13.0, and NLTK 3.8.1. Training was performed on an NVIDIA Quadro RTX 6000 GPU using CUDA 12.2 (driver version 535.247.01). Hyperparameters and training configurations for all models are provided in the Methods section and summarized in Table 1. The code for the active learning pipeline used for data annotation is publicly available at https://github.com/seperahm/EMR_Race_Classification. The remaining modeling code, developed for model training and the fairness-aware loss implementation, is stored within the secure University of Toronto Data Safe Haven environment alongside the study data and cannot currently be exported for public release, as the environment has been archived under institutional privacy and security regulations. All transformer-based models used are standard, publicly available pre-trained architectures, and the hierarchical CNN, the primary methodological contribution of this work, is fully specified in the Methods section, including architectural details, optimized hyperparameters, and training procedures, enabling independent reimplementation.
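To aid such reimplementation, the sketch below shows one plausible shape of a two-level hierarchical CNN in PyTorch, assuming each document is tokenized into a (sentences × words) grid: word-level convolutions are pooled into sentence embeddings, and sentence-level convolutions are pooled into a document representation for classification. All layer sizes, kernel widths, and names are illustrative assumptions; the authoritative specification remains the one given in the Methods section.

```python
import torch
import torch.nn as nn

class HierarchicalCNN(nn.Module):
    """Illustrative two-level CNN over (document -> sentence -> word) text.

    Layer sizes and structure are assumptions for demonstration only and
    do not reproduce the exact architecture described in the Methods.
    """
    def __init__(self, vocab_size, embed_dim=128, n_filters=64,
                 kernel_size=3, n_classes=5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Word-level convolution: produces one embedding per sentence.
        self.word_conv = nn.Conv1d(embed_dim, n_filters, kernel_size, padding=1)
        # Sentence-level convolution over the sequence of sentence embeddings.
        self.sent_conv = nn.Conv1d(n_filters, n_filters, kernel_size, padding=1)
        self.classifier = nn.Linear(n_filters, n_classes)

    def forward(self, x):
        # x: (batch, n_sentences, n_words) of token ids
        b, s, w = x.shape
        emb = self.embedding(x.view(b * s, w)).transpose(1, 2)    # (b*s, E, w)
        sent = torch.relu(self.word_conv(emb)).max(dim=2).values  # (b*s, F)
        sent = sent.view(b, s, -1).transpose(1, 2)                # (b, F, s)
        doc = torch.relu(self.sent_conv(sent)).max(dim=2).values  # (b, F)
        return self.classifier(doc)

# Dummy forward pass: 2 documents, 4 sentences, 10 words each.
model = HierarchicalCNN(vocab_size=10000)
logits = model(torch.randint(1, 10000, (2, 4, 10)))
```

Max-pooling at both levels is one simple way to obtain fixed-size sentence and document representations; attention or mean pooling would be equally plausible choices.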
Researchers seeking further methodological clarification or architecture-level guidance may contact the corresponding author for additional details or code review under appropriate data-sharing agreements.
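Similarly, while the fairness-aware loss code cannot be exported from the secure environment, losses of this kind typically follow a standard pattern. Below is a minimal, hypothetical PyTorch sketch of one such objective, a group-reweighted cross-entropy in which examples from under-represented groups carry larger weights; the function name, the `group_weights` tensor, and the weighting scheme are illustrative assumptions, not the exact loss used in this study.

```python
import torch
import torch.nn.functional as F

def group_weighted_cross_entropy(logits, labels, group_ids, group_weights):
    # Per-example cross-entropy, kept unreduced so each note can be reweighted.
    per_example = F.cross_entropy(logits, labels, reduction="none")
    # Look up each example's weight from its demographic group.
    weights = group_weights[group_ids]
    return (weights * per_example).mean()

# Dummy usage: 8 notes, 5 race categories, 3 demographic groups.
logits = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
group_ids = torch.randint(0, 3, (8,))
group_weights = torch.tensor([1.0, 2.0, 1.5])  # e.g. inverse-frequency weights
loss = group_weighted_cross_entropy(logits, labels, group_ids, group_weights)
```

Because such weights only rescale per-example gradients while leaving the architecture untouched, the effect of the intervention can differ across model types, consistent with the model-dependent behaviour reported in the Results.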
References
Ford, M. E. & Kelly, P. A. Conceptualizing and categorizing race and ethnicity in health services research. Health Serv. Res. 40, 1658–1675 (2005).
Prus, S. G. Comparing social determinants of self-rated health across the United States and Canada. Soc. Sci. Med. 73, 50–59 (2011).
Morris, S. M. et al. Predictive modeling for clinical features associated with neurofibromatosis type 1. Neurol. Clin. Pract. 11, e497–e505 (2021).
Brown, T. H., O’Rand, A. M. & Adkins, D. E. Race–ethnicity and health trajectories: Tests of three hypotheses across multiple groups and health outcomes. J. Health Soc. Behav. 53, 359–377 (2012).
Lubetkin, E. I., Jia, H., Franks, P. & Gold, M. R. Relationship among sociodemographic factors, clinical conditions, and health-related quality of life: Examining the EQ-5D in the US general population. Qual. Life Res. 14, 2187–2196 (2005).
Lingren, T. et al. Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers. Appl. Clin. Inform. 7, 693–706 (2016).
Ahuja, Y. et al. Leveraging electronic health records data to predict multiple sclerosis disease activity. Ann. Clin. Transl. Neurol. 8, 800–810 (2021).
Franks, P., Gold, M. R. & Fiscella, K. Sociodemographics, self-rated health, and mortality in the US. Soc. Sci. Med. 56, 2505–2514 (2003).
Freeman, H. P. The meaning of race in science–considerations for cancer research: Concerns of special populations in the national cancer program. Cancer 82, 219–225 (1998).
Davidson, J., Vashisht, R. & Butte, A. J. From genes to geography, from cells to community, from biomolecules to behaviors: The importance of social determinants of health. Biomolecules 12, 1449 (2022).
Bucher, B. T. et al. Determination of marital status of patients from structured and unstructured electronic healthcare data. In AMIA Annu. Symp. Proc., vol. 2019, 267–274 (2019).
Han, S. et al. Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing. J. Biomed. Inform. 127, 103984 (2022).
Sholle, E. T. et al. Underserved populations with missing race ethnicity data differ significantly from those with structured race/ethnicity documentation. J. Am. Med. Inform. Assoc. 26, 722–729 (2019).
Polubriaginof, F. C. et al. Challenges with quality of race and ethnicity data in observational databases. J. Am. Med. Inform. Assoc. 26, 730–736 (2019).
Proumen, R., Connolly, H., Debick, N. A. & Hopkins, R. Assessing the accuracy of electronic health record gender identity and REaL data at an academic medical center. BMC Health Serv. Res. 23, 884 (2023).
Qing, L., Linhong, W. & Xuehai, D. A novel neural network-based method for medical text classification. Future Internet 11, 255 (2019).
Nguyen, H. & Patrick, J. Text mining in clinical domain: Dealing with noise. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 549–558 (2016).
Abulibdeh, R. et al. Assessing the capture of sociodemographic information in electronic medical records to inform clinical decision making. PloS One 20, e0317599 (2025).
Senior, M. et al. Identifying predictors of suicide in severe mental illness: A feasibility study of a clinical prediction rule (oxford mental illness and suicide tool or OxMIS). Front. Psychiatry 11, 268 (2020).
Lybarger, K. et al. Leveraging natural language processing to augment structured social determinants of health data in the electronic health record. J. Am. Med. Inform. Assoc. 30, 1389–1397 (2023).
Patra, B. G. et al. Extracting social determinants of health from electronic health records using natural language processing: A systematic review. J. Am. Med. Inform. Assoc. 28, 2716–2727 (2021).
Bompelli, A. et al. Social and behavioral determinants of health in the era of artificial intelligence with electronic health records: A scoping review. Health Data Sci. 2021, 1–19 (2021).
Zhang, D., Thadajarassiri, J., Sen, C. & Rundensteiner, E. Time-aware transformer-based network for clinical notes series prediction. In Machine Learning for Healthcare Conference, 566–588 (PMLR, 2020).
Yang, Z. et al. Hierarchical attention networks for document classification. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1480–1489 (2016).
Abulibdeh, R., Tu, K. & Sejdić, E. Natural language processing methods for assessing social determinants of health in the electronic health records: A narrative review. Expert Syst. Appl. 127928 (2025).
Shi, J. et al. Accelerating clinical NLP at scale with a hybrid framework with reduced GPU demands: A case study in dementia identification. arXiv preprint arXiv:2504.12494 (2025).
Flaxman, A. D. & Vos, T. Machine learning in population health: Opportunities and threats. PLoS Med. 15, e1002702 (2018).
Weissler, E. H. et al. The role of machine learning in clinical research: Transforming the future of evidence generation. Trials 22, 1–15 (2021).
Habehh, H. & Gohel, S. Machine learning in healthcare. Curr. Genomics 22, 291 (2021).
Haider, S. A. et al. The algorithmic divide: A systematic review on AI-driven racial disparities in healthcare. J. Racial Ethnic Health Disparities 188–217 (2024).
Yu, Z. et al. Identifying social determinants of health from clinical narratives: A study of performance, documentation ratio, and potential bias. J. Biomed. Inform. 153, 104642 (2024).
Guevara, M. et al. Large language models to identify social determinants of health in electronic health records. NPJ Digital Med. 7, 6 (2024).
Gao, Y., Sharma, T. & Cui, Y. Addressing the challenge of biomedical data inequality: An artificial intelligence perspective. Annu. Rev. Biomed. Data Sci. 6, 153–171 (2023).
University of Toronto family medicine report. Tech. Rep., Department of Family and Community Medicine at the University of Toronto, Toronto, ON, Canada https://issuu.com/dfcm/docs/u_of_t_family_medicine_report (2019).
OntarioMD. Provincial EMR-integrated access https://www.ontariomd.ca/emr-certification/omd-certified-emrs-numbers/integrated-ehr-products (2025).
OntarioMD. From foundation to integration: Annual report 2016-2017. https://www.ontariomd.ca/documents/annual_report_2017.pd (2017).
Canadian Institute for Health Information. Guidance on the use of standards for race-based and Indigenous identity data collection and health reporting in Canada https://www.cihi.ca/en/race-based-and-indigenous-identity-data (2022).
Lybarger, K., Ostendorf, M. & Yetisgen, M. Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction. J. Biomed. Inform. 113, 103631 (2021).
Figueroa, R. L., Zeng-Treitler, Q., Ngo, L. H., Goryachev, S. & Wiechmann, E. P. Active learning for clinical text classification: Is it better than random sampling? J. Am. Med. Inform. Assoc. 19, 809–816 (2012).
Chen, Y., Lasko, T. A., Mei, Q., Denny, J. C. & Xu, H. A study of active learning methods for named entity recognition in clinical text. J. Biomed. Inform. 58, 11–18 (2015).
Yang, Z., Dehmer, M., Yli-Harja, O. & Emmert-Streib, F. Combining deep learning with token selection for patient phenotyping from electronic health records. Sci. Rep. 10, 1432 (2020).
Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A. & Talwalkar, A. Hyperband: A novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 18, 1–52 (2018).
Caton, S. & Haas, C. Fairness in machine learning: A survey. ACM Comput. Surv. 56, 1–38 (2024).
Han, J., Kamber, M. & Pei, J. Data Mining: Concepts and Techniques 3rd edn (Morgan Kaufmann, Boston, 2012).
Hardt, M., Price, E. & Srebro, N. Equality of opportunity in supervised learning. Advances in Neural Information Processing Systems 29 (2016).
Khalili, M. M., Zhang, X. & Abroshan, M. Loss balancing for fair supervised learning. In International Conference on Machine Learning, 16271–16290 (PMLR, 2023).
Lai, Y. & Guan, L. Flexible fairness-aware learning via inverse conditional permutation. arXiv preprint arXiv:2404.05678 (2024).
Liu, M. et al. FAIM: Fairness-aware interpretable modeling for trustworthy machine learning in healthcare. Patterns 5, 101059 (2024).
Lee, G. & Sayer, S. Exploring equality: An investigation into custom loss functions for fairness definitions. arXiv preprint arXiv:2501.01889 (2025).
Stemerman, R. et al. Identification of social determinants of health using multi-label classification of electronic health record clinical notes. J. Am. Med. Inform. Assoc. Open 4, ooaa069 (2021).
Grandini, M., Bagli, E. & Visani, G. Metrics for multi-class classification: An overview. arXiv preprint arXiv:2008.05756 (2020).
Shapiro, S. S. & Wilk, M. B. An analysis of variance test for normality. Biometrika 52, 591–611 (1965).
Scheffé, H. The Analysis of Variance, vol. 72 (John Wiley & Sons, 1999).
Tukey, J. W. Comparing individual means in the analysis of variance. Biometrics 5, 99–114 (1949).
Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32, 675–701 (1937).
Nemenyi, P. B. Distribution-Free Multiple Comparisons. PhD thesis, Princeton University (1963).
Wilcoxon, F. Individual comparisons by ranking methods (Springer, New York, NY, USA, 1992).
Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947).
Kruskal, W. H. & Wallis, W. A. Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 47, 583–621 (1952).
Pearson, K. X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Lond., Edinb., Dublin Philos. Mag. J. Sci. 50, 157–175 (1900).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4171–4186 (2019).
Liu, Y. et al. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019).
He, P., Liu, X., Gao, J. & Chen, W. DeBERTa: Decoding-enhanced BERT with disentangled attention. In International Conference on Learning Representations https://openreview.net/forum?id=XPZIaotutsD (2021).
Alsentzer, E. et al. Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop (2019).
Gichoya, J. W. et al. AI recognition of patient race in medical imaging: A modelling study. Lancet Digital Health 4, e406–e414 (2022).
Sun, M., Oliwa, T., Peek, M. E. & Tung, E. L. Negative patient descriptors: Documenting racial bias in the electronic health record. Health Aff. 41, 203–211 (2022).
Wen, D. et al. Characteristics of publicly available skin cancer image datasets: A systematic review. Lancet Digital Health 4, e64–e74 (2022).
Adam, H. et al. Write it like you see it: Detectable differences in clinical notes by race lead to differential model recommendations. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 7–21 (2022).
Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623 (2021).
Webster, K. et al. Measuring and reducing gendered correlations in pre-trained models. arXiv preprint arXiv:2010.06032 (2020).
Kaneko, M. & Bollegala, D. Unmasking the mask–evaluating social biases in masked language models. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, 11954–11962 (2022).
Gallifant, J. et al. Peer review of GPT-4 technical report and systems card. PLOS Digital Health 3, e0000417 (2024).
Omiye, J. A., Lester, J. C., Spichak, S., Rotemberg, V. & Daneshjou, R. Large language models propagate race-based medicine. NPJ Digital Med. 6, 195 (2023).
Zack, T. et al. Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: A model evaluation study. Lancet Digital Health 6, e12–e22 (2024).
Labban, M. et al. Disparities in travel-related barriers to accessing health care from the 2017 National Household Travel Survey. JAMA Netw. Open 6, e2325291 (2023).
Yang, J., Soltan, A. A., Eyre, D. W., Yang, Y. & Clifton, D. A. An adversarial training framework for mitigating algorithmic biases in clinical machine learning. NPJ Digital Med. 6, 55 (2023).
Tsai, T. C. et al. Algorithmic fairness in pandemic forecasting: Lessons from COVID-19. NPJ Digital Med. 5, 59 (2022).
Dunkelau, J. & Leuschel, M. Fairness-aware machine learning: An extensive overview. Preprint, 1–60 (2019).
van de Sande, D., van Bommel, J., Fung Fen Chung, E., Gommers, D. & van Genderen, M. E. Algorithmic fairness audits in intensive care medicine: Artificial intelligence for all? Crit. Care 26, 315 (2022).
Liu, X. et al. The medical algorithmic audit. Lancet Digital Health 4, e384–e397 (2022).
Hassija, V. et al. Interpreting black-box models: A review on explainable artificial intelligence. Cogn. Comput. 16, 45–74 (2024).
Nizam, T. & Zafar, S. Explainable artificial intelligence (XAI): Conception, visualization and assessment approaches towards amenable XAI. In Explainable Edge AI: A Futuristic Computing Perspective, 35–51 (Springer, 2022).
Ghai, B. & Mueller, K. D-bias: A causality-based human-in-the-loop system for tackling algorithmic bias. IEEE Trans. Vis. Comput. Graph. 29, 473–482 (2022).
Albert, S. M. et al. Do patients want clinicians to ask about social needs and include this information in their medical record? BMC Health Serv. Res. 22, 1275 (2022).
Yelton, B. et al. Assessment and documentation of social determinants of health among health care providers: Qualitative study. J. Med. Internet Res. Formative Res. 7, e47461 (2023).
Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019).
Wolf, T. et al. HuggingFace’s Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 38–45 (2020).
Acknowledgements
This work was supported by the Canadian Institutes of Health Research [grant number 173094]. Dr. K. Tu holds a Chair in Family and Community Medicine Research in Primary Care at UHN and a Research Scholar Award from the Department of Family and Community Medicine, Temerty Faculty of Medicine, University of Toronto. Dr. L. Celi is funded by the National Institutes of Health through DS-I Africa U54 TW012043-01 and Bridge2AI OT2OD032701, and by the National Science Foundation through ITEST #2148451.
Author information
Authors and Affiliations
Contributions
K.T. and E.S. conceived the study. R.A. designed and conducted the study, developed and implemented the models, collected and processed the data, performed model and bias analyses, and drafted the manuscript. K.T. and E.S. supervised the study, provided resources, assisted in manuscript editing and review, and contributed to project administration. K.T. additionally curated data and secured funding. Y.L. contributed to the conceptualization and development of the hierarchical CNN model and the active learning model, and provided input on the methodology and interpretation of results. S.A. developed the active learning model, performed its analysis, generated results, and assisted in drafting portions of the manuscript. L.A.C. contributed to the interpretation of findings, assisted in drafting the discussion and future directions, and provided critical feedback on the manuscript. Q.Z. provided support for data analysis and interpretation of results. All authors (R.A., Y.L., S.A., K.T., L.A.C., Q.Z., and E.S.) reviewed and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Medicine thanks Brandon Theodorou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Abulibdeh, R., Lin, Y., Ahmadi, S. et al. Integration of fairness-awareness into clinical language processing models. Commun Med (2026). https://doi.org/10.1038/s43856-026-01433-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s43856-026-01433-9