Abstract
Until now, much of the work on machine learning and health has focused on processes inside the hospital or clinic. However, this represents only a narrow set of tasks and challenges related to health; there is greater potential for impact by leveraging machine learning in health tasks more broadly. In this Perspective we aim to highlight potential opportunities and challenges for machine learning within a holistic view of health and its influences. To do so, we build on research in population and public health that focuses on the mechanisms between different cultural, social and environmental factors and their effect on the health of individuals and communities. We present a brief introduction to research in these fields, data sources and types of tasks, and use these to identify settings where machine learning is relevant and can contribute to new knowledge. Given the key foci of health equity and disparities within public and population health, we juxtapose these topics with the machine learning subfield of algorithmic fairness to highlight specific opportunities where machine learning, public and population health may synergize to achieve health equity.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information: Nature Machine Intelligence thanks Melissa McCradden, Marcello Ienca and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mhasawade, V., Zhao, Y. & Chunara, R. Machine learning and algorithmic fairness in public and population health. Nat Mach Intell 3, 659–666 (2021). https://doi.org/10.1038/s42256-021-00373-4
DOI: https://doi.org/10.1038/s42256-021-00373-4