Predicting the effects of temperature variability on nutritional status of children under five in Sub-Saharan Africa using machine learning

Bachwenkizi, Jovine; He, Cheng; Zhu, Yixiang; Mugisha, Alice; Tlou, Boikhutso; Moshiro, Candida; Mwambi, Henry; Madzorera, Isabel; Chen, Renjie; Kan, Haidong; Fawzi, Wafaie W.

doi:10.1038/s41598-026-39659-1

Article
Open access
Published: 10 February 2026

Jovine Bachwenkizi^1,8,
Cheng He¹,
Yixiang Zhu²,
Alice Mugisha^1,3,
Boikhutso Tlou^1,4,
Candida Moshiro⁵,
Henry Mwambi⁶,
Isabel Madzorera⁷,
Renjie Chen²,
Haidong Kan^2,9 &
…
Wafaie W. Fawzi^1,10,11

Scientific Reports volume 16, Article number: 8055 (2026)

Abstract

Rising temperatures due to climate change pose significant risks to the nutritional status of under-five children, particularly in Sub-Saharan Africa (SSA). This study investigates the influence of temperature increases on nutritional status (wasting, stunting, and underweight) in SSA. Based on Demographic and Health Survey (DHS) data for under-five children and global meteorological reanalysis data, we employed multiple supervised machine learning methods to predict the impact of temperature variability on these nutritional status indicators, while controlling for socioeconomic variables such as household income and maternal education. Several metrics were used to evaluate predictive performance. In addition, multivariable logistic regression was employed to test for the causal-effect relationship. A total of 345,837 participants from 22 SSA countries were analyzed using data from 2005 to 2023. Among the algorithms tested, XGBoost achieved the highest accuracy for underweight prediction (Accuracy = 0.7832), Random Forest for stunting (Accuracy = 0.7023), and logistic regression for wasting (Accuracy = 0.6634). Across countries, accuracies ranged from approximately 0.65 to 0.90, with the highest in Uganda (decision tree, Accuracy = 0.9042 for stunting) and the lowest in Burundi (XGBoost, Accuracy = 0.6426 for wasting). Causal-effect analysis revealed that each 1 °C rise in average temperature increased the odds of stunting by approximately 1% (OR 1.01, 95% CI: 1.00–1.10), underweight by about 3% (OR 1.03, 95% CI: 1.01–1.06), and wasting by around 10% (OR 1.10, 95% CI: 1.08–1.12). Although the incremental increases per degree appear modest, such temperature-related risks may translate into substantial population-level impacts in climate-vulnerable settings. Higher household income and maternal education were associated with improved nutritional outcomes and attenuated the adverse effects of rising temperatures, indicating a protective socioeconomic effect. Supervised machine learning models can effectively leverage complex datasets to predict the impact of temperature variability on nutritional status, reinforcing the importance of integrated policies and climate-smart agricultural practices for safeguarding the health of under-five children in SSA.

Introduction

The impacts of climate change have become increasingly evident across the globe, particularly in developing regions like sub-Saharan Africa (SSA), where populations are highly vulnerable to climatic shifts. As global temperatures rise, projections indicate that food production will decline, exacerbating the risk of malnutrition, particularly among vulnerable groups such as children, women, and the elderly^1,2,3,4.

The nutritional status (underweight, wasting, and stunting) of under-five children is influenced by various environmental, social, and economic factors, with climate change playing a critical role⁵. In SSA, where the majority of the population relies on subsistence farming and natural resources for their livelihoods, even small changes in temperature can disrupt food production cycles and contribute to food shortages^6,7,8. The direct and indirect consequences of climate change, including reduced agricultural productivity, increased food prices, and the degradation of food quality, are expected to have profound effects on nutritional outcomes in this region.

Research indicates that temperature increases can adversely affect crop yields through various mechanisms, including heat stress, changes in precipitation patterns, and increased pest and disease pressure^9,10. For instance, studies have shown that maize, a dietary staple in many SSA countries, exhibits reduced yields with even modest increases in temperature^11,12. This decline in agricultural output not only reduces food availability but can also lead to increased food prices, further limiting access to essential nutrients for low-income households. Beyond direct effects on food production, rising temperatures can indirectly influence nutritional status by affecting water quality and availability, which are crucial for both food production and human health^13,14. This relationship between climatic factors and nutrition underscores the urgent need to understand how rising temperatures impact food systems and nutritional health in SSA.

While much research has focused on the effects of climate change on food systems, fewer studies have employed advanced analytical techniques, such as machine learning, to explore the implications for nutritional outcomes¹⁵. Climate change poses a growing threat to food systems in SSA by disrupting agricultural productivity, food availability, and dietary diversity. These disruptions heighten the risk of child undernutrition, particularly stunting, wasting, and underweight, which remain persistent public health challenges in the region. Understanding these complex and multi-dimensional relationships requires analytical approaches capable of capturing non-linear and interacting effects. Machine-learning methods offer a powerful complement to traditional regression models by identifying hidden patterns and heterogeneous vulnerabilities that shape how climate stressors influence nutritional outcomes^16,17,18,19. Given sufficiently large datasets, such as satellite-derived temperature records from locations worldwide and health indicators from Demographic and Health Surveys (DHS), machine learning can identify trends, correlations, and interactions that may not be immediately apparent through traditional statistical methods. By combining data on temperature changes, agricultural yields, and nutritional indicators, machine learning can facilitate a more nuanced understanding of the potential consequences of rising temperatures on nutritional outcomes²⁰.

Rising temperatures may compromise the nutritional status of children under five in SSA. We hypothesize that each 1 °C increase in mean temperature will reduce dietary diversity and increase the risks of stunting, wasting, and underweight. Using supervised machine-learning, we aim to uncover non-linear associations and interactions across socioeconomic gradients that conventional regression models may miss. This approach will reveal heterogeneous vulnerability patterns and identify sub-populations most at risk, providing critical evidence to inform targeted interventions against climate-driven nutritional deficits.

This study aims to predict the effects of rising temperatures as a climatic factor on nutritional status in SSA using both empirical and machine learning approaches. The findings of this research provide valuable insights for policymakers, healthcare providers, and agricultural experts working to mitigate the health impacts of climate change and support sustainable development in the region.

Results

Descriptive data

The study included DHS data on under-five children across 22 SSA countries. The average maternal age was 29 years; 39.1% of mothers had no education and 34.6% had primary education, and most participants (69.9%) resided in rural areas. Nutritional status indicators revealed that 33.6% of the children were stunted, 16.6% were underweight, and 7.8% were wasted, indicating a significant prevalence of malnutrition in the population. The demographic characteristics of the study population are summarized in Table 1.

Table 1 Descriptive statistics of the study participants (N = 345837).

Meteorological data

The analysis of meteorological data from the ERA5 dataset showed a consistent increase in average temperatures over the past 18 years. The average temperature during the study period was 23.56 °C, with variability in temperature throughout the year. Analysis of temperature records indicates that the years 2006 and 2023 exhibited elevated average temperatures across the 22 African countries, providing further evidence of the ongoing impact of climate change on regional climatic conditions, as indicated in Fig. 1.

Machine learning model performance

Machine learning models were developed to predict the effects of rising temperatures on nutritional status indicators. A total of six models were evaluated (Random Forest, Support Vector Machine, K-Nearest Neighbors, Logistic Regression, XGBoost, and Decision Tree), with cross-validation applied to assess model performance. For underweight, the best-performing model (XGBoost) achieved an accuracy of 0.7832, indicating strong predictive capability. For stunting, the Random Forest classifier outperformed all other models, achieving an accuracy of 0.7023, a recall of 0.6605, and a precision of 0.6801, as indicated in Table 2. Additionally, the Area Under the ROC Curve (AUC) for Random Forest in predicting stunting was 0.7207, indicating good overall model discrimination. A ROC curve was generated to visually assess classification performance, as shown in Fig. 2.

Table 2 Evaluation metric to predict nutritional status across all 22 countries.

Country-specific machine learning model performance

We trained and evaluated predictive models for stunting, underweight, and wasting separately for each country with sufficient data collected between 2015 and 2023 in its DHS. Performance metrics, including accuracy, area under the ROC curve (AUC), precision, recall, and F1-score, varied considerably across countries. For instance, in Nigeria, the XGBoost model achieved the highest accuracy (0.9002) and AUC (0.8049) for predicting stunting, indicating strong predictive power. In contrast, Burundi demonstrated lower performance metrics across all outcomes, potentially reflecting greater heterogeneity or limited sample size, as shown in Table 3.

Table 3 Country-specific machine learning model performance.

Across all countries, predictive performance was consistently higher for stunting than for wasting or underweight, indicating that stunting was more effectively captured by the available predictors. The application of SMOTE balanced class distributions and led to measurable improvements in recall and F1-scores. Overall, model performance achieved reasonable accuracy in several countries but varied across settings, likely reflecting differences in data quality, sample size, and feature relevance.

Causal–effect relationship

The regression results in Fig. 3 revealed a statistically significant adverse relationship between rising temperatures and nutritional status indicators. Specifically, for every 1 °C increase in average temperature, the odds of stunting increased with an odds ratio (OR) of 1.01 (95% CI 1.00, 1.10), the odds of underweight with an OR of 1.03 (95% CI 1.01, 1.06), and the odds of wasting with an OR of 1.10 (95% CI 1.08, 1.12), with all p-values < 0.005. These findings support the hypothesis that rising temperatures adversely affect the nutritional status of under-five children in SSA.
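To make the per-degree odds ratios easier to read, the short sketch below converts an OR into the corresponding logistic coefficient, scales it to a larger temperature change, and translates it into a shift in probability. It assumes the log-odds are linear in temperature and, purely for illustration, treats the pooled wasting prevalence of 7.8% as the baseline risk; the function names are hypothetical and this is not the authors' code.

```python
import math


def odds_ratio_for_change(or_per_degree: float, delta_c: float) -> float:
    """Odds ratio implied by a change of delta_c degrees Celsius.

    Assumes a logistic model in which log-odds are linear in temperature,
    so the per-degree coefficient is log(OR) and effects multiply.
    """
    beta = math.log(or_per_degree)  # logistic coefficient per 1 degree C
    return math.exp(beta * delta_c)


def shifted_probability(baseline_prob: float, odds_ratio: float) -> float:
    """Probability after multiplying the baseline odds by odds_ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Pooled per-degree odds ratios reported in the text.
reported_or = {"stunting": 1.01, "underweight": 1.03, "wasting": 1.10}

for outcome, or_1c in reported_or.items():
    # Effect of a hypothetical 2 degree C rise under the linearity assumption.
    print(outcome, round(odds_ratio_for_change(or_1c, 2.0), 4))

# Illustrative only: 7.8% is the pooled wasting prevalence, used here as
# an assumed baseline risk rather than a model-based reference level.
print(round(shifted_probability(0.078, reported_or["wasting"]), 4))
```

Under these assumptions an OR of 1.10 moves a 7.8% baseline risk to roughly 8.5%, which illustrates why a modest odds ratio can still matter when many children are exposed.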

Further analysis revealed that socioeconomic factors, such as household income and maternal education, also significantly influenced nutritional outcomes. Higher maternal education was associated with lower odds of stunting (OR 0.86, 95% CI: 0.82, 0.96), while determinants of the household Wealth Index, such as access to a toilet, access to safe water, and use of clean fuel for cooking, correlated with improved nutritional status among children, as indicated in Table S2 of the Supplementary materials. Interaction terms between temperature and socioeconomic factors were included in the models, revealing that vulnerable populations are disproportionately affected by rising temperatures.

Country-specific analysis of the causal–effect relationship

The country-specific causal analysis demonstrated that rising average temperature is significantly associated with increased risk of childhood stunting, underweight, and wasting, though the strength of these associations varied across the 22 countries studied. For example, in Burkina Faso, a 1 °C increase in temperature was associated with increased odds of wasting (OR: 1.42, 95% CI: 1.36–1.52) and underweight (OR: 1.11, 95% CI: 1.10–1.12). In Ethiopia, a similar temperature rise was linked to higher odds of wasting (OR: 1.11, 95% CI: 1.06–1.16). Sierra Leone also exhibited strong associations, with the highest odds ratios observed for stunting (OR: 1.08, 95% CI: 1.03–1.14). In contrast, some countries, such as Gabon and Kenya, showed smaller or non-significant associations (e.g., Kenya, stunting OR: 1.02, 95% CI: 0.98–1.06), suggesting possible mitigating effects from local interventions or socio-economic factors. While nearly all countries exhibited a positive relationship between temperature rise and adverse nutritional outcomes, the magnitude was context-dependent as indicated in Fig. 4. These findings emphasize the need for tailored, data-driven interventions at the country level, factoring in the heterogeneous impacts of climate variability on child nutrition across SSA.

Discussion

Our study provides a comprehensive evaluation of multiple machine learning algorithms, including Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors, XGBoost, Decision Tree, and Logistic Regression, to predict key nutritional outcomes (stunting, underweight, and wasting) among children under five across 22 SSA countries. In addition, a causal-effect analysis was performed to assess the association between annual temperature variability and nutritional status. We integrated demographic, economic, health, and weather data to identify robust, generalizable models that can inform targeted policy and intervention strategies for malnutrition reduction in SSA.

Our evaluation of multiple machine learning algorithms demonstrated that ensemble-based approaches, particularly XG Boost and Random Forest, provided superior predictive performance for childhood nutritional status across 22 SSA countries, especially in relation to the influence of rising temperatures. Consistent with prior research showing the advantages of ensemble methods for complex health prediction tasks^15,19,21,22, XG Boost achieved the highest overall accuracy for underweight (0.7832), while Random Forest attained top performance for stunting prediction (accuracy = 0.7023; recall = 0.6605; precision = 0.6801; AUC = 0.7207). These results are in line with findings by Khudri et al. (2023)¹⁵ and Talukder et al. (2020)²³, who reported increased accuracy and stability of supervised Machine Learning models in nutritional and health data contexts relative to traditional logistic regression. While model performance for all algorithms was generally robust for underweight and stunting, the predictive accuracy for wasting remained lower, reflecting well-documented challenges in capturing acute and often rapidly changing health outcomes^19,24,25. Our use of cross-validation across the multi-country dataset further reinforces the generalizability and reliability of these results, overcoming issues of overfitting and supporting the models’ suitability for large-scale nutritional surveillance^25,26. The ROC curves generated for our best-performing models confirmed good discrimination ability and support the implementation of machine learning techniques for timely identification of at-risk populations. However, as in previous studies, model accuracy remains subject to the availability and quality of input features, highlighting the ongoing need for enhanced health, demographic, and climate data integration^27,28. 
Our findings reinforce the growing body of evidence supporting machine learning approaches for nutritional risk prediction in dynamic and resource-limited environments.

Our study reveals substantial heterogeneity in model performance across countries and nutritional outcomes. While stunting predictions were most accurate overall (mean AUCs generally above 0.75), wasting and underweight were more challenging to predict, likely because of data quality or reporting issues for these outcomes. Notably, the inclusion of meteorological data only marginally improved model performance, suggesting that, while weather factors may contribute to acute malnutrition, structural and household-level factors remain dominant predictors, especially for chronic outcomes like stunting. Still, further research leveraging finer-grained, high-frequency environmental data could yet uncover meaningful predictors at broader temporal or spatial scales. Consistent with prior studies^19,28, our results demonstrate that supervised machine learning approaches, particularly random forests and XGBoost, outperform traditional logistic regression in most settings. In Uganda, for instance, the random forest model achieved particularly high predictive accuracy for stunting (AUC = 0.8532), outperforming all other approaches. Similar patterns were observed in Malawi and Lesotho, where supervised machine learning models consistently yielded strong performance across all nutritional outcomes. These findings align with the growing consensus that classification models are better equipped to capture the complex, multifactorial nature of nutritional status among children, reflecting intricate interplays between socio-economic, demographic, and environmental determinants^29,30,31,32,33,34. Furthermore, our findings are broadly consistent with studies in SSA and South Asia that report rising temperatures as a risk factor for stunting and underweight in children^35,36,37.
Unlike these earlier studies, our approach integrates machine-learning methods with causal inference, allowing the detection of non-linear associations, interaction effects across socioeconomic gradients, and heterogeneous vulnerability patterns. This highlights both the generalizability of climate-related nutritional risks and the added value of advanced analytical approaches for identifying the most vulnerable sub-populations.

A unique contribution of our work is the integration of causal-effect modeling to examine the impacts of temperature variability on nutrition. The regression analysis revealed a statistically significant adverse association between rising temperatures and the nutritional status of children under five. Specifically, each 1 °C rise in average temperature was associated with higher odds of stunting (OR 1.01, 95% CI: 1.00–1.10), underweight (OR 1.03, 95% CI: 1.01–1.06), and wasting (OR 1.10, 95% CI: 1.08–1.12; all p < 0.005). Although the odds ratios for temperature effects are small, even modest increases in risk may have meaningful public health implications given the large population of children exposed, while individual clinical impact remains limited. The observed association between rising temperatures and poor nutritional outcomes may be explained by multiple interlinked mechanisms. Elevated temperatures are known to reduce crop yields and livestock productivity, thereby lowering household food availability and dietary diversity. This exacerbates food insecurity and may lead families to rely on cheaper, less nutritious diets, increasing the risk of stunting and underweight. Warmer conditions also favor the transmission of diarrheal and vector-borne diseases, which impair nutrient absorption and increase energy demands in children. Furthermore, higher temperatures often intensify water scarcity, reducing hygiene and sanitation practices, thereby compounding infection risks. Our results are similar to those of previous studies investigating the impacts of climate change on malnutrition in SSA^38,39,40,41,42. These findings robustly support the hypothesis that climate variability is exerting additional pressure on already vulnerable child populations, compounding existing risks from poverty, food insecurity, air pollution and poor health infrastructure^43,44.
Our country-specific models further suggest that these associations are most pronounced in lower-income and lower-education settings, confirming that socioeconomic disadvantage amplifies children’s susceptibility to environmental stresses^45,46. However, certain country-level estimates, such as the elevated odds of wasting observed in Burkina Faso, warrant cautious interpretation, as they may arise from true contextual heterogeneity, variability in sample size, or potential data quality limitations.

The association between socioeconomic factors and nutritional outcomes remains strong throughout the dataset. Higher household income and maternal education levels were independently associated with decreased rates of stunting and underweight, consistent with previous studies^47,48,49, underscoring the protective effect of social and economic development⁵⁰. The observed interaction between temperature rise and these socioeconomic variables suggests that future nutrition interventions must be holistic, addressing both structural inequalities and environmental challenges to maximize their impact.

A notable strength of this study is the unprecedented scale and scope, incorporating data from 345,837 participants across 22 SSA countries and spanning nearly two decades (2005–2023). The use of multiple machine learning algorithms, advanced cross-validation techniques, and integration of socioeconomic, demographic, and meteorological data provides a comprehensive and robust framework for predicting child nutritional outcomes. By comparing six different machine learning models, our study offers nuanced insights into model strengths and limitations for diverse malnutrition indicators. Furthermore, the simultaneous analysis of causal relationships between rising temperatures and child malnutrition using large datasets enables a deeper understanding of climate-related health vulnerabilities across SSA. This combination of methodological rigor, extensive geographic coverage, and integrated climate-health assessment sets our study apart and significantly advances the evidence base for data-driven nutrition and public health policies in the region.

Future research should focus on further enriching models with granular spatial, temporal, and behavioral inputs, testing their real-time surveillance value, and piloting targeted interventions guided by model outputs. Additionally, expansion to incorporate broader environmental, food system, and policy variables across different regions will provide actionable insights to drive multisectoral efforts against child malnutrition in SSA.

Despite the strengths and novel insights of this analysis, several limitations should be acknowledged. First, although the DHS and ECMWF datasets are the gold standard in their domains, inconsistencies in data quality, variable measurement, and survey intervals across countries may introduce bias or limit generalizability. Second, while our models capture associations, the use of cross-sectional secondary data restricts causal inference and precludes the assessment of temporal ordering or unmeasured confounding. Third, DHS cluster coordinates are displaced by up to ~5 km for confidentiality, but most remain in the same area, so spatial mismatches with temperature data are likely minimal. Fourth, K-Nearest Neighbors and Support Vector Machine (SVM) classified poorly during prediction, which could be associated with the quality of the datasets used. Fifth, although overall model accuracies were reasonable, predictive performance for acute outcomes such as wasting was modest, reflecting inherent challenges of machine-learning approaches in predicting low-prevalence or highly variable events. Finally, while machine learning models enhance predictive accuracy, they are often complex and less interpretable for policymakers. Future research should aim to develop user-friendly, explainable models that can be readily deployed by national and subnational health authorities.

In conclusion, our study demonstrates that integrating machine learning, causal inference, and environmental data substantially improves the prediction and understanding of child nutritional outcomes in SSA. We find that rising temperatures increase nutritional risks, particularly among socioeconomically vulnerable children, highlighting the need for targeted interventions such as region-specific nutrition programs and climate-resilient agricultural practices. Children in areas with limited access to resources are disproportionately affected, underscoring the importance of multisectoral strategies that integrate health, agriculture, and social protection measures. Investments in high-quality, harmonized datasets and capacity building for local decision-makers are essential to translate these modeling insights into sustainable improvements in child nutrition across the region.

Materials and methods

This study employs a quantitative research design to examine the effects of rising temperatures as a climatic factor on the nutritional status of under-five children in SSA. The analysis utilizes DHS data alongside meteorological data from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 dataset. In this study, machine-learning models are used solely for prediction and to explore non-linear associations and interactions, whereas regression-based analyses are employed to estimate causal effects of temperature on nutritional outcomes.

The primary source of data for nutritional status indicators is the DHS database, which provides comprehensive information on health, nutrition, and demographic characteristics of populations in developing countries. The DHS are large-scale, nationally representative surveys that gather data on health, population, and nutrition indicators. Geographic coordinates for each survey cluster are collected on-site using GPS devices by trained survey teams, enabling spatial analysis of health and demographic patterns across regions. For this study, our focus was specifically on the DHS data pertaining to under-five children across 22 SSA countries (Benin, Burkina Faso, Burundi, Cameroon, Congo D.R, Ivory Coast, Ethiopia, Gabon, Ghana, Guinea, Kenya, Lesotho, Liberia, Malawi, Mali, Nigeria, Rwanda, Sierra Leone, Tanzania, Uganda, Zambia, and Zimbabwe). We combined all standard surveys conducted in each country between 2005 and 2023, as indicated in Table S1 of the Supplementary materials. Key variables extracted from the DHS included measurements of height, weight, and age to calculate height-for-age (stunting), weight-for-age (underweight), and weight-for-height (wasting) using the World Health Organization (WHO) growth standards⁵¹. The outcome variables were wasting, stunting, and underweight, each coded as a binary variable using standard WHO cutoffs (z-score < −2) to reflect clinically meaningful thresholds and to facilitate interpretation of model predictions. In addition, demographic variables such as the age and sex of the child, socioeconomic status of the surveyed household, maternal education, and household characteristics were included to control for confounding factors, in line with previous studies conducted in different countries⁵².
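The binary coding described above can be sketched as follows. This is a minimal illustration that assumes the three anthropometric z-scores have already been computed against the WHO growth standards; the function and argument names are hypothetical.

```python
from typing import Dict, Optional

# WHO convention used in the text: a z-score below -2 marks the condition.
WHO_CUTOFF = -2.0


def below_cutoff(z_score: Optional[float]) -> Optional[int]:
    """Return 1 if the z-score is below the cutoff, 0 otherwise.

    A missing z-score stays missing (None) so that the record can be
    dropped later in a complete-case analysis.
    """
    if z_score is None:
        return None
    return 1 if z_score < WHO_CUTOFF else 0


def code_outcomes(
    height_for_age_z: Optional[float],
    weight_for_age_z: Optional[float],
    weight_for_height_z: Optional[float],
) -> Dict[str, Optional[int]]:
    """Map the three z-scores to the three binary outcomes."""
    return {
        "stunting": below_cutoff(height_for_age_z),
        "underweight": below_cutoff(weight_for_age_z),
        "wasting": below_cutoff(weight_for_height_z),
    }


# A child with low height-for-age but adequate weight indices.
print(code_outcomes(-2.5, -1.9, -0.4))
```

Because the cutoff is a strict inequality, a z-score of exactly −2 is coded as 0 in this sketch.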

Meteorological data were sourced from the ERA5 dataset, which provides high-resolution climate data for various atmospheric variables at a spatial resolution of 0.25° × 0.25° in longitude and latitude (approximately 30 km × 30 km)^52,53. This dataset includes daily average temperature and precipitation data, allowing for a comprehensive temporal analysis, and it has been used in many studies to assess the relation between climate change and health outcomes⁵⁴. Data were aggregated to calculate monthly and annual averages for temperature, corresponding to the time periods of the DHS data collection. Meteorological data were merged with the DHS data based on geographical identifiers (e.g., cluster coordinates and ID) and time periods to ensure that temperature data correspond to the timing of the health survey.
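The aggregation and linkage steps can be illustrated with a small sketch. It assumes daily temperatures have already been extracted for each survey cluster, for example from the ERA5 grid cell containing the cluster coordinates, and it links on cluster ID and survey year only. The record layout and field names (`cluster_id`, `survey_year`, `mean_temp_c`) are hypothetical, and the authors' actual pipeline may differ.

```python
from collections import defaultdict
from statistics import mean


def annual_mean_by_cluster(daily_records):
    """Average daily temperatures per (cluster_id, year).

    daily_records: iterable of (cluster_id, "YYYY-MM-DD", temperature_c).
    """
    buckets = defaultdict(list)
    for cluster_id, date, temperature_c in daily_records:
        year = int(date[:4])  # the year is the first four characters
        buckets[(cluster_id, year)].append(temperature_c)
    return {key: mean(values) for key, values in buckets.items()}


def attach_temperature(children, annual_means):
    """Attach the annual mean temperature to each child record.

    Children whose cluster and survey year have no temperature value are
    dropped, mirroring a complete-case approach.
    """
    merged = []
    for child in children:
        key = (child["cluster_id"], child["survey_year"])
        if key in annual_means:
            merged.append({**child, "mean_temp_c": annual_means[key]})
    return merged


daily = [
    ("c1", "2010-01-01", 20.0),
    ("c1", "2010-07-01", 24.0),
    ("c2", "2010-01-01", 30.0),
]
children = [
    {"child_id": 1, "cluster_id": "c1", "survey_year": 2010},
    {"child_id": 2, "cluster_id": "c3", "survey_year": 2010},
]
print(attach_temperature(children, annual_mean_by_cluster(daily)))
```

Monthly averages follow the same pattern with `date[:7]` as the grouping key.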

Data processing and analysis

The data preprocessing involved cleaning the DHS data to include only under-five children for whom nutritional status information and geographical coordinates were present. We performed a complete-case analysis, excluding observations with missing values. The proportion of missing data was low, and cases with missing values were similar to those with complete data, so any resulting bias is expected to be minimal. Nevertheless, we acknowledge that this approach could introduce some bias, particularly in multi-country pooled analyses. Meteorological data were merged with the DHS data based on geographical identifiers (e.g., cluster coordinates and ID), as indicated in Figure S1 of the Supplementary Materials, and on time periods, to ensure that temperature data correspond to the timing of the health survey. Descriptive statistics were calculated for demographic and nutritional variables to provide an overview of the study population. Multiple supervised machine learning algorithms were employed to predict the effects of rising temperatures on nutritional status (underweight, stunting, and wasting), with models trained on a subset of the data and cross-validation used to assess their performance. The dataset was split into training and testing subsets using an 80:20 split (total number of participants N = 345,837; training samples = 276,669; testing samples = 69,168), ensuring that model validation was conducted on unseen data to prevent overfitting. Socioeconomic variables including household wealth, maternal education, and access to water and sanitation were incorporated in both causal-effect and machine-learning models to account for confounding and to explore potential effect modification of temperature impacts on child nutritional outcomes.
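The reported subset sizes are consistent with a simple 80:20 partition of the 345,837 records, as the sketch below shows. It uses a seeded random shuffle; whether the authors shuffled, stratified by outcome, or split by country is not stated, so the procedure and the function name are assumptions.

```python
import random


def split_indices(n_records: int, train_percent: int = 80, seed: int = 42):
    """Shuffle record indices and split them into train and test lists.

    Integer arithmetic is used so the training size is the floor of
    n_records * train_percent / 100.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = n_records * train_percent // 100
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(345837)
print(len(train_idx), len(test_idx))  # 276669 69168
```

Keeping the test indices fixed before any resampling or tuning is what allows the held-out set to stand in for unseen data.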

We implemented several machine learning classifiers, namely logistic regression, random forest, XGBoost, K-Nearest Neighbors, Support Vector Machine (SVM), and decision tree, to predict nutritional status. Logistic regression models the log-odds of the outcome as a linear function of predictors. Decision trees partition the feature space to reduce impurity, measured by the Gini index, and random forests aggregate predictions from multiple decision trees trained on bootstrap samples using majority voting⁵⁵. XGBoost sequentially builds trees to minimize logistic loss with regularization⁵⁶. SVM finds a hyperplane that best separates classes by maximizing the margin between them, optionally using kernel functions to handle nonlinear boundaries⁵⁷.
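As a concrete illustration of the impurity criterion mentioned above, the following sketch computes the Gini index of a set of binary labels and the weighted impurity of a candidate split. It is a textbook formulation for illustration, not code from the study, and the function names are hypothetical.

```python
from collections import Counter


def gini_impurity(labels) -> float:
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split_impurity(left_labels, right_labels) -> float:
    """Size-weighted Gini impurity of the two child nodes of a split."""
    n = len(left_labels) + len(right_labels)
    left_weight = len(left_labels) / n
    right_weight = len(right_labels) / n
    return (left_weight * gini_impurity(left_labels)
            + right_weight * gini_impurity(right_labels))


parent = [0, 0, 1, 1]          # evenly mixed node
print(gini_impurity(parent))   # 0.5
print(split_impurity([0, 0], [1, 1]))  # 0.0, a perfectly separating split
```

A tree-growing algorithm would evaluate `split_impurity` for each candidate threshold and keep the split with the largest reduction relative to the parent node.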

For model interpretability and to identify the most influential predictors of nutritional outcomes, we determined feature importance using two complementary strategies. In tree-based models (Random Forest, XGBoost), importance was quantified based on each feature’s mean decrease in impurity (Gini importance) or average gain improvement across splits, as shown in Figure S1 of the Supplementary materials. For logistic regression, standardized coefficients were considered as indicators of predictor influence. After model fitting, features were ranked by their calculated importance, and the top contributors were visualized to provide actionable insights for policy and intervention priorities, as described in a previous study⁵⁸. Potential multicollinearity among predictors was assessed in the logistic regression models to ensure robustness of feature selection, as shown in Table S3 of the Supplementary Materials.

To address class imbalance, particularly for the wasting and underweight categories, which had substantially fewer positive cases, we applied the Synthetic Minority Over-sampling Technique (SMOTE) prior to model training. SMOTE was applied to the training set only, creating synthetic examples in the minority class by interpolating between existing observations and their nearest neighbors in feature space, as described in previous studies^59,60. This approach improved the balance between classes and enhanced model recall and F1-score, especially for severely imbalanced outcomes. Because synthetic data can encourage overfitting and may distort the original distribution, all models were evaluated on an independent test set that was not exposed to the oversampling process. This ensures that the reported model performance reflects the ability to generalize to unseen, real-world data rather than being influenced by synthetic samples.
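The interpolation step at the heart of SMOTE can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation used in the study (which is not named in the text): the function `smote_sketch`, its parameters, and the toy minority points are all hypothetical.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by nearest-neighbor interpolation.

    Simplified sketch: brute-force Euclidean neighbors, continuous features only.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other points

    # Pairwise distances within the minority class; exclude self-matches.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # For each synthetic row: pick a base point and one of its k neighbors,
    # then place the new point at a random position on the segment between them.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy minority-class rows taken from a TRAINING split only (hypothetical values).
X_minority = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
)
synthetic = smote_sketch(X_minority, n_new=10, k=3)
print(synthetic.round(2))
```

Because each synthetic row lies on a segment between two real minority points, it never leaves the range of the observed minority data. Oversampling only the training split keeps the test set free of synthetic rows.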

To evaluate the performance of the classification models developed for nutritional status prediction, we used several widely accepted evaluation metrics. The primary performance metrics were accuracy, F1-score, and the area under the receiver operating characteristic curve (AUC-ROC). Accuracy quantifies the overall proportion of correctly classified instances, while the F1-score is the harmonic mean of precision and recall and therefore balances the two. All model performance metrics, including accuracy, precision, recall, F1-score, and AUC-ROC, were calculated using predictions on the independent test dataset, which was not used during model training or tuning.
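The metrics named above can be computed directly from test-set labels, predicted classes, and predicted scores. The sketch below uses only the Python standard library and made-up labels and scores; the function names are illustrative, and AUC-ROC is computed through its pairwise-ranking interpretation rather than by tracing the curve.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_roc(y_true, scores):
    """Share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up test-set labels and predicted probabilities (not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

In this toy case there are three true positives, three true negatives, one false positive, and one false negative, so accuracy, precision, recall, and F1 all equal 0.75, while 15 of the 16 positive/negative pairs are ranked correctly (AUC-ROC = 0.9375). Accuracy alone can look high on an imbalanced outcome, which is why F1 and AUC-ROC are reported alongside it.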

The relationship between rising temperatures and nutritional status was evaluated using multivariable logistic regression, with specific attention to the hypothesized negative impact of temperature increases on indicators of nutritional status among under-five children. Figure 5 shows the methodological approach used for the analysis, and more details on the cluster locations and countries included in the study are shown in Figure S2 of the Supplementary Materials.
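Odds ratios from a logistic regression are the exponentiated coefficients, which is how a per-degree estimate translates into a percentage change in odds. The short sketch below works through that arithmetic using the point estimates reported in the abstract (OR 1.01, 1.03, and 1.10 per 1 °C). The helper names are illustrative, and extending an estimate to a rise of more than 1 °C assumes the log-odds remain linear in temperature over that range.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient (per 1 °C) implied by an odds ratio."""
    return math.log(odds_ratio)


def percent_change_in_odds(odds_ratio):
    """Percentage change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_for_rise(odds_ratio_per_degree, degrees):
    """Odds ratio for a rise of `degrees` °C, assuming linearity in log-odds."""
    return math.exp(log_odds_coefficient(odds_ratio_per_degree) * degrees)


# Point estimates per 1 °C as reported in the abstract.
reported = {"stunting": 1.01, "underweight": 1.03, "wasting": 1.10}

for outcome, odds_ratio in reported.items():
    beta = log_odds_coefficient(odds_ratio)
    pct = percent_change_in_odds(odds_ratio)
    two_degree = odds_ratio_for_rise(odds_ratio, 2)
    print(
        f"{outcome:>12s}: beta = {beta:.4f}, "
        f"+{pct:.0f}% odds per 1 °C, OR for +2 °C = {two_degree:.3f}"
    )
```

Under the linearity assumption, the wasting estimate compounds to an odds ratio of about 1.21 for a 2 °C rise. Odds ratios approximate risk ratios only when the outcome is rare, so these figures describe changes in odds rather than in prevalence.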

All machine-learning analyses were performed using Python (via Jupyter Notebook) with standard libraries including scikit-learn, pandas, NumPy, and matplotlib. Causal inference analyses were conducted in R (version 4.4.1).

This study used publicly available secondary data; therefore, no ethical approval was required. Data use adhered to the guidelines and regulations outlined by the DHS and ECMWF.

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Agostoni, C., Baglioni, M., La Vecchia, A., Molari, G. & Berti, C. Interlinkages between climate change and food systems: the impact on child Malnutrition-Narrative review. Nutrients 15 https://doi.org/10.3390/nu15020416 (2023).
Myers, S. S. et al. Climate change and global food systems: potential impacts on food security and undernutrition. Annu. Rev. Public. Health. 38, 259–277. https://doi.org/10.1146/annurev-publhealth-031816-044356 (2017).
Wang, W., Mensah, I. A., Atingabili, S. & Omari-Sasu, A. Y. Climate change as a game changer: rethinking africa’s food security- health outcome nexus through a multi-sectoral lens. Sci. Rep. 15, 16824. https://doi.org/10.1038/s41598-025-99276-2 (2025).
Mirzabaev, A. et al. Severe climate change risks to food security and nutrition. Clim. Risk Manage. 39, 100473. https://doi.org/10.1016/j.crm.2022.100473 (2023).
Johnson, K. & Brown, M. E. Environmental risk factors and child nutritional status and survival in a context of climate variability and change. Appl. Geogr. 54, 209–221. https://doi.org/10.1016/j.apgeog.2014.08.007 (2014).
Jayne, T. S. & Sanchez, P. A. Agricultural productivity must improve in sub-Saharan Africa. Science 372, 1045–1047. https://doi.org/10.1126/science.abf5413 (2021).
Maja, M. M. & Ayano, S. F. The impact of population growth on natural resources and farmers’ capacity to adapt to climate change in Low-Income countries. Earth Syst. Environ. 5, 271–283. https://doi.org/10.1007/s41748-021-00209-6 (2021).
Giller, K. E. et al. Small farms and development in sub-Saharan africa: farming for food, for income or for lack of better options? Food Secur. 13, 1431–1454 (2021).
Lesk, C. et al. Compound heat and moisture extreme impacts on global crop yields under climate change. Nat. Reviews Earth Environ. 3, 872–889. https://doi.org/10.1038/s43017-022-00368-8 (2022).
Nega, A. Climate change impacts on agriculture: A Review of plant diseases and insect pests in Ethiopia and East Africa, with adaptation and mitigation strategies. Adv. Agric. 5606701. https://doi.org/10.1155/aia/5606701 (2025).
Wudil, A. H., Usman, M., Rosak-Szyrocka, J., Pilař, L. & Boye, M. Reversing years for global food security: A review of the food security situation in Sub-Saharan Africa (SSA). Int. J. Environ. Res. Public. Health. 19 https://doi.org/10.3390/ijerph192214836 (2022).
Erenstein, O., Jaleta, M., Sonder, K., Mottaleb, K. & Prasanna, B. M. Global maize production, consumption and trade: trends and R&D implications. Food Secur. 14, 1295–1319. https://doi.org/10.1007/s12571-022-01288-7 (2022).
Siderius, C., van der Velde, Y., Gülpen, M., de Bruin, S. & Biemans, H. Improved water management can increase food self-sufficiency in urban foodsheds of Sub-Saharan Africa. Global Food Secur. 42, 100787. https://doi.org/10.1016/j.gfs.2024.100787 (2024).
Yadava, P. K., Kumar, H., Singh, A., Kumar, V. & Verma, S. In Visualization Techniques for Climate Change with Machine Learning and Artificial Intelligence 39–54 (Elsevier, 2023).
Khudri, M. M., Rhee, K. K., Hasan, M. S. & Ahsan, K. Z. Predicting nutritional status for women of childbearing age from their economic, health, and demographic features: A supervised machine learning approach. PLOS ONE. 18, e0277738. https://doi.org/10.1371/journal.pone.0277738 (2023).
Jha, P., Chinngaihlian, S., Upreti, P. & Handa, A. A machine learning approach to assess implications of climate risk factors on agriculture: the Indian case. Clim. Risk Manage. 41, 100523. https://doi.org/10.1016/j.crm.2023.100523 (2023).
Chen, L. et al. Machine learning methods in weather and climate applications: A survey. Appl. Sci. 13, 12019. https://doi.org/10.3390/app132112019 (2023).
Ssebyala, S. N. et al. Use of machine learning tools to predict health risks from climate-sensitive extreme weather events: A scoping review. PLOS Clim. 3, e0000338 (2024).
Zirbo, S. G., Hoszu, B. S., Dioşan, L. S., Coroiu, A. M. & Croitoru, A. E. Predicting health outcomes using weather data: A dual machine learning approach. Procedia Comput. Sci. 246, 1399–1408 (2024).
Yoo, C., Ramirez, L. & Liuzzi, J. Big data analysis using modern statistical and machine learning methods in medicine. Int. Neurourol. J. 18, 50–57. https://doi.org/10.5213/inj.2014.18.2.50 (2014).
Venkatesh, R., Balasubramanian, C. & Kaliappan, M. Development of big data predictive analytics model for disease prediction using machine learning technique. J. Med. Syst. 43, 272. https://doi.org/10.1007/s10916-019-1398-y (2019).
Uddin, S., Khan, A., Hossain, M. E. & Moni, M. A. Comparing different supervised machine learning algorithms for disease prediction. BMC Med. Inf. Decis. Mak. 19, 281. https://doi.org/10.1186/s12911-019-1004-8 (2019).
Talukder, A. & Ahammed, B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh. Nutrition 78, 110861. https://doi.org/10.1016/j.nut.2020.110861 (2020).
Begashaw, G. B., Zewotir, T. & Fenta, H. M. A deep learning approach for classifying and predicting children’s nutritional status in Ethiopia using LSTM-FC neural networks. BioData Min. 18, 11. https://doi.org/10.1186/s13040-025-00425-0 (2025).
Ngusie, H. S. et al. Employing machine learning techniques for prediction of micronutrient supplementation status during pregnancy in East African countries. Sci. Rep. 14, 23827. https://doi.org/10.1038/s41598-024-75455-5 (2024).
WHO. WHO Child Growth Standards based on length/height, weight and age. (2006).
Gettelman, A. et al. The future of Earth system prediction: advances in model-data fusion. Sci. Adv. 8, eabn3488. https://doi.org/10.1126/sciadv.abn3488 (2022).
Berrang-Ford, L. et al. Systematic mapping of global research on climate and health: a machine learning review. Lancet Planet. Health. 5, e514–e525 (2021).
Yang, M. & Zou, Y. Assessing environmental determinants of subjective well-being via machine learning approaches: a systematic review. Humanit. Social Sci. Commun. 12, 828. https://doi.org/10.1057/s41599-025-05234-8 (2025).
Selemani, B., Machuve, D. & Mduma, N. Machine learning model for predicting fetal nutritional status. Comput. Ecol. Softw. 14, 68 (2024).
Turjo, E. A. & Rahman, M. H. Assessing risk factors for malnutrition among women in Bangladesh and forecasting malnutrition using machine learning approaches. BMC Nutr. 10, 22. https://doi.org/10.1186/s40795-023-00808-8 (2024).
Islam, M. M. et al. Application of machine learning based algorithm for prediction of malnutrition among women in Bangladesh. Int. J. Cogn. Comput. Eng. 3, 46–57. https://doi.org/10.1016/j.ijcce.2022.02.002 (2022).
Bitew, F. H., Sparks, C. S. & Nyarko, S. H. Machine learning algorithms for predicting undernutrition among under-five children in Ethiopia. Public Health. Nutr. 25, 269–280. https://doi.org/10.1017/S1368980021004262 (2022).
Qasrawi, R. et al. Machine learning approach for predicting the impact of food insecurity on nutrient consumption and malnutrition in children aged 6 months to 5 years. Children 11, 810 (2024).
Tusting, L. S. et al. Environmental temperature and growth faltering in African children: a cross-sectional study. Lancet Planet. Health. 4, e116–e123. https://doi.org/10.1016/S2542-5196(20)30037-1 (2020).
Baker, R. E. & Anttila-Hughes, J. Characterizing the contribution of high temperatures to child undernourishment in Sub-Saharan Africa. Sci. Rep. 10, 18796. https://doi.org/10.1038/s41598-020-74942-9 (2020).
Mishra, R. & Bera, S. Geospatial and environmental determinants of stunting, wasting, and underweight: empirical evidence from rural South and Southeast Asia. Nutrition 120, 112346. https://doi.org/10.1016/j.nut.2023.112346 (2024).
Kemajou Njatang, D., Bouba Djourdebbé, F. & Adda Wadou, N.D. Climate variability, armed conflicts and child malnutrition in sub-saharan africa: A Spatial analysis in Ethiopia, Kenya and Nigeria. Heliyon 9, e21672. https://doi.org/10.1016/j.heliyon.2023.e21672 (2023).
van der Merwe, E., Clance, M. & Yitbarek, E. Climate change and child malnutrition: A Nigerian perspective. Food Policy. 113, 102281. https://doi.org/10.1016/j.foodpol.2022.102281 (2022).
Adesete, A. A., Olanubi, O. E. & Dauda, R. O. Climate change and food security in selected Sub-Saharan African countries. Environ. Dev. Sustain. 25, 14623–14641. https://doi.org/10.1007/s10668-022-02681-0 (2023).
Tamasiga, P., Onyeaka, H., Akinsemolu, A. & Bakwena, M. The Inter-Relationship between climate Change, Inequality, poverty and food security in africa: A bibliometric review and content analysis approach. Sustainability 15, 5628. https://doi.org/10.3390/su15075628 (2023).
Grace, K. et al. Conflict and climate factors and the risk of child acute malnutrition among children aged 24–59 months: A comparative analysis of Kenya, Nigeria, and Uganda. Spat. Demography. 10, 329–358. https://doi.org/10.1007/s40980-021-00102-w (2022).
Sahoo, P. M., Rout, H. S. & Jakovljevic, M. Future health expenditure in the BRICS countries: a forecasting analysis for 2035. Globalization Health. 19, 49 (2023).
Kumar, V., Tripathi, T., Raj, A. & Kumar, P. Investigating the geographic linkage between airborne pollutants and tuberculosis rates in India. Discover Public. Health. 22, 660 (2025).
Sahoo, P. M., Rout, H. S. & Jakovljevic, M. Dynamics of health financing among the BRICS: a literature review. Sustainability 15, 12385 (2023).
Kumar, V. & Tripathi, T. Timely access to public health facilities for pregnancy care in tribal Gujarat. Economic Political Wkly. 59, 51 (2024).
Lawal, S. A., Okunlola, D. A., Adegboye, O. A. & Adedeji, I. A. Mother’s education and nutritional status as correlates of child stunting, wasting, underweight, and overweight in nigeria: evidence from 2018 demographic and health survey. Nutr. Health. 30, 821–830. https://doi.org/10.1177/02601060221146320 (2024).
Li, H. et al. Prevalence and associated factors for stunting, underweight and wasting among children under 6 years of age in rural Hunan Province, china: a community-based cross-sectional study. BMC Public. Health. 22, 483. https://doi.org/10.1186/s12889-022-12875-w (2022).
Siddiqa, M., Zubair, A., Kamal, A., Ijaz, M. & Abushal, T. Prevalence and associated factors of stunting, wasting and underweight of children below five using quintile regression analysis (PDHS 2017–2018). Sci. Rep. 12, 20326. https://doi.org/10.1038/s41598-022-24063-2 (2022).
Kumar, V., Sahoo, P. M., Tripathi, T. & Rout, H. S. Disparities in Spatial access to healthcare facilities for pregnant women: a case study of a hilly region in India. BMC Pregn. Childbirth (2025).
Géa-Horta, T., Felisbino-Mendes, M. S., Ortiz, R. J. & Velasquez-Melendez, G. Association between maternal socioeconomic factors and nutritional outcomes in children under 5 years of age. J. Pediatr. (Rio J). 92, 574–580. https://doi.org/10.1016/j.jped.2016.02.010 (2016).
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049. https://doi.org/10.1002/qj.3803 (2020).
Urban, A. et al. Evaluation of the ERA5 reanalysis-based universal thermal climate index on mortality data in Europe. Environ. Res. 198, 111227 (2021).
Di Napoli, C. et al. The role of global reanalyses in climate services for health: insights from the lancet countdown. Meteorol. Appl. 30, e2122 (2023).
Becker, T., Rousseau, A. J., Geubbelmans, M., Burzykowski, T. & Valkenborg, D. Decision trees and random forests. Am. J. Orthod. Dentofac. Orthop. 164, 894–897. https://doi.org/10.1016/j.ajodo.2023.09.011 (2023).
Didrik, N. Tree Boosting with Xgboost-why Does Xgboost Win Every Machine Learning competition? (NTNU, 2016).
Huang, S. et al. Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genomics Proteom. 15, 41–51. https://doi.org/10.21873/cgp.20063 (2018).
Amarasinghe, K., Rodolfa, K. T., Lamba, H. & Ghani, R. Explainable machine learning for public policy: use cases, gaps, and research directions. Data Policy. 5 (e5). https://doi.org/10.1017/dap.2023.2 (2023).
Elreedy, D. & Atiya, A. F. A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class imbalance. Inf. Sci. 505, 32–64. https://doi.org/10.1016/j.ins.2019.07.070 (2019).
Elreedy, D., Atiya, A. F. & Kamalov, F. A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning. Mach. Learn. 113, 4903–4923. https://doi.org/10.1007/s10994-022-06296-4 (2024).

Acknowledgements

J.B., A.M., and B.T. were supported by the National Institutes of Health (NIH) through Research Training on Harnessing Data Science for Global Health Priorities in Africa (1U2RTW012140-01). H.K. was funded by the National Natural Science Foundation of China (82430105), the Shanghai Municipal Science and Technology Major Project, and the Shanghai B&R Joint Laboratory Project. H.K. and R.C. were supported by the Shanghai International Science and Technology Partnership Project (21230780200).

Funding

This research was funded by the National Institutes of Health (NIH) through Research Training on Harnessing Data Science for Global Health Priorities in Africa (1U2RTW012140-01).

Author information

Authors and Affiliations

Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Jovine Bachwenkizi, Cheng He, Alice Mugisha, Boikhutso Tlou & Wafaie W. Fawzi
School of Public Health, Key Lab of Public Health Safety of the Ministry of Education, NHC Key Lab of Health Technology Assessment, IRDR ICoE on Risk, Interconnectivity and Governance on Weather/Climate Extremes Impact and Public Health, Fudan University, Shanghai, China
Yixiang Zhu, Renjie Chen & Haidong Kan
Department of Information Technology, School of Computing and Informatics Technology, College of Computing and Information Sciences, Makerere University, Kampala, Uganda
Alice Mugisha
Department of Public Health Medicine, Faculty of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
Boikhutso Tlou
Department of Epidemiology and Biostatistics, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
Candida Moshiro
School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
Henry Mwambi
Division of Community Health Sciences, School of Public Health, University of California, Berkeley, CA, USA
Isabel Madzorera
Department of Environmental and Occupational Health, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
Jovine Bachwenkizi
Children’s Hospital of Fudan University, National Center for Children’s Health, Shanghai, China
Haidong Kan
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Wafaie W. Fawzi
Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Wafaie W. Fawzi

Authors

Jovine Bachwenkizi
Cheng He
Yixiang Zhu
Alice Mugisha
Boikhutso Tlou
Candida Moshiro
Henry Mwambi
Isabel Madzorera
Renjie Chen
Haidong Kan
Wafaie W. Fawzi

Contributions

J.B. contributed to the conceptualization, conducted the statistical analysis, and took the lead in drafting the manuscript and interpreting the results. C.H., Y.Z., A.M., and B.T. contributed to data collection, interpretation of the results, and drafting of the manuscript. C.M., H.M., I.M., R.C., H.K., and W.F. contributed to funding acquisition, project administration, and supervision of the study. All authors contributed to the development of the manuscript and approved the final draft.

Corresponding author

Correspondence to Jovine Bachwenkizi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1 (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Bachwenkizi, J., He, C., Zhu, Y. et al. Predicting the effects of temperature variability on nutritional status of children under five in Sub-Saharan Africa using machine learning. Sci Rep 16, 8055 (2026). https://doi.org/10.1038/s41598-026-39659-1

Received: 28 August 2025
Accepted: 06 February 2026
Published: 10 February 2026
Version of record: 04 March 2026
DOI: https://doi.org/10.1038/s41598-026-39659-1