  • Analysis

Monitoring global development aid with machine learning

Abstract

Monitoring global development aid provides important evidence for policymakers financing the Sustainable Development Goals (SDGs). To overcome the limitations of existing monitoring, we develop a machine learning framework that enables a comprehensive and granular categorization of development aid activities based on their textual descriptions. Specifically, we cluster the descriptions of ~3.2 million aid activities conducted between 2000 and 2019 totalling US$2.8 trillion. As a result, we generated 173 activity clusters representing the topics of underlying aid activities. Among them, 70 activity clusters cover topics that have not yet been analysed empirically (for example, greenhouse gas emissions reduction and maternal health care). On the basis of our activity clusters, global development aid can be monitored for new topics and at new levels of granularity, allowing the identification of unexplored spatio-temporal disparities. Our framework can be adopted by development finance and policy institutions to promote evidence-based decisions targeting the SDGs.
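The core idea of clustering activities by their textual descriptions can be illustrated with a small sketch. This is not the authors' pipeline: it substitutes tf-idf vectors and a plain spherical k-means for whatever text representation and clustering settings the framework actually uses, and the four descriptions, the function names and the fixed seed indices are all invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenised documents into L2-normalised tf-idf vectors (dicts)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def spherical_kmeans(vectors, seeds, iterations=10):
    """Assign each vector to its most similar centroid, then re-average."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        for k in range(len(centroids)):
            total = {}
            for v, label in zip(vectors, labels):
                if label == k:
                    for t, w in v.items():
                        total[t] = total.get(t, 0.0) + w
            if total:  # keep the old centroid if the cluster is empty
                norm = math.sqrt(sum(w * w for w in total.values()))
                centroids[k] = {t: w / norm for t, w in total.items()}
    return labels


# Hypothetical activity descriptions (not taken from the underlying data).
descriptions = [
    "maternal health care clinic training",
    "maternal health midwife training",
    "solar power emissions reduction",
    "wind power emissions reduction",
]
tokenised = [d.split() for d in descriptions]
# Seeds are fixed here only to keep the toy example deterministic.
labels = spherical_kmeans(tfidf_vectors(tokenised), seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

With real data, the number of clusters and the initialisation would have to be chosen and validated rather than fixed by hand as in this toy case.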


Fig. 1: Overview of machine learning framework to generate activity clusters.
Fig. 2: Overview of activity clusters generated by the machine learning framework.
Fig. 3: Overview of activity clusters representing ‘newly captured’ activity clusters.
Fig. 4: Global distribution of development aid allocated to recipient countries.

Data availability

Activity clusters can be monitored interactively via https://maltetoetzke.github.io/Monitoring-Global-Development-Aid/. The underlying data can be retrieved via https://github.com/MalteToetzke/Monitoring-Global-Development-Aid-With-Machine-Learning. For access to the raw data, please contact the Development Assistance Committee (DAC) of the OECD.

Code availability

The scripts used for preprocessing the data and generating activity clusters can be retrieved via https://github.com/MalteToetzke/Monitoring-Global-Development-Aid-With-Machine-Learning. Analysis scripts are available on request from M.T.

Acknowledgements

We thank the SDG financing lab of the OECD for the provision of the raw data and the mutual exchange over the course of this study. Furthermore, we would like to thank all researchers from the Swiss Federal Institute of Technology (ETH Zurich) who helped us in evaluating and naming activity clusters.

Author information

Contributions

M.T. performed data analysis and visualized the results. All the authors contributed to the conceptualization, interpretation of the results and the writing of the paper.

Corresponding author

Correspondence to Malte Toetzke.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Sustainability thanks Max Callaghan, Lynn Kaack and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Discussions 1–5, Figs. 1–12 and Tables 1–9.

Reporting Summary.

Supplementary Data 1

Descriptive statistics of activity clusters from Supplementary Table 9.

About this article

Cite this article

Toetzke, M., Banholzer, N. & Feuerriegel, S. Monitoring global development aid with machine learning. Nat Sustain 5, 533–541 (2022). https://doi.org/10.1038/s41893-022-00874-z

  • DOI: https://doi.org/10.1038/s41893-022-00874-z
