Scientific Reports
Expert assignment system based on natural language processing for Marie Sklodowska-Curie actions
  • Article
  • Open access
  • Published: 27 January 2026

  • Elena Álvarez-García1,
  • Daniel García-Costa1,
  • Ilse De Waele2,
  • Ana Marusic3 &
  • Francisco Grimaldo1 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Computer science
  • Scientific data
  • Software

Abstract

Assigning experts to project proposals is a critical process in research evaluation. Traditional Information Retrieval (IR) systems, such as the Single Evaluation Platform (SEP) used by the European Research Executive Agency, automatically assign experts based on keyword matching, but these assignments are subsequently reviewed and corrected by Vice Chairs (VCs) to ensure suitability. To address the limitations of keyword-based systems and enhance semantic relevance, we developed a novel expert assignment system leveraging Natural Language Processing with Large Language Models (LLMs). Our approach integrates dynamic retrieval of expert publications via ORCID with GALACTICA, a specialized scientific LLM, to compute fine-grained semantic similarity between publications and proposal abstracts. Using a dataset of 48 experts and 181 proposals, we evaluated three similarity aggregation strategies: Sum, Product, and Maximum. The Maximum similarity approach most closely replicated the VC-reviewed assignments, achieving an AUC of 0.82 and significantly outperforming the traditional SEP system (AUC = 0.75), Sum (AUC = 0.69), and Product (AUC = 0.57). These results demonstrate that focusing on the single most relevant match effectively captures human decision-making, highlighting the potential of LLM-based semantic matching to provide a more accurate and scalable alternative to existing IR systems. Furthermore, unlike SEP’s discrete affinity scores, our aggregation strategies produce highly discriminative, fine-grained ratings, allowing for more nuanced differentiation among candidate experts.
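
The three aggregation strategies and the AUC comparison described above can be illustrated with a minimal sketch. It assumes that each expert-proposal pair is represented by a list of per-publication similarity scores in [0, 1] (how GALACTICA produces those scores is not covered here) and that AUC is measured against binary labels marking whether the VCs kept the assignment. The function names `aggregate` and `auc` and all numbers are hypothetical; this is not the authors' implementation or data.

```python
from math import prod


def aggregate(similarities, strategy):
    """Collapse per-publication similarities into one expert-proposal score."""
    if not similarities:
        # Assumption: an expert with no retrievable publications scores 0.
        return 0.0
    if strategy == "sum":
        return sum(similarities)
    if strategy == "product":
        return prod(similarities)
    if strategy == "max":
        return max(similarities)
    raise ValueError(f"unknown strategy: {strategy}")


def auc(scores, labels):
    """Rank-based AUC: chance that a positive pair outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: (similarities of one expert's publications to one
# proposal abstract, 1 if the VC-reviewed assignment kept this expert).
pairs = [
    ([0.91, 0.20, 0.15], 1),
    ([0.40, 0.42, 0.45, 0.41], 0),
    ([0.85, 0.30], 1),
    ([0.35, 0.30, 0.33], 0),
]
labels = [y for _, y in pairs]

for strategy in ("sum", "product", "max"):
    scores = [aggregate(sims, strategy) for sims, _ in pairs]
    print(strategy, auc(scores, labels))
```

The toy values are chosen only to exercise the code and say nothing about the reported results. The sketch does show one structural difference between the strategies: Sum and Product depend on how many publications an expert has, whereas Maximum depends only on the single closest publication.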

Data availability

The data for replication are available at this link: https://dataverse.harvard.edu/previewurl.xhtml?token=9ddf76ed-c05f-4b96-a2da-f1f4ad79dfcd

Acknowledgements

This work was partially supported by the Capgemini-University of Valencia Chair for Innovation in Software Development.

Disclaimer

All views expressed in this article are strictly those of the authors and may in no circumstances be regarded as an official position of the Research Executive Agency or the European Commission.

Funding

EA-G is supported by the Spanish Ministry of Science, Innovation and Universities through the FPU doctoral fellowship program [grant number FPU21/00570].

Author information

Authors and Affiliations

  1. Department of Computer Science, University of Valencia, Burjassot, Spain

    Elena Álvarez-García, Daniel García-Costa & Francisco Grimaldo

  2. European Research Executive Agency (REA), European Commission, Brussels, Belgium

    Ilse De Waele

  3. Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia

    Ana Marusic

Contributions

EA-G designed the study, built the dataset, designed and executed the analysis and revised the manuscript. DG-C designed the study, built the dataset, executed the analysis and revised the manuscript. IW provided data and revised the manuscript. AM coordinated part of the data collection, designed the study and revised the manuscript. FG coordinated and designed the study, revised the analysis and revised the manuscript.

Corresponding author

Correspondence to Francisco Grimaldo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Álvarez-García, E., García-Costa, D., De Waele, I. et al. Expert assignment system based on natural language processing for Marie Sklodowska-Curie actions. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37115-8

  • Received: 23 June 2025

  • Accepted: 19 January 2026

  • Published: 27 January 2026

  • DOI: https://doi.org/10.1038/s41598-026-37115-8
