Abstract
Assigning experts to project proposals is a critical process in research evaluation. Traditional Information Retrieval (IR) systems, such as the Single Evaluation Platform (SEP) used by the European Research Executive Agency, automatically assign experts based on keyword matching, and Vice Chairs (VCs) subsequently review and correct these assignments to ensure suitability. To address the limitations of keyword-based systems and improve semantic relevance, we developed a novel expert assignment system based on Natural Language Processing with Large Language Models (LLMs). Our approach combines dynamic retrieval of expert publications via ORCID with GALACTICA, a specialized scientific LLM, to compute fine-grained semantic similarity between publications and proposal abstracts. Using a dataset of 48 experts and 181 proposals, we evaluated three strategies for aggregating an expert's publication-level similarities into a single score: Sum, Product, and Maximum. The Maximum strategy most closely replicated the VC-reviewed assignments, achieving an AUC of 0.82 and significantly outperforming the traditional SEP system (AUC = 0.75), Sum (AUC = 0.69), and Product (AUC = 0.57). These results indicate that focusing on the single most relevant publication effectively captures human decision-making, and they highlight the potential of LLM-based semantic matching as a more accurate and scalable alternative to existing IR systems. Furthermore, unlike SEP’s discrete affinity scores, our aggregation strategies produce highly discriminative, fine-grained ratings that allow more nuanced differentiation among candidate experts.
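To make the three aggregation strategies concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each publication and each proposal abstract has already been turned into an embedding vector and that similarity is the cosine between vectors; the abstract does not specify either detail. The toy two-dimensional vectors, the function names (`cosine`, `expert_score`, `auc`) and the labels are all hypothetical and stand in for GALACTICA-derived representations and VC-reviewed assignments.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# The three strategies named in the abstract, applied to the list of
# publication-level similarities of one expert.
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "product": math.prod,
    "max": max,
}


def expert_score(pubs: List[Vector], proposal: Vector, strategy: str) -> float:
    """Aggregate publication-to-proposal similarities into one expert score."""
    sims = [cosine(p, proposal) for p in pubs]
    return AGGREGATORS[strategy](sims)


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one proposal and two candidate experts.
proposal = [1.0, 0.0]
experts = {
    "A": [[1.0, 0.0], [0.0, 1.0]],              # one exact match, one unrelated paper
    "B": [[0.6, 0.8], [0.6, 0.8], [0.6, 0.8]],  # three moderately related papers
}
labels = [1, 0]  # hypothetical ground truth: expert A was assigned, B was not

results = {
    name: auc([expert_score(pubs, proposal, name) for pubs in experts.values()], labels)
    for name in AGGREGATORS
}
```

In this toy setting the Maximum strategy ranks expert A first because of the single exact match, while Sum rewards expert B for having more moderately related publications and Product penalises expert A for the one unrelated paper. The example only illustrates how the strategies can disagree; it says nothing about the magnitudes reported in the abstract.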
Data availability
The data for replication are available at this link: https://dataverse.harvard.edu/previewurl.xhtml?token=9ddf76ed-c05f-4b96-a2da-f1f4ad79dfcd
Acknowledgements
This work was partially supported by the Capgemini-University of Valencia Chair for Innovation in Software Development.
Disclaimer
All views expressed in this article are strictly those of the authors and may in no circumstances be regarded as an official position of the Research Executive Agency or the European Commission.
Funding
EA-G is supported by the Spanish Ministry of Science, Innovation and Universities through the FPU doctoral fellowship program [grant number FPU21/00570].
Author information
Contributions
EA-G designed the study, built the dataset, designed and executed the analysis and revised the manuscript. DG-C designed the study, built the dataset, executed the analysis and revised the manuscript. IW provided data and revised the manuscript. AM coordinated part of the data collection, designed the study and revised the manuscript. FG coordinated and designed the study, revised the analysis and revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Álvarez-García, E., García-Costa, D., De Waele, I. et al. Expert assignment system based on natural language processing for Marie Sklodowska-Curie actions. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37115-8
DOI: https://doi.org/10.1038/s41598-026-37115-8


