Multimodal deep learning for international investment arbitration outcome prediction and bilateral investment agreement negotiation strategy optimization

Wu, Hao; Xu, Jiajun

doi:10.1038/s41598-026-47149-7

Article
Open access
Published: 03 April 2026

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

International investment arbitration has expanded at a remarkable pace over the past two decades, generating pressing demand for robust outcome prediction tools that can guide strategic decisions. This study presents a multimodal deep learning framework that fuses textual, numerical, and visual data to predict arbitration outcomes in the investor-state dispute settlement context. Our attention-based fusion architecture channels legal documents, macroeconomic indicators, and visual evidence through dedicated encoders capable of capturing intricate cross-modal dependencies that shape tribunal reasoning. Evaluated on 1,247 arbitration cases drawn from major international institutions, the multimodal model attains an overall accuracy of 86.7%, surpassing single-modality counterparts by 7.8 percentage points and conventional machine learning baselines by 14.6 percentage points. Feature importance analysis reveals that the quality of legal argumentation, dispute monetary value, and arbitrator panel composition rank among the most decisive determinants of outcomes. Beyond their technical value, these findings equip investors, host states, and legal counsel with evidence-based tools for strategic planning, while simultaneously foregrounding normative questions about fairness, transparency, and equitable access to predictive technologies in dispute resolution.
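The abstract does not spell out the fusion mechanism, so the following is only a minimal sketch of one common form of attention-based fusion, not the authors' implementation. It assumes each modality encoder has already produced a fixed-length embedding, and that fusion is a softmax-weighted sum of those embeddings scored against a single query vector. The function name `attention_fuse`, the `query` vector, the embedding size of 8, and the random stand-in embeddings are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse per-modality embeddings into a single vector.

    embeddings: dict of modality name -> list[float], all of length d
    query:      list[float] of length d (learned in a real model, fixed here)
    Returns (fused_vector, attention_weight_per_modality).
    """
    names = list(embeddings)
    d = len(query)
    # Scaled dot-product score between the query and each modality embedding.
    scores = [
        sum(q * x for q, x in zip(query, embeddings[name])) / math.sqrt(d)
        for name in names
    ]
    weights = softmax(scores)
    # Attention-weighted sum of the modality embeddings.
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(d)
    ]
    return fused, dict(zip(names, weights))


# Hypothetical stand-ins for encoder outputs (assumption: shared size d = 8).
rng = random.Random(0)
d = 8
embeddings = {
    "text": [rng.gauss(0, 1) for _ in range(d)],
    "numeric": [rng.gauss(0, 1) for _ in range(d)],
    "visual": [rng.gauss(0, 1) for _ in range(d)],
}
query = [rng.gauss(0, 1) for _ in range(d)]

fused, weights = attention_fuse(embeddings, query)
print(weights)  # one non-negative weight per modality, summing to 1
```

In a trained model the query (and usually key and value projections) would be learned, so the weights would indicate how much each modality contributes to a given prediction; here they only demonstrate the mechanics.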

Data availability

The datasets generated and analyzed during the current study are available through a comprehensive Supplementary File designed to maximize research transparency and enable full replication. Our dataset comprises exclusively publicly available materials from official arbitration institution databases and does not include any confidential arbitration documents, sealed memorials, or proprietary case analysis. To ensure complete transparency regarding this distinction, we clarify that: (1) all 1,247 cases in our dataset were obtained from publicly accessible sources where the arbitration proceedings and awards have been officially published or disclosed by the respective institutions; (2) no sealed or confidential case materials were accessed or incorporated; and (3) the term “confidentiality restrictions” in our previous draft referred to our inability to redistribute copyrighted full-text documents rather than any use of non-public materials.

Regarding full-text redistribution, we acknowledge that platforms such as italaw.com redistribute arbitration award texts under specific licensing arrangements with arbitration institutions. As an academic research project, we do not possess equivalent redistribution licenses for the complete corpus of award texts. However, researchers can readily obtain all original source documents from the publicly accessible databases listed below, using the case identifiers we provide.

Publicly available arbitration case data can be accessed through the following official repositories: ICSID Cases Database (https://icsid.worldbank.org/cases/case-database), UNCITRAL Case Repository (https://uncitral.un.org), Permanent Court of Arbitration Case Repository (https://pca-cpa.org), Investment Treaty Arbitration Database (https://investmentpolicy.unctad.org/investment-dispute-settlement), and italaw Investment Treaty Arbitration (https://www.italaw.com).

To enable complete replication of our analysis, we provide Supplementary File 1, a comprehensive replication package containing all materials necessary for reproducing our results. This file includes: (1) Complete Case List providing case identifiers for all 1,247 cases (ICSID case numbers, ICC references, LCIA numbers, etc.), case names and parties, arbitration institution and procedural rules, award dates and outcome classifications, and direct URLs to publicly accessible award documents where available; (2) Extracted Feature Dataset containing all 127 engineered features in CSV format for each case, including 45 textual features (BERT-derived semantic scores), 64 numerical features (case characteristics, financial data), and 18 visual features (CNN-extracted representations), enabling researchers to reproduce our model training without re-processing original documents; (3) Complete Source Code with Python scripts for data preprocessing, feature extraction, model architecture implementation, training procedures, and evaluation metrics, including all library dependencies and version specifications; (4) Model Specifications documenting complete hyperparameter configurations, training procedures, and random seeds; and (5) Replication Guide with step-by-step instructions for obtaining original documents from public databases, reproducing our preprocessing pipeline, training the multimodal fusion model, and validating results against our reported performance metrics.

For researchers seeking to replicate our entire pipeline from original documents, we provide detailed data acquisition protocols specifying exactly which database queries and filters retrieve each case in our dataset. The feature extraction code processes standard arbitration award formats from the major institutions, enabling researchers to generate identical feature representations from the original documents. Model checkpoints (trained weights) are available upon reasonable request for academic research purposes, subject to completion of a data use agreement restricting use to non-commercial research. Researchers interested in collaboration or data access should contact the corresponding author at asd18103929689@163.com with specific details of intended use and institutional affiliation.
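As an illustration of how the extracted feature dataset described above might be consumed, the sketch below splits one 127-feature row into its textual (45), numerical (64), and visual (18) blocks. The block sizes come from the statement above; everything else is an assumption made for the example, including the column order (textual, then numerical, then visual), the `case_id` and `f1` to `f127` column names, and the synthetic in-memory CSV that stands in for the real file.

```python
import csv
import io

# Block sizes as stated in the data availability statement (45 + 64 + 18 = 127).
# Assumption: columns are ordered textual, numerical, visual.
BLOCKS = (("textual", 45), ("numerical", 64), ("visual", 18))
N_FEATURES = sum(size for _, size in BLOCKS)


def split_features(values):
    """Split a flat list of 127 feature values into per-modality blocks."""
    if len(values) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(values)}")
    blocks, start = {}, 0
    for name, size in BLOCKS:
        blocks[name] = values[start:start + size]
        start += size
    return blocks


# Synthetic stand-in for the feature CSV (hypothetical column names and values).
header = ["case_id"] + [f"f{i}" for i in range(1, N_FEATURES + 1)]
row = ["CASE-0001"] + [str(i / 100) for i in range(N_FEATURES)]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(row)
buffer.seek(0)

# Read the row back by column name and split it by modality.
record = next(csv.DictReader(buffer))
values = [float(record[name]) for name in header[1:]]
blocks = split_features(values)
print({name: len(vals) for name, vals in blocks.items()})
# {'textual': 45, 'numerical': 64, 'visual': 18}
```

Checking the row length before slicing is a cheap guard against a schema mismatch between the published CSV and whatever layout a replication script assumes.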

Funding

No funding was received for this study. The research was carried out using institutional resources and publicly available data without external financial support.

Author information

Authors and Affiliations

School of Public Administration, Hohai University, Nanjing, 211100, Jiangsu, China
Hao Wu & Jiajun Xu

Contributions

Hao Wu conceptualized the research framework, designed the multimodal deep learning architecture, conducted the computational experiments, performed the statistical analysis, and drafted the manuscript. Hao Wu also developed the data preprocessing pipeline, implemented the attention-based fusion mechanisms, and carried out the feature importance analysis and model validation procedures. Jiajun Xu contributed to the theoretical framework development, participated in the literature review and background research, assisted with data collection and preprocessing, and provided critical review and revision of the manuscript. Jiajun Xu also contributed to the bilateral investment agreement analysis framework and supported the interpretation of legal and policy implications. Both authors collaborated on the research design, methodology development, results interpretation, and manuscript preparation. All authors read and approved the final manuscript for publication.

Corresponding author

Correspondence to Hao Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval

This study was approved by the Research Ethics Committee of Hohai University (Ethics Approval Number: HHU-2024-REC-089). The work involved analysis of publicly available arbitration case documents and did not require informed consent, as no human subjects were directly involved. All case data were obtained from publicly accessible databases and institutional repositories in compliance with applicable data protection regulations. The study protocol adhered to ethical guidelines for legal research involving secondary data analysis.

AI usage disclosure

In accordance with the Scientific Reports policy on the use of artificial intelligence, we declare that no generative AI tools (such as ChatGPT, Claude, Bard, or similar large language models) were employed in the drafting, writing, editing, or revision of this manuscript. All text, analysis, and interpretations presented in this paper were produced entirely by the human authors. While the research itself concerns deep learning and artificial intelligence methods applied to legal prediction, the manuscript preparation process did not involve any AI-assisted writing or content generation tools.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wu, H., Xu, J. Multimodal deep learning for international investment arbitration outcome prediction and bilateral investment agreement negotiation strategy optimization. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47149-7

Received: 02 August 2025
Accepted: 30 March 2026
Published: 03 April 2026
DOI: https://doi.org/10.1038/s41598-026-47149-7
