Scientific Reports
LingualX64: a multilingual benchmark for evaluating symmetry and asymmetry in LLM translation
  • Article
  • Open access
  • Published: 26 April 2026

  • Yan Huang [1],
  • Wei Liu [1],
  • Jiayi Wang [1, na1] &
  • Huidong Zhu [1, na1]

Scientific Reports (2026) Cite this article

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Language and linguistics
  • Mathematics and computing

Abstract

Large Language Models (LLMs) have revolutionized Natural Language Processing, including machine translation (MT), achieving unprecedented performance. However, this progress masks underlying asymmetries in training data and model architecture that impact multilingual translation quality. This paper introduces LingualX64, a novel dataset spanning 64 languages, designed to evaluate the extent to which these asymmetries affect LLM translation performance, particularly under zero-shot conditions. LingualX64 is constructed to minimize data overlap with existing LLM training corpora and to provide a balanced representation of diverse linguistic features, enabling a more robust assessment of cross-linguistic generalization. Our evaluation reveals significant performance disparities across languages, highlighting the impact of data scarcity and linguistic complexity on translation quality. These findings underscore the need for strategies to mitigate asymmetries in LLM training and model design to achieve more equitable and robust multilingual translation capabilities. LingualX64 provides a valuable benchmark for researchers and developers seeking to address these challenges and unlock the full potential of LLMs for global communication.

Funding

This research was funded by the Henan Science and Technology Research Project, Zhengzhou, China (242102211060).

Author information

Author notes
  1. These authors contributed equally: Jiayi Wang and Huidong Zhu.

Authors and Affiliations

  1. Zhengzhou University of Light Industry, Zhengzhou, 450000, China

    Yan Huang, Wei Liu, Jiayi Wang & Huidong Zhu

Corresponding author

Correspondence to Huidong Zhu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Language

See Table 4.

Table 4 We list the ISO code, language name, language, alphabet and resource level for each language (ref. 39).

Score

See Tables 5, 6, 7 and 8.

Table 5 BLEU scores on xx⇒en.
Table 6 BLEU scores on xx⇒zh.
Table 7 COMET scores on xx⇒en.
Table 8 COMET scores on xx⇒zh.
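
Per-language scores such as those in Tables 5 to 8 can be summarized in several ways to expose disparities. The sketch below is one possible illustration and is not the authors' analysis: it averages scores within each resource level and computes, for each source language, the difference between its xx⇒en and xx⇒zh scores. The language labels, the score values, and the function names `mean_by_level` and `target_gap` are all hypothetical placeholders and are not taken from the paper.

```python
from statistics import mean

# Placeholder data, NOT taken from the paper's tables.
# Each entry maps a made-up language label to
# (resource_level, score_to_en, score_to_zh).
EXAMPLE_SCORES = {
    "lang_a": ("high", 40.0, 30.0),
    "lang_b": ("high", 36.0, 28.0),
    "lang_c": ("low", 12.0, 9.0),
    "lang_d": ("low", 8.0, 7.0),
}


def mean_by_level(scores):
    """Average the xx=>en and xx=>zh scores within each resource level."""
    grouped = {}
    for level, to_en, to_zh in scores.values():
        grouped.setdefault(level, []).append((to_en, to_zh))
    return {
        level: (mean(p[0] for p in pairs), mean(p[1] for p in pairs))
        for level, pairs in grouped.items()
    }


def target_gap(scores):
    """Per-language difference between the xx=>en and xx=>zh scores."""
    return {label: to_en - to_zh for label, (_, to_en, to_zh) in scores.items()}


if __name__ == "__main__":
    print(mean_by_level(EXAMPLE_SCORES))
    print(target_gap(EXAMPLE_SCORES))
```

A large difference between the per-level means would point to a resource-related disparity, while a consistently positive or negative gap would point to an asymmetry between the two target languages. Note that raw differences in BLEU across different target languages are not directly comparable, so a summary of this kind is only a rough starting point.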

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Huang, Y., Liu, W., Wang, J. et al. LingualX64: a multilingual benchmark for evaluating symmetry and asymmetry in LLM translation. Sci Rep (2026). https://doi.org/10.1038/s41598-026-49738-y

  • Received: 10 January 2026

  • Accepted: 16 April 2026

  • Published: 26 April 2026

  • DOI: https://doi.org/10.1038/s41598-026-49738-y

Keywords

  • Machine translation
  • Large language model
  • Multilingual processing
  • Natural language processing