Humanities and Social Sciences Communications
Evaluating literary translation by large language models: a multidimensional quality assessment of Shen Congwen’s Border Town
  • Article
  • Open access
  • Published: 14 March 2026

  • Wei Yang1 &
  • Mingxing Yang1 

Humanities and Social Sciences Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Cultural and media studies
  • Language and linguistics
  • Literature

Abstract

Large language models (LLMs) have shown remarkable abilities in understanding and generating human language, and they are increasingly applied to translation. However, literary translation presents unique challenges, and the quality of literary translations generated by LLMs has yet to be explored and tested. This study therefore evaluates the quality of translations produced by several LLMs in comparison with a well-established human translation. The famous Chinese literary work Border Town by Shen Congwen was selected as the source text. ChatGPT 4, ChatGPT 4o, WXYY 4.0 Turbo, and Gemini were used to produce the machine translations, and Jeffrey Kinkley’s translation was chosen as the human translation for comparison. The study employs the Multidimensional Quality Metrics (MQM) framework, which evaluates translation quality through a detailed error typology, and focuses the error analysis on three key dimensions of translation quality: accuracy, fidelity, and cultural appropriateness. Five types of errors were identified: mistranslation, omission, over-translation, cultural mistranslation, and discourse-level errors. Mistranslation was the most frequent error type in all models. Omission occurred most often in Gemini, over-translation and cultural mistranslation occurred most often in ChatGPT 4, and discourse-level errors occurred most often in WXYY 4.0 Turbo. ChatGPT 4o appears to yield comparatively higher translation quality under the MQM framework. The research suggests that literary translation by LLMs requires more specific training and prompt strategies, together with human post-editing, to improve its accuracy, fidelity, and cultural appropriateness.
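The frequency comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes each annotated error is recorded as a (model, error type) pair; the function names `tally_errors` and `most_frequent_type`, the model labels, and the sample records are hypothetical placeholders and are not the study's annotation data or procedure.

```python
from collections import Counter

# The five error types named in the abstract.
ERROR_TYPES = (
    "mistranslation",
    "omission",
    "over-translation",
    "cultural mistranslation",
    "discourse-level error",
)


def tally_errors(annotations):
    """Count annotated errors per model and error type.

    `annotations` is an iterable of (model, error_type) pairs, one pair
    per annotated error. Returns {model: {error_type: count}} with every
    error type present, including those with a count of zero.
    """
    counts = {}
    for model, error_type in annotations:
        if error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {error_type!r}")
        counts.setdefault(model, Counter())[error_type] += 1
    return {
        model: {etype: counter[etype] for etype in ERROR_TYPES}
        for model, counter in counts.items()
    }


def most_frequent_type(table, model):
    """Return the error type with the highest count for one model.

    Ties are resolved in favour of the type listed first in ERROR_TYPES.
    """
    row = table[model]
    return max(ERROR_TYPES, key=lambda etype: row[etype])


# Placeholder records for illustration only; these are NOT the study's data.
sample = [
    ("model_a", "mistranslation"),
    ("model_a", "mistranslation"),
    ("model_a", "omission"),
    ("model_b", "cultural mistranslation"),
]
table = tally_errors(sample)
print(table["model_a"]["mistranslation"])
print(most_frequent_type(table, "model_a"))
```

Keeping zero counts in every row makes the per-model rows directly comparable, which is the form a cross-model frequency table would need.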

Data availability

The materials analyzed in this study include excerpts from Shen Congwen’s Border Town, the large language model–generated translations, and the corresponding MQM-based error annotation data. Due to copyright restrictions on the original literary text, the source text excerpts cannot be publicly shared. The LLM-generated outputs, annotated evaluation data, coding manual, and other related files are available from the corresponding author upon reasonable request.

Author information

Authors and Affiliations

  1. Moutai Institute, Zunyi, China

    Wei Yang & Mingxing Yang

Contributions

WY wrote the main manuscript text, MXY prepared the tables, and both authors conducted the analysis of the translation errors. All authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Mingxing Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This study analyzes literary texts and machine-generated translations and does not involve human participants, personal data, or animal subjects. Therefore, ethical approval was not required.

Informed consent

This article does not contain any studies with human participants performed by any of the authors; therefore, informed consent was not required.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yang, W., Yang, M. Evaluating literary translation by large language models: a multidimensional quality assessment of Shen Congwen’s Border Town. Humanit Soc Sci Commun (2026). https://doi.org/10.1057/s41599-026-06868-y

  • Received: 25 February 2025

  • Accepted: 24 February 2026

  • Published: 14 March 2026

  • DOI: https://doi.org/10.1057/s41599-026-06868-y

Humanities and Social Sciences Communications (Humanit Soc Sci Commun)

ISSN 2662-9992 (online)

© 2026 Springer Nature Limited