Humanities and Social Sciences Communications
Bridging human judgment and AI precision: a step toward intercultural competence in text refinement
  • Article
  • Open access
  • Published: 03 February 2026

  • Yicheng Sun1,
  • Hanbo Yang1,
  • Yi Wang1 &
  • …
  • Richard Suen2 

Humanities and Social Sciences Communications, Article number: (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Language and linguistics
  • Science, technology and society

Abstract

Human writing exhibits a wide range of styles and levels of sophistication, yet automated text generation systems often struggle to produce nuanced, culturally sensitive prose. Refining text in ways that respect diverse cultural contexts therefore requires balancing AI-driven generation with human judgment. This study addresses the challenges of text refinement, a task complicated by the one-to-many relationship between inputs and outputs in natural language generation, which makes consistent annotation difficult. We propose a semi-automatic data construction method that combines the strengths of AI and human judgment to generate more elegant expressions while preserving the original semantics and cultural relevance of the input sentences. The method first uses back translation to convert elegant expressions into more neutral ones and then applies an iterative quality control process, in which data filtering and human judgment ensure that the automatically generated text adheres to cultural norms and quality standards. Because each iteration requires only minimal human effort, the approach substantially reduces the annotation workload while producing a large-scale, high-quality dataset for text refinement. Ultimately, the method contributes to the development of more culturally aware AI systems that support ethical and effective intercultural communication in an age of globalization.
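The pipeline outlined in the abstract (back translation, automatic filtering, and a small amount of human review per iteration) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`token_overlap`, `build_pairs`, `iterative_filter`), the token-overlap score used as a stand-in for a semantic check, the threshold-raising rule, and all parameter values are assumptions made for illustration only. The back-translation step is passed in as a callable because the abstract does not specify the translation system used.

```python
import random
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (neutral source, elegant target)


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased tokens; a crude, assumed stand-in
    for whatever semantic-preservation check a real pipeline would use."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def build_pairs(elegant: List[str],
                back_translate: Callable[[str], str]) -> List[Pair]:
    """Pair each elegant sentence with its back-translated, more neutral form."""
    return [(back_translate(s), s) for s in elegant]


def iterative_filter(pairs: List[Pair],
                     human_accepts: Callable[[Pair], bool],
                     threshold: float = 0.2,
                     sample_size: int = 2,
                     target_rate: float = 0.9,
                     step: float = 0.1,
                     max_rounds: int = 5,
                     seed: int = 0) -> Tuple[List[Pair], float]:
    """Hypothetical quality-control loop: filter automatically, let a human
    judge a small sample, and tighten the filter until the sample passes."""
    rng = random.Random(seed)
    kept = list(pairs)
    for _ in range(max_rounds):
        # Automatic filter: drop unchanged pairs and pairs that drift too far.
        kept = [p for p in kept
                if p[0] != p[1] and token_overlap(p[0], p[1]) >= threshold]
        if not kept:
            break
        # Human judgment on a small random sample keeps annotation cost low.
        sample = rng.sample(kept, min(sample_size, len(kept)))
        rate = sum(human_accepts(p) for p in sample) / len(sample)
        if rate >= target_rate:
            break
        threshold += step  # sample failed review: filter more strictly
    return kept, threshold


if __name__ == "__main__":
    # Toy lookup table standing in for a real back-translation system.
    toy = {
        "the crimson sun sank beneath the silent hills":
            "the red sun went down behind the quiet hills",
        "time flies like an arrow": "bananas are yellow",
    }
    pairs = build_pairs(list(toy), lambda s: toy[s])
    kept, final_threshold = iterative_filter(pairs, lambda p: True)
    print(kept, final_threshold)
```

In this sketch the human reviewer sees at most `sample_size` pairs per round, which mirrors the abstract's point that each iteration involves only minimal human effort. How the authors actually score, sample, and filter pairs is not stated in the abstract.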

Data availability

Data will be made available upon reasonable request. Access requests should be directed to Dr. Yicheng Sun, subject to ethical approval and data protection requirements. No datasets were generated or analyzed during the current study.

Author information

Authors and Affiliations

  1. Xi’an University of Technology, Xi’an, China

    Yicheng Sun, Hanbo Yang & Yi Wang

  2. Shenzhen MSU-BIT University, Shenzhen, China

    Richard Suen

Contributions

YS conceptualized the study and led the data collection process. YS and HY conducted the experiments and contributed to the data analysis. YS wrote the main manuscript text. YW and RS prepared all figures. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Yicheng Sun or Richard Suen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This study was reviewed and approved by the Institutional Review Board (IRB) of Shenzhen MSU-BIT University, which serves as the official ethics approval body for research involving human participants. Ethical approval was formally granted on 15 March 2024, under approval number 2024-03-017. All research procedures involving human participants were conducted in strict accordance with the ethical standards of the approving institution and relevant national research committees, as well as with the principles of the Declaration of Helsinki (1964) and its later amendments. The ethics review covered the use of human judgment in data quality assessment, expert evaluation of text-refinement outputs, and the authorized use of anonymized textual materials obtained from institutional repositories and licensed digital collections.

Informed consent

Informed consent was obtained from all human participants involved in this study prior to their participation. Consent procedures were conducted between April 2024 and July 2024, corresponding to the period of human judgment, expert evaluation, and validation activities reported in this paper. All participants were fully informed about the purpose of the research, the nature of the evaluation tasks, the voluntary nature of participation, and their right to withdraw at any time without penalty. Written informed consent was obtained before data collection commenced.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Sun, Y., Yang, H., Wang, Y. et al. Bridging human judgment and AI precision: a step toward intercultural competence in text refinement. Humanit Soc Sci Commun (2026). https://doi.org/10.1057/s41599-026-06593-6

  • Received: 13 February 2025

  • Accepted: 22 January 2026

  • Published: 03 February 2026

  • DOI: https://doi.org/10.1057/s41599-026-06593-6

Associated content

Collection

Culture in code: intercultural competence in the age of AI and globalisation

Humanities and Social Sciences Communications (Humanit Soc Sci Commun)

ISSN 2662-9992 (online)

© 2026 Springer Nature Limited