Research background

The digital transformation of English Academic Writing (EAW) has evolved through three distinct phases: the initial phase, characterised by tool-assisted writing with digital libraries and collaborative platforms (Schcolnik, 2018; Strobl et al., 2019); the intermediate phase, marked by automated correction tools centred on grammar-checking software (Strobl et al., 2019); and the current phase, dominated by Generative Artificial Intelligence (GenAI) (e.g. the GPT series), which focuses on semantic generation (Su et al., 2023). Generative AI encompasses a range of functionalities, from responding to queries to generating content (Dwivedi et al., 2023). It is capable of producing human-like text based on contextual cues (Van Dis et al., 2023) and demonstrates superior natural language understanding and faster responses than earlier AI systems (Zou and Huang, 2023). Additionally, it supports multilingual dialogue. However, it may undermine EAW practices by disseminating misinformation and encouraging plagiarism, thereby affecting the construction of authorial stance (Barrot, 2023).

In academic writing, stance refers to the author’s perspective in their discourse, encompassing their personal emotions, attitudes, and evaluations towards propositions (Conrad and Biber, 2000). As a core rhetorical strategy in academic discourse, its construction relies on the interplay of metadiscursive markers such as hedges and boosters (Hyland, 2005). Clear stance construction helps authors establish a coherent argumentative framework, making it easier for readers to grasp the core ideas and significance of the research, thereby providing a solid foundation for academic dialogue and further inquiry. While Generative AI can enhance writing efficiency and optimise expression, it also carries risks of stance ambiguity and academic dependency. These effects vary across different types of academic writing, particularly in doctoral dissertations, where the demands for academic depth and originality make such impacts especially noteworthy.

Doctoral students are often described as ‘advanced writers and apprentice scientists’ (Flowerdew and Li, 2007), yet this group is frequently overlooked within the writing community (Kessler, 2020). Existing research has primarily focused on areas such as Generative AI writing assessment (e.g. Guo and Wang, 2023; Zhang and Hyland, 2018), teachers’ use of Generative AI (e.g. Fleckenstein et al., 2024; Zhao, 2024), and Generative AI-generated feedback (e.g. Kasneci et al., 2023; Lund et al., 2023). However, there is limited understanding of how Generative AI influences the textual production of doctoral theses. Furthermore, existing studies (Nañola et al., 2025) suggest that AI-generated texts align more closely with expert writing, while student-authored texts resemble novice writing. This raises the question of how Generative AI impacts the writing of advanced academic writers such as doctoral students. This study aims to deepen the understanding of Generative AI-empowered EAW and explore its potential implications for writing outcomes. By analysing doctoral theses before and after the widespread adoption of Generative AI, it seeks to elucidate how Generative AI may influence the expression and construction of authorial stance, thereby assisting authors in using Generative AI effectively, ethically, and responsibly in EAW (UNESCO, 2023).

Literature review

Generative AI-empowered EAW

Generative AI encompasses a broad spectrum of applications, an early landmark being DALL·E, a text-to-image model released by OpenAI in January 2021 that demonstrated the ability to generate creative images from textual descriptions. However, it was not until the advent of ChatGPT in 2022 that Generative AI was formally applied to EAW, beginning to reshape the practical landscape of this field. The impact of Generative AI on academic writing has sparked a dualistic debate characterised by ‘technological empowerment versus cognitive risks’ (Derakhshan and Ghiasvand, 2024). Proponents of technological empowerment highlight the positive contributions of AI tools in refining language use (Yang and Li, 2024), enhancing the quality of argumentation (Su et al., 2023), providing objective feedback (Dikli, 2006; Wang et al., 2012), enriching student learning experiences (Ghaleb Barabad and Bilal Anwar, 2024), freeing up teachers’ time (Warschauer and Grimes, 2008), and simplifying knowledge management (Ranalli, 2018). Conversely, advocates of cognitive risk theory caution against potential issues such as the homogenisation of stance construction (Song and Song, 2023) and the weakening of critical thinking (Mizumoto et al., 2024), which may undermine authorial identity construction and academic integrity (Crosthwaite and Baisa, 2023; Yan, 2023). Nevertheless, it is undeniable that Generative AI has become an indispensable part of EAW. As Warschauer (2023, p. 5) stated, ‘even if we could ban it, we shouldn’t.’ Effective use of Generative AI as a supplementary tool in academic English writing requires a certain level of proficiency (Woo et al., 2023), particularly among advanced academic writers such as doctoral candidates.

Generative AI-empowered doctoral thesis writing

Even before the widespread adoption of Generative AI, discussions surrounding doctoral candidates’ EAW proficiency had already become a focal point in English education research (Caffarella and Barnett, 2000). The doctoral thesis, as a core vehicle for disciplinary socialisation (Paltridge and Starfield, 2019), serves a dual function in stance construction: its cognitive function reflects the researcher’s degree of commitment to knowledge claims (Hyland, 2005), while its social function constructs academic identity and secures recognition within the scholarly community (Swales, 1990). The integration of Generative AI is transforming this process, with AI engaging in activities such as brainstorming and literature review during the initial stages of EAW (Dwivedi et al., 2023; Kishore et al., 2023). Research indicates that doctoral candidates exhibit two collaborative modes with AI: ‘surface-level transplantation’ (replicating AI output) and ‘deep reconstruction’ (critical integration) (Nguyen et al., 2024). However, English as a Foreign Language (EFL) doctoral candidates, owing to language anxiety, are more prone to adopting passive writing strategies (e.g. Zou and Huang, 2024), which may undermine authorial stance construction (e.g. Song and Song, 2023). It is therefore necessary to evaluate the impact of Generative AI on doctoral candidates’ stance construction, which can be achieved through textual analysis (e.g. Berber Sardinha, 2024; Stapleton, 2001). However, existing research predominantly focuses on surface-level linguistic features (e.g. lexical complexity), leaving a gap in understanding the systemic impact on cognitive stance frameworks.

Stance research

Research on academic stance construction in applied linguistics has undergone three major shifts. Early studies (1980–2000) were grounded in Chafe’s (1986) theory of evidentiality and Ochs and Schieffelin’s (1989) theory of affect, examining markers of knowledge sources and emotional intensity in academic writing, respectively. Subsequently, Biber and Finegan (1989) introduced ‘stance’ as an integrative concept, which Hyland (2005) further developed into a four-dimensional analytical framework: hedges, boosters, attitude markers, and self-mention. Hedges refer to linguistic devices that express uncertainty about a proposition or the author’s own claims, such as probably, it suggests, and may. Boosters are used to convey certainty and assertiveness, such as show, demonstrate, and must. Attitude markers express the author’s emotional stance, directly and explicitly conveying their position and eliciting reader empathy, such as interesting, important, and agree. Self-mention involves the use of first-person pronouns to assert authority and ownership of ideas. This framework transcends disciplinary boundaries and has been widely applied in cross-disciplinary (Charles, 2007; Hyland, 2015), cross-linguistic (Lee and Casal, 2014), and proficiency-level (Crosthwaite et al., 2017; Wu and Paltridge, 2021) studies. However, with the rise of Generative AI, research on academic stance construction has entered the ‘technologically mediated era’ (2022–present), giving rise to the concept of ‘technologically-mediated stance’ (Ou et al., 2024). Researchers are now exploring how Generative AI is reshaping academic rhetorical strategies.

However, the current literature predominantly focuses on chapters such as abstracts (Alghazo et al., 2021) and discussions (Cheng and Unsworth, 2016), with insufficient attention paid to the introduction section, which carries the declaration of research originality. The stance construction in this chapter is directly linked to knowledge innovation (Chen et al., 2022). Existing studies are largely conducted in educational environments with relatively high English proficiency, overlooking the additional challenges faced by EFL doctoral students, such as language anxiety when using technology. Moreover, research has primarily focused on general EAW, with limited exploration of variations within subfields of applied linguistics. Stance marker analysis often relies on synchronic corpus methods, making it difficult to capture the dynamic impact of technological intervention. Additionally, there is a lack of theoretical explanation regarding how AI reconstructs cognitive stance frameworks, with insufficient mechanistic insights.

In summary, since the emergence of Generative AI in 2022, its impact on the stance construction in the introduction chapters of English doctoral theses by Chinese EFL students remains underexplored. Furthermore, other factors, such as thesis topics and methodological differences, may also influence stance construction and must be considered. Therefore, this study proposes the following research questions:

  1. Before technological intervention (2019–2021), are there systematic differences in stance construction between high AI-exposure and low AI-exposure subfields within applied linguistics?

  2. Does Generative AI lead to significant changes in stance construction in high AI-exposure subfields? Which linguistic features are most significantly affected by Generative AI?

  3. How do thesis topics and methodologies moderate the impact of Generative AI on stance construction?

Research methodology

Data collection

This study aims to explore the impact of Generative AI on stance construction in English doctoral theses by Chinese students, rather than comparing AI-generated texts with student-authored texts. The corpus is therefore divided into two periods: the technology introduction phase (2019–2021), when Generative AI was not yet mature and texts reflected traditional academic writing patterns, and the technology diffusion phase (2022–2024), when Generative AI was increasingly applied in EAW and technological penetration may have rapidly changed academic writing practices. Although each phase spans only three years, studies indicate that the adoption rate of Generative AI tools in higher education has grown exponentially (Jin et al., 2025).

To control for the influence of disciplinary backgrounds and trends in international journals on stance construction, this study refers to the Nature 2023 Discipline-Specific AI Adaptation Index. We select subfields within applied linguistics with high AI exposure (adaptation index ≥ 7, e.g. computational linguistics, corpus linguistics) and low AI exposure (adaptation index ≤ 3, e.g. sociolinguistics, language pedagogy). The corpus consists of doctoral theses in applied linguistics, collected between 2019 and 2024 from Chinese normal, comprehensive, science and engineering, and foreign language universities. A stratified random sampling method was used to construct the Corpus of Doctoral Theses, comprising 30 theses in computational linguistics and corpus linguistics (2019–2021), 30 in sociolinguistics and language pedagogy (2019–2021), 30 in computational linguistics and corpus linguistics (2022–2024), and 30 in sociolinguistics and language pedagogy (2022–2024), totalling 120 theses.

Analytical procedure

First, based on prior evidence that applied disciplines such as engineering, business, information and computing sciences, and biomedical and clinical sciences show higher levels of AI knowledge, usage, and ChatGPT-related activity than pure disciplines and the arts/humanities (e.g. Qu et al., 2024; Raman, 2023), theses were categorised into high- and low-AI exposure groups. The introduction sections of the theses were tokenised using the NLTK toolkit, followed by the removal of stop words from the corpus. The stop-word list was extended with common academic high-frequency terms (e.g. ‘however’, ‘therefore’), while care was taken to avoid excessive filtering of stance markers. Subsequently, in collaboration with a doctoral student in applied linguistics, core themes (e.g. ‘corpus linguistics’, ‘second language acquisition’) were extracted based on keywords. For instance, theses containing keywords such as ‘BERT’ or ‘neural networks’ were labelled ‘computational linguistics’, while those featuring terms like ‘classroom discourse’ or ‘teacher identity’ were categorised as ‘sociolinguistics’. The research methodology type (empirical, theoretical, mixed) was also annotated. Discrepancies, such as whether ‘may suggest’ should be classified as a hedge, were resolved through discussion, resulting in a final inter-annotator agreement of Kappa = 0.92, exceeding the 0.75 threshold recommended by Cohen (1960).
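The preprocessing step above can be sketched as follows. This is purely illustrative: a simple regular-expression tokeniser stands in for NLTK’s word_tokenize, and the abbreviated stop-word and stance-marker lists are hypothetical examples, not the lists used in the study.

```python
import re

# Hypothetical abbreviated lists: high-frequency academic connectives are
# added to the stop words, while stance markers (e.g. modal verbs) are
# explicitly protected from filtering.
STANCE_MARKERS = {"may", "might", "could", "must", "will", "suggest"}
STOP_WORDS = ({"the", "a", "an", "of", "to", "in", "is", "are",
               "however", "therefore"} - STANCE_MARKERS)

def preprocess(text):
    """Lowercase, tokenise on alphabetic strings, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("However, the model may therefore suggest an improvement."))
# ['model', 'may', 'suggest', 'improvement']
```

In practice NLTK’s tokeniser would replace the regular expression, but the design point is the same: the stop-word list must be curated so that hedges and boosters survive filtering.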

Subsequently, based on Hyland’s (2005) framework for stance analysis in academic writing, the corpus analysis software AntConc was used to search for stance markers. Given the context-dependent and multifunctional nature of stance (Hyland, 2005), each entry was manually reviewed to exclude irrelevant terms, and the collocations of stance markers were carefully examined. To ensure the reliability of manual screening, 30% of the cases were self-annotated and reviewed by an expert in applied linguistics. After resolving discrepancies and establishing annotation guidelines, the author independently annotated the entire corpus. A month later, the corpus was re-coded, yielding a Kappa coefficient of 0.96 (calculated using SPSS 26.0), indicating high intra-rater reliability. Finally, the frequency counts of each stance marker were normalised to per thousand words to eliminate the influence of text length.
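The two reliability steps described above (per-thousand-word normalisation and the Kappa check between coding rounds) can be illustrated with a minimal sketch; the annotation sequences below are hypothetical.

```python
from collections import Counter

def per_thousand(count, total_words):
    """Normalise a raw stance-marker count to a rate per 1,000 words."""
    return count * 1000 / total_words

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two parallel sequences of category labels."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[lab] / n * freq_b[lab] / n
                   for lab in set(coder_a) | set(coder_b))
    return (observed - expected) / (1 - expected)

print(per_thousand(42, 6000))  # 7.0
round1 = ["hedge", "booster", "hedge", "hedge", "booster"]
round2 = ["hedge", "booster", "hedge", "booster", "booster"]
print(round(cohens_kappa(round1, round2), 2))  # 0.62
```

In the study itself the coefficient was computed in SPSS 26.0; the hand calculation above simply makes the observed-versus-expected agreement logic behind the statistic explicit.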

For Q1, independent-samples t-tests were conducted using SPSS 26.0 to examine whether there were significant differences in the four stance features (hedges, boosters, attitude markers, and self-mention) between the high AI-exposure and low AI-exposure subfields during the pre-technological intervention period (2019–2021). To control for intra-disciplinary heterogeneity, regression models were employed, incorporating topic distribution (e.g. the weight of ‘corpus linguistics’ topics) and research methods (empirical, theoretical, mixed) as covariates. This ensures that any observed differences in stance construction are not confounded by variations in topic focus or methodological approach.
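The two-group baseline comparison can be sketched as follows; Student’s t with pooled variance is shown purely for illustration, and the hedge rates per 1,000 words are hypothetical values, not figures from the corpus.

```python
import math
from statistics import mean, variance

def two_sample_t(x, y):
    """Student's t statistic for two independent groups (pooled variance)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))

# Hypothetical hedge rates per 1,000 words, baseline period (2019-2021)
high_ai = [10.2, 11.5, 9.8, 10.9, 11.1]
low_ai  = [10.8, 11.9, 10.1, 11.4, 11.6]
print(round(two_sample_t(high_ai, low_ai), 2))  # -1.03
```

A statistic this small would, as in the reported results for overall hedges, fail to reject the null hypothesis of no baseline group difference.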

For Q2, the following regression model was estimated using SPSS 26.0: ΔStance construction = β₀ + β₁(High-AI Subfield) + β₂(Post2022) + β₃(High-AI × Post2022) + γX + ε. ΔStance construction represents the change in stance construction; β₀ is the intercept; β₁ captures the baseline difference between high and low AI-exposure subfields; β₂ accounts for the general temporal effect post-2022; β₃ is the core parameter, representing the net effect of Generative AI on high AI-exposure subfields; X is a matrix of control variables (e.g. topic, methodology); γ represents the coefficients of the control variables; and ε is the error term. If β₃ is statistically significant and the control variables (γ) are not, this indicates that the net effect of Generative AI on stance construction is independent of topic and methodological influences. If the control variables are significant, further analysis of interaction effects is required to disentangle the moderating role of these factors.
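Although the full model with covariates was estimated in SPSS, the logic of the interaction coefficient can be illustrated with a minimal sketch: in a plain 2×2 design without covariates, β₃ reduces to the change in the high-AI group minus the change in the low-AI group. The values below are hypothetical hedge rates per 1,000 words.

```python
from statistics import mean

def did_estimate(high_pre, high_post, low_pre, low_post):
    """2x2 difference-in-differences estimate (beta_3 without covariates)."""
    return (mean(high_post) - mean(high_pre)) - (mean(low_post) - mean(low_pre))

high_pre  = [10.0, 10.4, 9.6]   # high-AI subfields, 2019-2021
high_post = [8.6, 8.9, 8.3]     # high-AI subfields, 2022-2024
low_pre   = [10.1, 10.5, 9.7]   # low-AI subfields, 2019-2021
low_post  = [10.0, 10.3, 9.7]   # low-AI subfields, 2022-2024

print(round(did_estimate(high_pre, high_post, low_pre, low_post), 2))  # -1.3
```

A negative estimate of this kind corresponds to a post-2022 reduction in the marker that is concentrated in the high AI-exposure group rather than shared across both groups.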

For Q3, the study investigates whether Generative AI influences stance construction through topic-methodological alignment. By combining the distribution of thesis topics and methodologies, the analysis aims to reveal the mechanisms through which Generative AI affects stance construction in subfields with varying levels of AI exposure. Specifically, the study examines how topic selection (e.g. computational linguistics vs. sociolinguistics) moderates the impact of Generative AI on stance construction and whether methodological approaches (empirical, theoretical, mixed) influence the extent to which Generative AI reshapes stance strategies.

Finally, to complement the macro-level trends identified through quantitative analysis, this study selected representative theses from high- and low-AI exposure subfields for in-depth textual comparison, aiming to uncover the micro-level mechanisms through which Generative AI influences stance expression in academic writing.

Identification of linguistic features

This study focuses on the analysis of stance construction through four core linguistic features: hedges, boosters, attitude markers, and self-mentions. These features play a crucial role in conveying the author’s stance, attitude, and identity in academic writing. Below, the classification framework and identification criteria for each feature are elaborated.

Classification and identification of hedges and boosters

Existing literature categorises hedges and boosters primarily along two dimensions: function and form.

From a functional perspective, Hyland and Zou (2021) classify hedges into three categories: Downtoners, which weaken the strength of a statement (e.g. ‘largely,’ ‘fairly’); Rounders, which express approximation (e.g. ‘about,’ ‘around’); and Plausibility Hedges, which mark the plausibility of an assumption (e.g. ‘could,’ ‘probably’). They also categorise Boosters into three types: Intensity Boosters, which intensify the strength of a statement (e.g. ‘actually,’ ‘truly’); Extremity Boosters, which mark extremity (e.g. ‘most,’ ‘always’); and Certainty Boosters, which reflect the author’s certainty (e.g. ‘definitely,’ ‘evidently’).

Hu and Cao (2011) adopt a part-of-speech perspective to classify hedges into modal verbs (e.g. ‘might,’ ‘could’), cognitive verbs (e.g. ‘seem,’ ‘suggest’), cognitive adjectives or adverbs (e.g. ‘perhaps,’ ‘likely’), and other forms (e.g. ‘in general,’ ‘assumption’). Their classification of Boosters includes modal verbs (e.g. ‘must,’ ‘will’), cognitive verbs (e.g. ‘demonstrate,’ ‘find’), cognitive adjectives or adverbs (e.g. ‘actually,’ ‘clearly’), and other forms (e.g. ‘it is well known,’ ‘the fact that’). These classifications provide a comprehensive framework for understanding how linguistic devices can modulate the expression of certainty, approximation, and intensity in academic writing.
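As a simple illustration of how such a form-based taxonomy can be operationalised, the sketch below maps the example markers cited above into category lookups; the inventories are abbreviated and purely illustrative, as real marker lists are far larger.

```python
# Abbreviated, illustrative lookup tables following the part-of-speech
# classification of Hu and Cao (2011); real inventories are far larger.
HEDGES = {
    "modal verb": {"might", "could"},
    "cognitive verb": {"seem", "suggest"},
    "adjective/adverb": {"perhaps", "likely"},
}
BOOSTERS = {
    "modal verb": {"must", "will"},
    "cognitive verb": {"demonstrate", "find"},
    "adjective/adverb": {"actually", "clearly"},
}

def classify(token):
    """Return (marker type, subcategory) for a token, or None if unlisted."""
    token = token.lower()
    for taxonomy, label in ((HEDGES, "hedge"), (BOOSTERS, "booster")):
        for subcat, members in taxonomy.items():
            if token in members:
                return (label, subcat)
    return None

print(classify("might"))    # ('hedge', 'modal verb')
print(classify("clearly"))  # ('booster', 'adjective/adverb')
print(classify("table"))    # None
```

As the analytical procedure notes, such a lookup is only a first pass; because stance markers are context-dependent and multifunctional, every hit still requires manual review.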

Building on these two classification dimensions (function and form), this study further refines the categories from a semantic-functional perspective, focusing on the role and intent of linguistic units within sentences. This approach is designed to adapt to the dynamic analysis of stance construction empowered by Generative AI. The specific classifications are in Table 1 and Table 2.

Table 1 Semantic-functional classification of hedges.
Table 2 Semantic-functional classification of boosters.

Classification and identification of attitude markers

The classification of attitude markers is primarily based on Hyland’s (2005) metadiscourse theory and Martin and White’s (2005) Appraisal Theory, which focus on how authors express emotions, judgements, and evaluations through language. The specific classification is in Table 3.

Table 3 Classification of attitude markers.

Classification and identification of self-mentions

The classification of self-mentions is based on Hyland’s (2001) theory of authorial identity construction and Ivanic’s (1998) framework of academic identity, emphasising how authors construct academic authority and engagement through linguistic choices. The specific classification is in Table 4.

Table 4 Classification of self-mentions.

Results

Overall characteristics of doctoral students’ stance markers (RQ1)

The analysis of systematic differences in stance construction between high AI-exposure and low AI-exposure subfields in applied linguistics during the pre-technological intervention period (2019–2021) revealed the following (see Table 5).

Table 5 Characteristics of stance construction before technological intervention (2019–2021).

Hedges: no significant overall difference was observed (t = –1.07, p = 0.30). However, significant differences were found in the subcategories of possibility expression (t = –8.12, p < 0.001) and conditional expression (11.4 ± 0.4; t = 20.00, p < 0.001), indicating that the high AI group was more cautious in hypothetical statements and used logical conditional expressions more frequently.

Boosters (t = 0.29, p = 0.77), attitude markers (t = 0.20, p = 0.84), and self-mentions (t = 0.28, p = 0.78) showed no significant differences between the two groups, satisfying the parallel trends assumption and making them suitable for subsequent difference-in-differences (DiD) analysis.

However, independent-samples t-tests revealed that while there was no significant overall difference in hedges during the baseline period (p = 0.30), significant differences existed in the subcategories of possibility expression and conditional expression (p < 0.001), which could affect the reliability of subsequent DiD analysis. To address this, the study introduced baseline values as control variables in the DiD model to adjust for the initial differences in possibility and conditional expression between the high and low AI groups.

The impact of generative AI on stance construction and disciplinary heterogeneity (2022–2024) (RQ2)

After the diffusion of technology (2022–2024), the changes in stance construction in high AI-exposure subfields significantly differed from those in low AI-exposure subfields (see Table 6). After controlling for baseline differences, the DiD model results showed that Generative AI significantly reduced the use of hedges in the high AI group (β₃ = –1.20, p = 0.02), particularly in possibility expression (β₃ = –0.60, p = 0.05), with a marginal decline in speculative expression (β₃ = –0.35, p = 0.10). This indicates that Generative AI reduced caution in hypothetical statements.

Table 6 Net effects of generative AI on stance construction (DiD model results).

At the same time, the use of boosters in the high AI group showed a marginally significant increase (β₃ = +0.93, p = 0.08), with conclusive expression significantly increasing (β₃ = +0.78, p = 0.02), reflecting the strengthening effect of Generative AI on argumentative force. Additionally, attitude markers overall decreased (β₃ = –0.67, p = 0.02), but evaluative expression significantly increased (β₃ = +0.67, p = 0.02), suggesting that Generative AI tends to promote objective expression rather than emotional or stance-based expression.

Regarding self-mentions, the use of authorial identity markers in the high AI group significantly increased (β₃ = +0.62, p = 0.01), while changes in first-person singular and plural were not significant (p > 0.05). This suggests that Generative AI favours implicit identity construction (e.g. ‘this study’ instead of ‘I argue’).

Cross-group comparisons further reveal the heterogeneity of the high/low AI groups’ responses to the technology (see Table 7). The high AI group’s use of hedges decreased significantly by 14.8%, while the low AI group saw only a 1.0% decrease, yielding an inter-group difference of –13.8% (p < 0.01). In terms of boosters, the high AI group’s usage increased by 11.2%, marginally higher than the low AI group’s 2.5% increase (inter-group difference: +8.7%, p = 0.08). The use of attitude markers decreased by 10.8% in the high AI group, a significantly larger drop than the low AI group’s 1.7% decrease (inter-group difference: –9.1%, p = 0.02). However, there was no significant difference in the change of self-referential expressions between the high/low AI groups (inter-group difference: +6.3%, p = 0.34), although the high AI group’s author identity markers significantly increased (β₃ = +0.62, p = 0.01).

Table 7 Cross-group comparison of stance feature changes between high/low AI groups (2022–2024 vs. 2019–2021).

The moderating mechanisms of topics and methodology (RQ3)

The extended DiD model results (see Table 8) show that, for the core effect, Generative AI significantly reduced the use of hedges in the high AI group (β₃ = –1.75, p < 0.01), especially in theses related to corpus linguistics, where the reduction in hedges was more pronounced (interaction term β = –0.43, p < 0.05). This indicates that Generative AI tends to optimise the argumentative logic of technical topics. The moderating effect of empirical research methods on attitude markers did not reach significance (β = +0.17, p > 0.05), though the overall pattern suggests that Generative AI may promote objective expression through standardised writing processes.

Table 8 The impact of control variables on stance construction (extended DiD model).

Meanwhile, the impact of Generative AI on boosters did not reach a significant level (β₃ = +0.85, p > 0.05), but the increase in conclusive expressions (β₃ = +0.78, p = 0.02) was significant, indicating that Generative AI may enhance argumentative strength by reinforcing conclusive statements. For self-referential expressions, the impact of Generative AI did not reach a significant level (β₃ = +0.18, p > 0.05), but the increase in author identity markers was significant (β₃ = +0.62, p = 0.01), suggesting that Generative AI tends to build implicit author identities.

In summary, the impact of Generative AI on stance construction in high AI exposure subfields is significantly different from that of the low AI group. This is mainly reflected in the reduction of hedges, the increase in boosters and evaluative expressions, and the strengthening of author identity markers. These heterogeneous changes reflect the influence of Generative AI on the academic writing style of high AI groups, especially in optimising argumentative logic and objectivity.

Qualitative textual analysis

As described in the analytical procedure, representative theses from high- and low-AI exposure subfields were selected for in-depth textual comparison to uncover the micro-level mechanisms through which Generative AI influences stance expression. The following examples, drawn from the introduction sections of doctoral theses published in 2023 (AI diffusion period) and 2019 (pre-AI adoption period), illustrate specific patterns of linguistic evolution:

Example 1: Transformation of Stance Expression in a High-AI Exposure Subfield (Computational Linguistics)

2023 Thesis (AI Diffusion Period)

‘The proposed neural architecture clearly demonstrates superior performance in syntactic parsing tasks (F1 = 0.92), which conclusively validates our hypothesis. It is noteworthy that previous approaches (e.g. rule-based systems) failed to achieve comparable accuracy under the same experimental conditions.’

2019 Thesis (Pre-AI Adoption Period)

‘Our preliminary results suggest that the hybrid model may potentially improve parsing accuracy, though further validation is required. We tentatively argue that traditional rule-based systems might have limitations in handling complex syntactic variations.’

In the 2023 text, the frequency of boosters (e.g. ‘clearly demonstrates,’ ‘conclusively validates’) increased significantly by 15.6% compared to 2019, replacing hedges such as ‘may potentially’ and ‘tentatively argue.’ This shift reflects an emphasis on conclusiveness but may undermine the ‘rhetorical prudence’ emphasised by Hyland (2005). Simultaneously, a trend towards depersonalisation is more pronounced in the 2023 text. Unlike the frequent use of the first-person plural (e.g. ‘We argue’) to construct a collective authorial identity in the 2019 text, the 2023 text predominantly employs impersonal constructions (e.g. ‘It is noteworthy that…’) and possessive pronouns (e.g. ‘our hypothesis’) to diminish the researcher’s agency.

This stylistic transformation may be attributed to disciplinary motivations. In computational linguistics, the pursuit of technical efficacy encourages authors to leverage AI tools to enhance the certainty of conclusions, leading to a preference for boosters and depersonalised expressions. Furthermore, critical expressions are intensified in the 2023 text. Unlike the 2019 text, which employed mitigating strategies (e.g. ‘might have limitations’) to evaluate prior research, the 2023 text favours direct negation, such as ‘failed to achieve.’ While this approach conveys decisiveness, it may also overlook the respect and understanding traditionally accorded to previous studies, further reflecting the discipline’s prioritisation of technical efficiency and conclusive findings.

Example 2: Stability of Stance Expression in a Low-AI Exposure Subfield (Sociolinguistics)

2023 Thesis (AI Diffusion Period)

‘We cautiously propose that teacher identity construction in multilingual classrooms could be mediated by institutional power dynamics, as tentatively illustrated in our interview excerpts. This interpretation might not fully capture the complexity of situated interactions.’

2019 Thesis (Pre-AI Adoption Period)

‘Our analysis suggests that teacher agency is likely constrained by macro-level language policies, though alternative explanations exist. We acknowledge that the small sample size limits the generalisability of our findings.’

Both texts exhibit a high frequency of hedges, such as ‘cautiously propose,’ ‘might not fully capture,’ and ‘likely constrained.’ Even in the 2023 text, the use of such hedges decreased by only 2.1%, demonstrating that the cautious attitude of qualitative research in the face of uncertainty remains largely unchanged despite technological advancements. Simultaneously, the first-person plural (e.g. ‘We propose,’ ‘Our analysis’) continues to play a significant role in academic writing in the AI era, serving the dual function of constructing an academic persona and claiming responsibility for research findings. This expression has not shifted towards the ‘institutionalised discourse’ observed in some high-exposure fields, but instead continues to emphasise the researcher’s agency and accountability for the results.

In terms of critical expression, the 2023 text employs modal verbs (e.g. ‘could be’) and conditional adverbials (e.g. ‘as tentatively illustrated’) to achieve implicit critique. This approach aligns with the sociolinguistic tradition of valuing a ‘negotiated stance,’ which prioritises openness and flexibility in academic dialogue rather than directly negating or criticising prior research.

Example 3: Evolution of Co-occurrence Patterns of Stance Markers Across Subfields

2022 Computational Linguistics Thesis (AI Diffusion Period)

‘The experimental results definitively prove that transformer-based models significantly outperform previous architectures. This finding undoubtedly challenges the long-standing assumption that rule-based systems are irreplaceable in grammatical annotation tasks.’

2023 Sociolinguistics Thesis (AI Diffusion Period)

‘We tentatively interpret these discourse patterns as possibly reflecting covert language ideologies, while recognising that our coding framework may not exhaustively account for all contextual variables.’

In the high-AI exposure field of computational linguistics, boosters (e.g. ‘definitively prove,’ ‘undoubtedly challenges’) and attitude markers (e.g. ‘significantly outperform’) frequently co-occur. This linguistic pattern reflects a ‘techno-authoritative stance,’ emphasising the certainty and superiority of technological advancements, which aligns with the field’s prioritisation of technical efficacy and its high reliance on and active adoption of AI tools.

In contrast, the sociolinguistics text constructs a ‘reflective-negotiative stance’ through the use of hedges (e.g. ‘tentatively interpret,’ ‘possibly reflecting’) and self-mentions (e.g. ‘We tentatively interpret’). This stance underscores the cautiousness and context-sensitivity of the research, demonstrating the field’s resistance to the assimilation of AI tools. Sociolinguistics, with its focus on context, pragmatics, and researcher subjectivity, tends to favour hedges and self-mentions to express the researcher’s reflexivity and cautious attitude towards research findings.

Discussion

This study employed a DiD design to investigate the complex effects of GenAI on the evolution of stance markers in applied linguistics doctoral dissertations. Findings reveal that GenAI’s influence manifests as a multi-layered process of deconstruction and reconstruction, moderated by disciplinary technocultural contexts. Our findings not only corroborate Generative AI’s role in enhancing students’ communicative performance and individual language development (Ou et al., 2024) but also extend this understanding by offering a more nuanced perspective on how AI reshapes the academic rhetorical ecosystem.

Our core finding—a significant 14.8% overall decline in ambiguous qualifiers, primarily driven by reductions in expressions of possibility (β3 = –0.60) and conjecture (β3 = –0.35)—validates Hyland’s (2024) contention that AI technologies are fundamentally restructuring academic discourse. This decline indicates that AI’s semantic prediction and logical optimisation capabilities are steering writing towards greater certainty, thereby diminishing expressions of cognitive uncertainty that have long served as hallmarks of cautious academic rhetoric. This finding resonates with observations in broader scientific literature (e.g. Qu et al., 2024) reporting the rapid adoption of AI tools in data-intensive fields.
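The logic of the DiD comparison underlying these coefficients can be sketched in a few lines. The figures below are purely illustrative, not the study’s data: ‘treated’ stands for a high-AI-exposure subfield, ‘control’ for a low-exposure one, and the values are hypothetical mean hedge frequencies per 10,000 words before and after GenAI diffusion.

```python
# Minimal difference-in-differences (DiD) sketch with hypothetical
# hedge-frequency means per 10,000 words. "pre" = before GenAI
# diffusion (through 2022), "post" = after (2023-24).

def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """DiD estimator: the change in the treated (high-exposure) group
    minus the change in the control (low-exposure) group."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical means (not the study's actual data):
effect = did_estimate(treat_pre=52.0, treat_post=41.0,
                      ctrl_pre=50.0, ctrl_post=49.0)
print(effect)  # -10.0: hedging fell ~10 points more in the exposed group
```

In regression form, this same quantity is the coefficient on the group × period interaction term, which is what the β3 values reported above estimate.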

Conversely, at the reconfiguration level, the marked increase in conclusive expressions within augmented utterances (β = +0.78, p = 0.02) and the systematic rise in author identity markers (β = +0.62, p = 0.01) reveal a new rhetorical equilibrium. This forms an intriguing dialogue with Schenck’s (2024) finding that AI tools may diminish individual rhetorical awareness. Our research further indicates that AI does not merely diminish author presence but prompts a shift in identity construction from personalised expressions (e.g. ‘I believe’) towards institutionalised ones (e.g. ‘this study contends’). This transition aligns with the long-term trend of ‘depersonalisation’ in academic style observed by Biber and Gray (2016), suggesting GenAI may be accelerating this process.

Moreover, a significant contribution of this study lies in providing micro-linguistic evidence of disciplinary differentiation, offering robust empirical support for Farber’s (2025) recently proposed theory of the ‘Disciplinary AI Adaptation Index’. Data reveal that the reduction in ambiguous qualifiers is particularly pronounced in highly AI-adapted subfields such as corpus linguistics (interaction term β = –0.43, p < 0.05), aligning with this subfield’s data-driven nature and pursuit of explicit conclusions. Conversely, in subfields dominated by qualitative methods such as empirical research, the change in attitude markers is negligible (β = +0.17, p > 0.05). This corroborates Curry and Lillis’ (2010) assertion that humanities-oriented research relies heavily on researchers’ subjective expression and exhibits relative resistance to technological change. These intra-disciplinary variations indicate that Generative AI’s influence is far from homogeneous, instead being profoundly intertwined with each subfield’s research paradigms and cultural traditions. However, other studies suggest that Generative AI’s stance expression in philosophy fails to conform to human writing conventions (Mo and Crosthwaite, 2025), a finding warranting future investigation.

Finally, this study reveals a cognitive shift: when processing texts, Generative AI tends to preserve the rigour of conditional logical connections while substantially reducing uncertainty in expressing cognitive stances (e.g. expressions of possibility). This indirectly corroborates Crosthwaite’s (2025) conclusion that Generative AI employs a narrower, more repetitive range of stance and interactional features. Similarly, Generative AI alters academic persuasion patterns by amplifying objective, conclusive arguments while diminishing subjective attitude markers. This finding provides cross-disciplinary evidence for Darwin’s (2025) argument regarding the critical importance of digital literacy in academic English writing. It underscores that in the AI era, scholars and students alike must develop a new competency: critically evaluating and utilising AI-generated content. This enables harnessing its efficiency advantages while safeguarding the nuance and critical thinking essential to scholarly argumentation.

Conclusions

Employing a DiD design, this study reveals Generative AI’s profound impact on stance expression in academic writing within applied linguistics. Key findings are as follows: Firstly, Generative AI is not a neutral tool; it catalyses a transformation in academic rhetorical practice, manifested through reduced use of qualifying language (particularly uncertainty markers) and increased use of intensifiers (especially conclusive expressions), resulting in a more direct and assertive writing style. Secondly, AI is reshaping authorial identity construction, driving a shift from personalised expressions of the ‘author’s voice’ towards institutionalised expressions of the ‘text’s voice’. Finally, its impact exhibits significant variation across disciplinary subfields, confirming the pivotal role of disciplinary culture and technological adaptability in moderating AI’s influence.

The present study’s sample focuses on applied linguistics, which enables an in-depth account of this discipline’s characteristics but limits the generalisability of the conclusions to other fields (such as natural language processing or second language acquisition). Future research could extend to disciplines with high AI exposure (e.g. computer science) and those with low exposure (e.g. philosophy) to validate and contrast these findings. Moreover, the current data cover only the initial phase of technological diffusion (2022–2024). While providing a valuable baseline for ‘early responses to technological disruption,’ they may not fully capture the long-term evolutionary impact of this technology. Longitudinal tracking studies are therefore crucial.

In summary, this research demonstrates that the introduction of Generative AI is reshaping the landscape of academic expression. It brings not merely efficiency gains, but challenges to established academic discourses, authorship conventions, and disciplinary methodological diversity. The scholarly community must respond to this transformation with both prudence and proactivity. By innovating teaching practices, reinforcing academic ethics, and cultivating critical digital literacy, we can ensure that technology serves the construction of scholarly knowledge without undermining the critical, creative, and humanistic core upon which it depends.