A corpus-based study of passive voice trajectories in methods sections across three academic branches (1980–2020)

Le, Rurong; Yu, Sheng; Hao, Mingxing

doi:10.1057/s41599-025-06007-z

Article
Open access
Published: 21 November 2025

Rurong Le¹,
Sheng Yu² &
Mingxing Hao²

Humanities and Social Sciences Communications volume 12, Article number: 1805 (2025)

Abstract

Previous studies have noted a general decline in the use of the passive voice (PV) in academic research articles across disciplines. Few, however, have focused specifically on the PV-laden Methods or Methodology sections, particularly across the three main academic branches, let alone identified which functional categories of PV contribute to this trend. This study therefore investigates the evolution of PV use in academic writing across the three branches—humanities, social sciences, and natural sciences—from 1980 to 2020. We analyzed the PV in the Method(s)/methodology sections of 450 SSCI/SCI-indexed research articles using Spearman correlations and multivariate regression analysis. We also functionally classified PV into five categories: procedural, reporting, relational, causative, and evaluative. Results revealed significant disciplinary divergence: social sciences exhibited the steepest PV decline, driven by reductions in procedural and causative passives, while humanities showed strong but non-significant decreases, primarily linked to causative passives. Natural sciences demonstrated post-2000 stabilization, with relational passives balancing stylistic shifts and methodological needs. These findings underscored discipline-specific recalibrations: humanities and social sciences aligned with Plain English norms favoring active voice, while natural sciences retained relational passives for cross-study comparability. Accordingly, we proposed three strategies: (1) adhering to journal-specific style guides, (2) implementing discipline-sensitive pedagogy, (3) using authentic case studies to illustrate PV/AV trade-offs.

Introduction

The passive voice (PV) has been a staple of academic writing for years, especially in scientific discourse, where it has long been favoured for its air of objectivity and impersonality (Cooray, 1967; Banks, 2017), and sometimes, for cohesion (Baratta, 2009; Leong, 2020). Recent decades, however, have seen a significant shift in writing conventions, with a growing preference for the active voice (AV).

One major sociocultural reason for this evolving landscape is a reaction against the more detached, abstract, and alienating style characteristic of academic writing (Farrelly & Seoane, 2012; Halliday & Martin, 1993; Seoane, 2013), exemplified by the Plain English Campaign. Thus, style guides (e.g. DeRespinis et al., 2012; American Psychological Association [APA], 2020; Dastjerdi et al., 2021; Fauziah & Bashtomi, 2024) and writing textbooks (e.g. Perelman et al., 1998; Schultz, 2009; Schuster et al., 2014) have increasingly advocated for the AV, some even recommending that writers expunge all uses of the PV. A second key factor is the intense competition within the scientific community. In the era of the ‘informational explosion’ (Biber & Clark, 2002: 63–64), the increasing pressure to communicate efficiently and clearly has forced writers to adapt, prioritizing clarity, brevity, and reader engagement (Penrose & Katz, 2010). Indeed, research has shown that texts in the PV increase readers’ perceived temporal, hypothetical, spatial (Praminatih et al., 2018), and psychological distance (Chan & Maglio, 2020; Sepehri et al., 2023) from activities described in the text. For the sake of reader engagement, even grammar checkers such as Grammarly (Abu Qub’a et al., 2024) now flag passive constructions, recommending active alternatives.

This trend is not merely anecdotal; corpus-based studies have documented a general decline in passive constructions across various registers, including academic prose (Mair & Leech, 2006; Leech et al., 2009; Banks, 2017; Wheeler et al., 2021; Li, 2022), though some of these (e.g. Mair & Leech, 2006; Leech et al., 2009; Hyland & Jiang, 2017) only include the academic passive as a small component of their syntactic change. Indeed, corpus researchers such as Seoane and her collaborators (Seoane & Loureiro-Porto, 2005; Seoane, 2006; Seoane & Williams, 2006) began to give special attention to the PV, recording a marked decrease in passive usage in scientific texts. More recently, Wheeler et al. (2021), in a corpus-based study of academic psychology abstracts from 1970 to 2016, found a dramatic 92% surge in the use of personal pronouns, providing strong evidence of a shift away from the traditional PV dominance in academic writing. Using a corpus of 2,707 research article abstracts in applied linguistics from 1990 to 2019, Li (2022) found a declining trend in PV usage, especially in co-authored ones. These findings suggest a shift towards a more informative, efficient, and reader-friendly academic writing style—a broader movement from the traditional image of distance and impersonality, toward a more visible authorial presence, often signaled by the use of self-promotional first-person pronouns (Hyland, 2001, 2002; Harwood, 2005). Despite these insights, several critical limitations remain in the literature.

First, many studies have limited sample sizes, which constrain the generalizability of their findings. For example, Tarone et al.’s (1998) research, while trailblazing, analyzed only two research articles published in The Astrophysical Journal. Banks’ (2017) analysis of multi-authored research articles, while insightful, was based on only 32 articles, raising questions about its broader applicability. Dumin (2010) sampled 15 articles from The American Journal of Botany, five for each of three time periods: 1914–1918, 1962–1966, and 2004–2008.

Second, a significant portion of the multi-disciplinary research on PV is synchronic. Examples include Leong’s (2014) examination of the PV in 60 research articles from six science journals, Hundt et al.’s (2021) study of PV choice across different varieties of academic English, and Seoane & Hundt’s (2017) association of PV with authorial presence across disciplinary areas. While these synchronic, multi-disciplinary investigations provide valuable insights, they fail to capture the dynamic evolution of writing conventions over time.

Third, even among diachronic studies, most focus on a single discipline or a narrow range of fields (Leong, 2020; Wheeler et al., 2021; Li, 2022), except Hyland & Jiang’s (2017) diachronic, multi-disciplinary research on ten linguistic features commonly associated with informality.

Fourth, the majority of existing medium-to-large-sized corpus research has focused on full-length articles (e.g. Banks, 2017; Hyland & Jiang, 2017; Millán, 2010; Millar et al., 2013; Leong, 2020) or used a wide spectrum of texts (Hundt et al., 2016; Hundt et al., 2021), with only a handful of studies, such as Li’s (2022), examining a particular section of research articles. This is particularly problematic given that the Method(s)/Methodology section remains understudied. The gap is significant for two reasons. First, this section has been shown to contain a higher frequency of passive constructions than other sections (Millar et al., 2013; Leong, 2014; Banks, 2017; Fauziah & Bashtomi, 2024), and these constructions are prevalent at the sentence, clause, and phrase levels (Djuwari et al., 2022). Second, the Method(s)/Methodology section is often where the passive voice is most functionally appropriate, emphasizing processes over agents.

Moreover, while prior research has provided theoretical (Quirk et al., 1985) and qualitative function-based (Lim, 2017) categorizations of passives, no study to date has quantitatively investigated which functional subtypes of passive constructions significantly drive changes in PV usage over time. Thus, a functionally categorized, corpus-driven quantitative approach may provide statistical evidence to identify which passive types significantly drive PV trends, thereby offering empirically grounded guidelines for academic writing instruction.

Taken together, the existing literature has captured a trend toward a more personal, informal academic writing style, though it has failed to address the overall landscape of passive/active use and its dominant passive subtypes in high-quality academic writing, particularly the Methods/Methodology section. Building on this gap, this corpus-based study undertakes a diachronic analysis of PV usage in the Method(s)/Methodology sections of SSCI/SCI-indexed journal articles spanning the natural sciences, the social sciences, and the humanities—the three major branches of modern academia (Becher & Trowler, 2001).

More specifically, this study focuses on five benchmark years—1980, 1990, 2000, 2010, and 2020—spanning four decades for two main reasons. First, decade-long intervals are a well-established convention in corpus linguistics, striking a balance between capturing long-term trends and preserving analytical clarity (McEnery & Wilson, 2001). Second, beginning in 1980—immediately after the Plain English Campaign’s institutional launch in 1979—provides a clear temporal marker for detecting any subsequent stylistic shifts. This initial post-Campaign period also saw its influence in countries such as New Zealand (Campbell, 1999), the US (The Securities and Exchange Commission, 1998), Canada (Supply and Services Canada, 1994), and within international bodies like the UN (1984) and the European Commission (Grasso, 2018). In academic contexts, some institutional style guides (e.g., University of Exeter, n.d.; BetterEvaluation, 2012) explicitly adopt the Campaign’s definition of plain English—clear, concise, and reader-focused language. As these principles gained institutional endorsement, academic style manuals increasingly promoted features such as active voice and first-person pronouns while discouraging passive constructions. Over time, such stylistic recommendations began to permeate scholarly writing, contributing to a broader shift away from impersonal, passive formulations (Kimble, 2012; Cutts, 2020).

Overall, by examining the Methods/Methodology section of SSCI/SCI-indexed journals, this study aims to provide a diachronic, cross-disciplinary perspective on how PV usage has evolved in academic writing over the past four decades. This study aims to answer the following research questions (RQs).

RQ 1: What are the trends of PV in high-quality academic Methods/Methodology sections in the natural sciences, the social sciences, and the humanities since 1980?

RQ 2: If there are changes in voice preference, are they statistically significant in the three academic branches?

RQ 3: Which type(s) of passive contribute(s) to PV change in the three academic disciplines?

Methodology

This section includes four parts. The first part is the process of corpus compilation and the second, our coding of passive structures in the corpora. The last two parts describe how we classified the passive structures and the statistical tests we conducted.

The corpora

This part details how we compiled our corpora. The process includes the data source and our selection criteria. We also report the difficulties of data selection and the compromises we made.

Corpus design and text selection criteria

Our corpora were compiled from the Method(s)/Methodology section of research articles selected from international peer-reviewed journals that meet the following requirements.

First, the journals were selected based on impact factor rankings in Web of Science citation reports, prioritizing high-quality publications that make significant contributions to academic discourse. We also considered the diversity of publishers to include a range of writing style guidelines.

Second, journals needed sufficient historical presence to support our diachronic analysis. We sampled research articles at decade intervals from 1980 to 2020 (1980, 1990, 2000, 2010, and 2020). The reason for choosing 1980 as the starting year was to examine the influence of the Plain English Campaign in the late 1970s, which advocated the AV over the passive. As a result, all selected journals were established before 1980 (see Appendix A for the complete journal list).

Third, each research article must contain an independent section for methods or methodology. We focus on the Method(s)/Methodology section for two reasons. First, it receives relatively less attention than such sections as the abstract, introduction, and discussion. Second, it can be seen as ideal for studying passive structures because it is the venue where researchers highlight the instruments, processes, samples, and subjects, but the self-clear or less important agent(s) may be obscured or omitted.

Applying the above three criteria, we sampled the research articles starting from the first paper of the first issue of that year. We collected five research articles per journal in each period, resulting in 450 research articles whose combined Method(s)/Methodology sections comprised 528,830 tokens in 25,989 sentences.

During the sampling, we made several compromises to collect the data we needed. First, many early research articles, especially those published in 1980 and 1990, lack a clear or independent Method(s)/Methodology section, so we resorted to the 1981/1991 and 1982/1992 issues when those from 1980/1990 did not provide enough research articles meeting our criteria. In these early research articles, an independent section titled “The data” was also treated as the Method(s)/Methodology section, since the description of the data accounted for an important part of the methodology. Second, we succeeded in accessing only two history journals whose research articles describe the methodology (instead of three, as in other disciplines), so we selected 7–8 research articles from each history journal to compensate. Third, we included some lower-ranked journals because the top-ranked ones were founded too recently to cover the full study period.

Corpus compilation

Most articles were downloaded in PDF format and converted to editable Word documents. From the Word documents, we copied the Method(s)/Methodology section to three new documents named “HU” (for the humanities), NS (for the natural sciences), and SS (for the social sciences) for the five years selected. The Method(s)/Methodology sections of open-access research articles, however, were directly copied from their accessible webpages.

However, not all parts of the Method(s)/Methodology sections were included in the corpora. For some humanities and social science research articles, we deleted whole-paragraph citations and the narration from the subjects because they were not the authors’ own words.

After completing all the work, we converted the Word documents into plain-text files compatible with corpus analysis tools. Following this, AntConc 4.2.0 (Anthony, 2022) was utilized to quantify PV across the corpus. As summarized in Table 1, the final dataset comprised 15 disciplinary-temporal sub-corpora.

Table 1 The volume of each sub-corpus (number of tokens/number of sentences).

Extracting PV structures

Subsequent to corpus compilation, the passives were manually annotated, and then coded mainly through TreeTagger for part-of-speech tagging (Schmid, 1994) and AntConc 4.2.0 for PV extraction (Anthony, 2022). Following standard syntactic definitions (Biber et al., 1999; Quirk et al., 1985), we operationalized PV as be-passive constructions—clauses where the grammatical subject receives the action expressed by a verb phrase combining a form of be (am/are/is/was/were/be/been/being) with a past participle. Two categories were therefore excluded: (1) get-passives (e.g., the sample got tested), which are statistically marginal in academic writing (Biber et al., 1999: 926); (2) passive post-modifiers, as they function adjectivally rather than as core predicates.

Manually identifying all sub-corpora

Through manual annotation, we aimed to identify all possible word combinations that contribute to be-passives. Details are shown in Table 2.

Table 2 Structures of be-passives.

Table 2 shows all the possible structures that formed passives in this research, with examples cited from the corpora. The first one is the basic structure of passives, and all the others are variations where some words are inserted between the copular be and the participle -ed. With these structures, we hope to exhaust the passives needed for the research.

Extracting passives with corpus tools

Extracting the passives depends on the tagging of parts of speech (POS) of all the words in the corpora. TreeTagger accomplished this task automatically. After all the POS were tagged, we used Regex (regular expressions) to extract the passives.

According to the structures listed in Table 2, we used the following regular expressions in AntConc (Table 3).

Table 3 The regular expressions for passives in this research.

Table 3 shows all the regular expressions used to extract the passive structures. Each regular expression in the first column was combined with each in the second and with the one in the last. In this way, we extracted different passive structures. For instance, the combination [A-z]+_VBD [A-z]+_RB [A-z]+_VVN yielded the following results (see Fig. 1).

However, the above regular expressions would result in some unwanted outcomes. When [A-z]+_DT was inserted in between, we could get “As this is an ordered outcome,…” (SS-2010). Although the word ordered was tagged as a past participle, it was used as a pre-modifier and was, therefore, deleted. When [A-z]+_VBG in the first column was used, we got some overlapped results, which we also deleted.
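The extraction step and its false positives can be illustrated with a minimal Python sketch. It assumes the tagged corpus has been flattened into space-separated word_TAG tokens and uses TreeTagger's English tags (VBD/VBZ for forms of be, RB for adverbs, DT for determiners, VVN for past participles). The sample sentence, the variable names, and the pairing of _VBZ with _DT are illustrative assumptions, since the full contents of Table 3 are not reproduced here; the first pattern is the one quoted above.

```python
import re

# Hypothetical TreeTagger-style output, flattened to word_TAG tokens.
tagged = (
    "The_DT samples_NNS were_VBD then_RB coded_VVN by_IN two_CD raters_NNS ._SENT "
    "As_IN this_DT is_VBZ an_DT ordered_VVN outcome_NN ._SENT"
)

# Pattern quoted in the text: past-tense "be" + adverb + past participle.
BE_ADV_PARTICIPLE = re.compile(r"[A-z]+_VBD [A-z]+_RB [A-z]+_VVN")

# Assumed variant with a determiner in the middle slot; the text reports
# that this kind of combination also returns pre-modifier participles.
BE_DET_PARTICIPLE = re.compile(r"[A-z]+_VBZ [A-z]+_DT [A-z]+_VVN")

passive_hits = BE_ADV_PARTICIPLE.findall(tagged)
false_hits = BE_DET_PARTICIPLE.findall(tagged)

print(passive_hits)  # genuine be-passive candidates
print(false_hits)    # candidates that need manual deletion
```

The second list shows why a manual pass is still needed: the pattern cannot tell a participle used as a pre-modifier ("an ordered outcome") from one that heads a passive verb phrase. Note also that the character class [A-z] matches a few punctuation characters that sit between Z and a in ASCII, including the underscore, so it is slightly broader than [A-Za-z].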

Manual validation

While corpus tools provided precise extraction of passive constructions, manual validation was necessary to exclude ambiguous cases, particularly formulaic prepositional passives (Schwarz, 2019). To ensure objectivity, both authors independently reviewed all identified instances, achieving near-perfect inter-rater reliability (Cohen’s κ = 0.94). Discrepancies in classification (e.g., whether be composed of denotes an action or a state) were resolved through iterative discussions, supplemented by consultations with dictionaries and native English speakers.

This rigorous process led to the exclusion of seven high-frequency prepositional passives: be based on, be concerned with, be related to, be composed of, be correlated with, be associated with, and be made up of. Two criteria justified their removal:

1.
Functional mismatch: These constructions predominantly describe inherent states or material relationships (e.g., the sample was composed of) rather than deliberate methodological actions by authors (e.g., the data were analyzed).
2.
Formulaic bias: Their conventionalized usage in academic writing (e.g., the framework is based on) risks artificially inflating passive voice frequencies, thereby distorting diachronic comparisons between PV trends.

Classifying passive verbs

The above steps ensured that we obtained the intended PV data, which we then categorized into five types—procedural, reporting, relational, evaluative, and causative verbs—based primarily on Systemic Functional Grammar (SFG) (Halliday, 1994) and on SFG-informed local grammar (Hunston & Sinclair, 2000). Procedural verbs, aligning with material processes (Halliday & Matthiessen, 2014), describe processes, methods, or actions taken to carry out the research (e.g. select, collect, transcribe). Reporting verbs reference or describe previous studies or findings (Hyland, 1999) (e.g. report, indicate). Relational verbs describe relationships, comparisons, or connections between research elements (Halliday & Matthiessen, 2014) (e.g. compare, involve, represent, match). Evaluative verbs express evaluation, judgment, or justification (Hunston & Sinclair, 2000) (e.g. assess, evaluate, regard). Causative verbs express causation, enablement, or influence (Halliday & Matthiessen, 2014) (e.g. allow, require, affect).

To ensure the robustness of our categorization, we, two linguistically trained coders, independently annotated all instances, achieving a Cohen’s κ of 0.84—a reliability coefficient indicating excellent agreement beyond chance (McHugh, 2012). Discrepancies, primarily arising from ambiguous functional overlaps (e.g., causative-evaluative or procedural-causative boundaries), were resolved through iterative consensus discussions anchored in SFG principles, followed by third-party adjudication for persistent edge cases.
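For readers unfamiliar with the agreement statistic, the following Python sketch shows how Cohen's κ is computed from two coders' labels: observed agreement is corrected by the agreement expected from each coder's label proportions. The function name and the four toy annotations are hypothetical; this is not the authors' annotation data or software.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(labels_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each coder's marginal proportions.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both coders use one label only")
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four passive verbs by two coders.
coder_1 = ["procedural", "procedural", "causative", "causative"]
coder_2 = ["procedural", "causative", "causative", "causative"]
kappa = cohens_kappa(coder_1, coder_2)
print(kappa)  # 0.5
```

In this toy case the coders agree on three of four items (0.75), chance agreement is 0.5, and κ is therefore (0.75 − 0.5) / (1 − 0.5) = 0.5.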

Statistical tests

After data extraction, we normalized the frequency of PV constructions as the number of occurrences per 1000 sentences (PV frequency = [PV count / total sentences] × 1000). These frequencies were computed separately for each period (1980, 1990, 2000, 2010, 2020), academic branch (the natural sciences, the social sciences, the humanities), and passive type (procedural, reporting, relational, evaluative, and causative). The normalized data were subsequently used for Spearman correlation tests and multivariate regression analysis. All statistical analyses were conducted using IBM SPSS Statistics (Version 26). Significance levels were set at p < 0.05 for all tests. Prior to analysis, we verified that our data met the necessary assumptions for each statistical test, including independence of observations and absence of multicollinearity for the regression analysis.
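The normalization can be written as a one-line function. The sketch below is in Python; the function name and the example counts are hypothetical and are not taken from Table 1.

```python
def normalized_pv_frequency(pv_count: int, total_sentences: int) -> float:
    """Passive constructions per 1000 sentences."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    return pv_count / total_sentences * 1000


# Hypothetical sub-corpus: 200 be-passives in 300 sentences.
example_frequency = normalized_pv_frequency(200, 300)
print(round(example_frequency, 2))  # 666.67
```

Because the denominator is the number of sentences rather than tokens, a value above 1000 is possible in principle when sentences contain more than one passive clause on average.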

Spearman correlation tests

To quantify trajectories of PV usage over time, we employed Spearman’s rank correlation coefficient, a non-parametric test robust to non-normality and outliers (Schober et al., 2018). This approach aligns with the ordinal nature of time periods and accommodates potential non-linear trends in linguistic change (Gries, 2021). For each academic branch, we correlated passive structure frequency (normalized per 1000 sentences) with time to identify branch-specific trends.
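As an illustration of the statistic, the following Python sketch computes Spearman's coefficient as a Pearson correlation over average ranks. The analyses in this study were run in SPSS, so this is only a sketch of the underlying calculation; the function names and the five frequency values are hypothetical and are not the values in Table 4.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


years = [1980, 1990, 2000, 2010, 2020]
# Hypothetical normalized PV frequencies (per 1000 sentences).
frequencies = [560.0, 500.0, 510.0, 420.0, 350.0]
rho = spearman_rho(years, frequencies)
print(round(rho, 3))  # -0.9
```

Only the rank order matters here: the small rebound between the second and third sampling points is what keeps the coefficient above −1.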

Pearson correlation tests and multivariate regression analysis

To identify which particular type(s) of passives significantly contributed to these diachronic trends, we took a staged analytical approach: (1) Pearson correlation analysis to examine bivariate relationships among the different passive structures, and (2) multivariate regression modelling to isolate the independent effects of each passive type while controlling for inter-correlations. This approach aligns with established practices in diachronic corpus linguistics (Hilpert & Gries, 2016).

In this model, time (represented by the five sampling points: 1980, 1990, 2000, 2010, and 2020) served as the independent variable, while the normalized frequencies of the five passive types were the dependent variables. By implementing the multivariate framework—a recommended procedure in quantitative linguistics for controlling inter-variable interactions (Levshina, 2015)—we obtained regression coefficients that quantified the rate and direction of change for each passive type. These metrics thereby enabled us to determine which types were most responsive to changing writing conventions across the decades.
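To make the regression output easier to read, the Python sketch below fits an ordinary least-squares line with a single predictor and returns the intercept, slope, and R². It is a simplified stand-in for the SPSS procedure, not a reproduction of it; the function name and data are hypothetical. The fitted models reported in the Results take the same form, total = intercept + slope × (passive type).

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variance in y explained by the fitted line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical normalized frequencies at the five sampling points.
one_passive_type = [60.0, 55.0, 48.0, 50.0, 41.0]
total_passives = [560.0, 500.0, 510.0, 420.0, 350.0]
intercept, slope, r2 = fit_simple_ols(one_passive_type, total_passives)
print(intercept, slope, r2)
```

With only five sampling points per branch, a one-predictor model leaves three residual degrees of freedom, which is why the F statistics in the Results are reported as F(1, 3).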

Results

The diachronic development of passives

The normalized frequencies of the passives and the results of Spearman correlation tests are presented in Table 4.

Table 4 Normalized frequencies and the Spearman test results.

Table 4 reveals an overall decline in the normalized frequencies of PV constructions across three major academic branches over a forty-year period from 1980 to 2020. The table also includes Spearman correlation coefficients that quantify the strength and direction of these temporal trends.

Specifically, in the humanities, PV frequency exhibited a considerable downward trajectory, decreasing from 468.52 in 1980 to 363.69 in 2010, followed by a modest recovery to 383.86 in 2020. The natural sciences demonstrated the highest initial frequency of PV usage (666.67 in 1980), aligning with the traditional scientific writing convention of emphasizing objectivity through passive constructions (Cooray, 1967). While this frequency tumbled to 503.08 by 2000, it subsequently stabilized and showed slight increases in 2010 (523.23) and 2020 (530.26). The social sciences exhibited the most pronounced and statistically robust decline in PV usage over the studied period. The frequency decreased steadily and substantially all the way from 568.90 in 1980 to 347.45 in 2020.

The aforementioned decline trends were further substantiated by Spearman’s correlation test. For the social sciences, the test result indicated an exceptionally strong negative correlation (r_s = −0.975) between Year and Frequency. This relationship was statistically significant at the 0.05 level (p = 0.005), demonstrating a robust inverse relationship. The correlation coefficient’s proximity to −1.0 suggested that as years progress, there is a highly consistent decrease in PV frequency. In terms of the humanities, the analysis revealed a strong negative correlation (r_s = −0.872) between Year and Frequency. This relationship approached but failed to reach conventional statistical significance (p = 0.054), falling just short of the standard 0.05 threshold. The correlation coefficient suggests a substantial inverse relationship, indicating that frequency tends to decrease as years advance, though with slightly less consistency than observed in the social sciences. Meanwhile, the result for the natural sciences displayed a moderate negative correlation (r_s = −0.462) between Year and Frequency. This relationship is not statistically significant (p = 0.434), indicating that the observed negative association could potentially be attributed to chance. The correlation coefficient suggested a weaker temporal pattern compared to the humanities and the social sciences, demonstrating that year progression has a less consistent relationship with frequency changes in the natural sciences.

The contributing passive categories to the diachronic trends

Humanities

Table 5 shows the Pearson correlation between each type of passive and the total for the humanities.

Table 5 The Pearson correlation coefficients in the humanities.

Notably, there was an exceptionally strong correlation between causative passives and the total (r = 0.993, p < 0.001), indicating that causative passives were highly representative of overall linguistic patterns in the humanities. Similarly, reporting passives (r = 0.918, p = 0.014) and relational passives (r = 0.905, p = 0.017) demonstrated robust positive correlations with statistical significance. While evaluative passives showed a moderate positive correlation (r = 0.724), this relationship did not reach statistical significance (p = 0.083). Procedural passives exhibited the weakest correlation (r = 0.052, p = 0.467), suggesting these structures were used independently of overall linguistic density in the humanities. A further multivariate regression analysis (Table 6) helped determine which type(s) of passives significantly contributed to the total.

Table 6 Summary of multivariate regression analysis for the humanities.

In Table 6, the regression analysis yielded a single-predictor model with causative as the only significant predictor of the overall diachronic trend. The overall model was statistically significant, where F (1, 3) = 223.480, p = 0.001, and explained 98.7% of the variance in the total (R² = 0.987, Adjusted R² = 0.982). The causative passives strongly predicted the overall trajectory of PVs (β = 0.993, p = 0.001), with each unit increase in causative passives associated with a 7.354-unit increase in the total.

The other passives (procedural, reporting, relational, and evaluative) were excluded from the final model as they did not significantly improve prediction beyond the contribution of the causative passive (all p > 0.05). The regression diagnostics indicated that the model met the assumptions of regression analysis. The Durbin-Watson statistic (1.322) suggested an acceptable level of independence of residuals. Examination of residual statistics showed normally distributed residuals with standardized values ranging from −1.483 to 0.756, supporting the appropriateness of the model.

In summary, the regression analysis identified the causative passive as the primary predictor of PVs in the humanities, accounting for nearly all of the explainable variance. Although other passives showed strong bivariate correlations with the total, they did not contribute unique predictive value beyond that provided by the causative passives.

Natural sciences

Table 7 shows the Pearson correlation between each type of passive and the total for the natural sciences.

Table 7 The Pearson correlation coefficients in the natural sciences.

According to the statistics in Table 7, relational passives showed the strongest positive correlation with the total (r = 0.954, p = 0.006), followed closely by reporting passives (r = 0.897, p = 0.020) and procedural passives (r = 0.875, p = 0.026), all statistically significant. Interestingly, evaluative passives showed only a moderate positive correlation (r = 0.449) without statistical significance (p = 0.224). Most notably, causative passives displayed a moderate negative correlation (r = −0.483, p = 0.205), suggesting that as overall linguistic density increased in the natural sciences texts, the use of causative structures tended to decrease, though this trend was not statistically significant. A further multivariate regression analysis (Table 8) helped determine which type(s) of passives significantly contributed to the total.

Table 8 Summary of multivariate regression analysis for the natural sciences.

The results of the regression analysis confirmed the contributing role of relational passives. Table 8 shows a one-predictor model with relational passives as the only significant predictor of the overall trend, F(1, 3) = 30.525, p = 0.012. The model explained 91.1% of the variance (R² = 0.911, Adjusted R² = 0.881), demonstrating strong predictive power. The regression equation (Total = −30.599 + 8.892 (Relational)) indicated that each unit increase in relational passives was associated with an 8.892-unit increase in total, with the relational passives showing a strong positive effect (β = 0.954, t = 5.525, p = 0.012). The Durbin-Watson value of 2.651 and examination of residuals confirmed that the regression assumptions were adequately met. Other passives (procedural, reporting, evaluative, and causative) were excluded from the final model as they did not contribute significant additional predictive value beyond the relational passive, despite some showing significant bivariate correlations with the total.
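The reported fit statistics can be cross-checked with two standard identities, sketched below in Python. The adjusted R² follows from R², the number of observations, and the number of predictors, and in a one-predictor model the F statistic equals the square of the predictor's t value. The sample size of five is inferred from the residual degrees of freedom in F(1, 3); the function name is hypothetical.

```python
def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R squared for a linear model with an intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values reported for the natural sciences model.
n_obs = 5  # inferred from F(1, 3): residual df = n - p - 1 = 3 with p = 1
adjusted = adjusted_r_squared(0.911, n_obs, 1)
f_from_t = 5.525 ** 2  # with one predictor, F equals t squared

print(round(adjusted, 3))  # 0.881
print(round(f_from_t, 3))  # 30.526
```

Both results agree with the reported values (Adjusted R² = 0.881, F = 30.525) up to rounding.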

In summary, the regression analysis demonstrated that the relational passives were the primary predictor of total in the natural sciences, accounting for a predominant proportion (91.1%) of the variance. Although other passives (particularly procedural and reporting) showed strong bivariate correlations with the total, they did not contribute unique predictive value beyond that provided by the relational passives, likely due to shared variance among these predictors.

Social sciences

Table 9 shows the Pearson correlation between each type of passive and the total for the social sciences.

Table 9 The Pearson correlation coefficients in the social sciences.


In Table 9, the total was most strongly correlated with the procedural passive (r = 0.989, p < 0.01), followed by the reporting passive (r = 0.946, p < 0.01) and the relational passive (r = 0.845, p < 0.05). The correlation with the evaluative passive approached significance (r = 0.795, p = 0.054), while the causative passive showed a non-significant relationship with the total (r = 0.317, p = 0.302). A further multivariate regression analysis (Table 10) helped determine which type(s) of passives significantly contributed to the total.

Table 10 Summary of multivariate regression analysis for the social sciences.


In Table 10, the results of the regression analysis revealed two significant models. Model 1, with procedural passives as the sole predictor, demonstrated strong predictive power (R = 0.989, R² = 0.979, Adjusted R² = 0.971) and statistical significance (F(1, 3) = 137.125, p = 0.001). This model revealed that procedural passives were a significant predictor (β = 0.989, t = 11.710, p = 0.001), with the regression equation total = 13.887 + 1.438 (Procedural).

However, Model 2, which added causative passives as a second predictor, achieved an essentially perfect fit (R = 1.000, R² = 1.000, Adjusted R² = 1.000 to three decimal places) with markedly improved precision (standard error reduced from 17.33 to 1.93). This model was highly significant (F(2, 2) = 5658.006, p < 0.001), with both procedural (β = 0.963, t = 100.898, p < 0.001) and causative (β = 0.148, t = 15.502, p = 0.004) passives making unique, significant contributions. The final regression equation was total = 4.672 + 1.401 (procedural) + 2.093 (causative), with appropriate independence of observations confirmed by the Durbin-Watson statistic (2.082). These findings indicated that while procedural passives account for most of the variance in the social sciences total, causative passives make a smaller but statistically significant contribution to the predictive model.
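The fit statistics of the two models can also be cross-checked through their standard errors. The sketch below is a hedged illustration, not part of the study's analysis: it recovers the total sum of squares from Model 1, then computes the R² of Model 2 in two independent ways (from its standard error and from its F statistic). All helper names are hypothetical, and the residual degrees of freedom are taken from the reported F tests.

```python
# Values reported for Table 10 (social sciences).
F_M1, DF1_M1, DF2_M1, SE_M1 = 137.125, 1, 3, 17.33   # Model 1: procedural
F_M2, DF1_M2, DF2_M2, SE_M2 = 5658.006, 2, 2, 1.93   # Model 2: procedural + causative


def unexplained_share(f_stat: float, df_model: int, df_resid: int) -> float:
    """1 - R^2 implied by an overall F statistic."""
    return df_resid / (f_stat * df_model + df_resid)


def residual_ss(std_error: float, df_resid: int) -> float:
    """Residual sum of squares implied by the standard error of the estimate."""
    return df_resid * std_error ** 2


# Total sum of squares recovered from Model 1: SST = SSE / (1 - R^2).
sst = residual_ss(SE_M1, DF2_M1) / unexplained_share(F_M1, DF1_M1, DF2_M1)

# R^2 of Model 2, computed two ways.
r2_m2_from_se = 1.0 - residual_ss(SE_M2, DF2_M2) / sst
r2_m2_from_f = 1.0 - unexplained_share(F_M2, DF1_M2, DF2_M2)

if __name__ == "__main__":
    print(f"Model 2 R^2 from standard errors: {r2_m2_from_se:.6f}")
    print(f"Model 2 R^2 from F statistic    : {r2_m2_from_f:.6f}")
```

Both routes give roughly 0.9998, which suggests that the reported R² of 1.000 is a value rounded to three decimals rather than an exact fit.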

Discussions and suggestions

Discussions

PV change and its relevance to manuscript acceptance

The documented decline in PV usage across academic disciplines in recent decades (Mair & Leech, 2006; Li, 2022) aligns in part with this study’s method-section-specific findings, particularly within social sciences (r_s = −0.975) and the humanities (r_s = −0.872). These trajectories suggest that since the 1980s, when the Plain English Campaign and broader calls for clarity began to resonate in academia, disciplinary voice preferences have undergone gradual but uneven reorientation. There is, however, a more nuanced trend in the natural sciences, where post-2000 PV stabilization (523.23 → 530.26) mirrors Leong’s (2014) prediction of a 30% passive voice equilibrium in scientific texts. Rather than a uniform shift toward informality (Hyland & Jiang, 2017), these divergent curves point to discipline-specific recalibrations of rhetorical norms, with compliance pressures taking effect at different historical moments.

Disciplinary paradigms have themselves shifted over time. In the humanities and social sciences, this shift has been particularly evident since the 1970s–1980s, when heated methodological debates and the seminal works of Morgan (1983) and Lincoln & Guba (1985) helped to institutionalize issues of trustworthiness, methodological transparency, and pluralism in qualitative research; subsequently, the 1990s witnessed the textbook formalization and widespread adoption of mixed-methods approaches (Creswell, 1998). Against this backdrop, the humanities and social sciences have gradually evolved towards greater authorial visibility and credibility (Hyland, 2001), as their Methods/Methodology sections increasingly require researchers to rigorously justify their choice of research tools, data sources, and analytical frameworks (Baratta, 2009), as well as their motivation and assessment (Seoane & Hundt, 2017). This evolving paradigm of writing naturally entails a stronger presence of the “author as agent”, allowing readers to clearly follow the researcher’s line of reasoning and enabling more effective communication. These cumulative changes from the 1990s onward help explain the pronounced PV decline in these fields. The natural sciences, however, have tended to keep treating the Methods section as a space for procedural description, without methodological rationale that is already familiar to peers in the field (Hyland, 2002). This paradigm, resistant to rhetorical personalization, explains the stabilization of PV usage even as other fields shifted more decisively toward AV.

Compliance pressures, whereby manuscripts that deviate significantly from the guidelines risk rejection and revision requests in the peer review process (Feld et al., 2024; Belcher, 2019), have also evolved over time. Whereas earlier style manuals permitted broader use of PV, more recent editions have tightened their stance in pursuit of conciseness, readability, and vigor. For instance, the Publication Manual of the American Psychological Association (APA), mandated by many humanities journals and by over half of social sciences journals (Santos et al., 2023; Vizváry and Grigas, 2025), did not articulate an explicit preference regarding passive voice in its pre-1980 second edition (APA, 1974), but introduced such guidance in its post-1980 third edition (APA, 1983). The guidance later became firmer: the sixth edition recommended “using the active rather than the passive voice” (APA, 2009: 34), and the seventh edition made this prescriptive: “Use the active voice as much as possible” (APA, 2020: 42). This strengthening aligns with sharper PV declines in social-science and humanities journals after 1980. In natural sciences, Nature’s author instructions, for example, began to explicitly state: “Nature journals prefer authors to write in the active voice” (Nature, n.d.). A Nature editorial further acknowledged that scientific language was “becoming more informal and direct” (Nature, 2016), confirming that actual publishing practice was catching up with prescriptive guidance. However, some style guides still accept, prefer (Gastel & Day, 2022), and even conventionalize (Hofmann, 2016) PV usage, particularly in the Methods section. These evolving and somewhat contradictory guides for writing Methods may confuse academic writers, especially those lacking access to the latest versions, and help explain why natural sciences exhibit greater resistance to change. These shifts in official policies and editorial culture coincide with the observed discipline-specific PV trajectories: earlier and sharper declines in the humanities and social sciences versus later stabilization in the natural sciences.

Taken together, these observations suggest that PV decline reflects not only cross-sectional stylistic differences but also the temporal evolution of paradigmatic orientations and compliance pressures. Importantly, the turning points in prescriptive discourse—APA’s strengthened active-voice mandates (APA, 2009, 2020) and Nature’s active-voice preference by the 2000s (Nature, 2016, n.d.)—closely align with our observed disciplinary PV trends in the Method section. Compliance pressures have therefore not been static but have evolved alongside disciplinary paradigms, shaping both acceptance practices and stylistic norms.

Disciplinary divergences for categories of passive verbs

Our findings expose striking disciplinary divergences. In the humanities, the decline in causative passive usage, which accounts for 98.7% of the variance in humanities PV trajectories (β = 0.993, p = 0.001), primarily drives the overall reduction in PV. This trend indicates a strategic shift toward the AV in explaining causality and describing influence or enablement, for instance, from “The… was designed as …” (HU-1990) to “Researchers designed …” (HU-2010). This shift may stem from broader social developments such as the rise of the academic accountability system (Winker et al., 2023) in recent decades, which promotes clearer identification of agents. For example, the active voice (“Funding constraints affected our sampling”) demonstrates causality more clearly than the passive voice (“Our sampling was affected by funding constraints”), as it foregrounds the influencing factor. It may also relate to the growing influence of the knowledge economy and personal branding (Kucharska, 2023), where first-person statements like “we conclude” allow researchers to claim ownership and highlight their contributions. Based on this shift, instructors in humanities fields such as jurisprudence may encourage academic writers to trim causative passives by using more first-person pronouns, particularly in countries such as China, where plain English should feature more prominently (Lin et al., 2023). To achieve this, they may provide examples contrasting passive and active causative constructions, and design exercises in which students rephrase causative passive sentences into clearer AV alternatives that make the author’s contribution visible. Such pedagogical efforts could enhance clarity and persuasiveness, addressing the frequent challenge of overly formal or impersonal phrasing.

In the social sciences, the decline in PV is driven by two passive categories: procedural and causative. The factors behind the decline in causative passives may be the same as those discussed above for the humanities. For procedural passives, a potential factor is the rise of participatory scholarly identity. The use of active constructions with “we” reflects not only this identity shift but also a broader methodological move from positivist to constructivist paradigms, emphasizing researcher-participant and writer-reader interaction (Harwood, 2005) throughout the procedural stages of experimental or empirical studies. The recent rise of the digital scholar (Weller, 2011) or networked participatory (Mu et al., 2018) identity may have further driven the decline in procedural passives. We conjecture that, driven by Open Science and Responsible Research and Innovation (RRI) (Liu et al., 2022), this shift is likely to persist in the foreseeable future. Building on this trend, social sciences instructors may encourage academic writers, especially those from contexts less exposed to plain English, to confidently adopt procedural and causative actives. Instructors can cite authentic published examples such as “First, we…; second, we…; third, we…” and “we clarify…” rather than relying rigidly on locally published textbooks. Such instruction not only enhances rhetorical clarity and coherence but also helps novice writers internalize genre-appropriate authorial positioning in empirical research writing.

In the natural sciences, relational passives emerge as the key predictor of PV trajectories (β = 0.954, p = 0.012), even though procedural passives remain numerically dominant. This statistical primacy helps explain the field’s unique PV pattern, an initial sharp decline followed by post-2000 stabilization. A potential reason lies in the essential role of relational passives in cross-material comparisons, which results in an equilibrium. Notably, the overall insignificance of the PV trend (r_s = −0.462, p = 0.434) underscores a critical tension: although relational passives drive directional changes, their partial retention, alongside stable procedural passive usage, keeps the overall trend from reaching significance. This may result from the discipline’s effort to balance evolving stylistic preferences against core epistemological needs. It can also arise from a lack of (Gupta et al., 2022) or insufficient (Dean et al., 2015) systematic discipline-specific writing training; thus, many natural sciences researchers resort to expert modeling (Yang et al., 2019), and some even to “googling+imitation” (Li, 2013), leading to technical parroting. Where conditions permit, academic writing instructors in the natural sciences may therefore encourage students to critically examine contexts where relational passives serve key epistemological functions, particularly in comparing materials, while favoring active voice in more agentive, procedural contexts. Such nuanced instruction can not only avoid technical parroting but also foster stylistic awareness without undermining the communicative precision required in scientific reporting.

The discipline-specific decline in PV usage underscores the need for tailored pedagogical strategies that balance evolving stylistic norms with disciplinary epistemological demands.

Recommendations for discipline-specific PV usage in academic writing

To equip academic writers with adaptive skills, we propose three recommendations, each addressing critical gaps in current practices.

Consulting journal-specific or publisher-specific style guides

Despite the general decline in PV, adherence to specific guidelines is critical for academic writers navigating divergent disciplinary norms. For instance, while Bioresource Technology follows Elsevier’s (n.d.) guidelines against PV, BioResources, a journal of similar aims and scope published under the auspices of North Carolina State University, recommends the use of PV (BioResources, n.d.). Interestingly, Biofuels, Bioproducts and Biorefining, a journal in the same field, follows the Wiley-Blackwell House Style guide (2007), which allows authors to choose their preferred voice, serving as a compromise in the AV-vs-PV dilemma. These divergent policies reflect a broader negotiation between evolving stylistic trends (e.g., AV for clarity) and entrenched disciplinary conventions (e.g., PV for objectivity). By mandating compliance, journals implicitly train authors to dynamically reconcile these forces, preserving traditional methodological rigor while adopting communicative innovations. Such compliance has proven effective: studies indicate that explicit AV guidelines reduce PV usage by 8.07% (Leong, 2014), with recent analyses reporting even higher reductions of 11% (Fauziah & Bashtomi, 2024). By aligning their syntactic choices with target journal policies, writers ensure compliance with editorial standards, a prerequisite for manuscript acceptance (Fauziah & Bashtomi, 2024).

Prioritizing discipline-sensitive writing education

Given nuanced cross-disciplinary differences as empirically evidenced by Dong et al. (2024), research programs should offer discipline-sensitive writing education instead of generic writing courses. For instance, social sciences benefit from AV training that accentuates researcher agency, aligning with Hyland’s (2001) authorial visibility framework. Conversely, natural sciences require modules that preserve relational passives, while some scientists also need training to avoid overusing the PV. This pedagogical differentiation actively balances stylistic evolution with disciplinary integrity: it safeguards field-specific conventions while accommodating the growing preference for authorial visibility through targeted AV adoption.

We further recommend a collaborative teaching model for academic writing courses, where experienced, discipline-savvy writers co-design and co-teach with a linguist familiar with the subject matter. This approach, recently shown to be effective (Yan et al., 2024), enables tactical voice adjustments. For example, a teacher of engineering paper writing may collaborate with an engineering-informed linguist to convert some non-core relational or procedural passives in the Methods section into the active. This would retain PV for apparatus-centric steps (e.g., “the alloy was cooled”) to uphold methodological rigor, while using AV for human interventions (e.g., “we compared the specimens”). Such calibrated conversions can improve readability, reduce syntactic complexity, and clarify authorial contributions. Overall, this strategy enables writers to meet editorial standards for clarity without compromising disciplinary conventions.

Using hands-on authentic case materials

When dissecting passive voice trade-offs, instructors should utilize hands-on authentic case materials, as they can enhance engagement (Toogood, 2023), aid in knowledge building (Kim et al., 2006) on the PV, and foster genre awareness (Hyland, 2019). Critically, these materials serve as concrete negotiation spaces where writers learn to reconcile stylistic evolution (e.g., clarity-driven AV adoption) with discipline-specific rhetorical needs (e.g., PV for procedural focus). For instance, teachers who are also manuscript polishers may, where copyright permits, use recent two-version (original and revised) materials to illustrate how strategically converting non-essential procedural passives (e.g., To investigate the acoustic properties…, …were carried out) into the active (To investigate the acoustic properties…, we conducted…, Liu et al., 2022) helps eliminate dangling modifiers, instead of advocating uniform use of procedural passives. This targeted approach avoids blanket prescriptions, instead training writers to make context-sensitive choices that honor both communicative clarity and methodological conventions. If such materials are unavailable, instructors may resort to student researchers’ draft manuscripts (Miró-Colmenárez et al., 2025), guiding learners to identify where PV sustains disciplinary integrity and where AV improves readability, as found experimentally for voice use by Millar & Budgell (2019).

To sum up, the choice between PV and AV in the Methods/Methodology section of academic writing represents a dynamic negotiation between evolving stylistic norms and entrenched disciplinary conventions. Our three recommendations emphasize that writers must proactively reconcile two competing forces through metalinguistic awareness: the emerging clarity-driven paradigm and established disciplinary methodologies. Such adaptation transcends technical rhetoric, constituting a cognitive process of engaging with disciplinary discourse communities.

Conclusion

In this study, we conducted a diachronic analysis of passive voice (PV) usage in the Method(s)/Methodology sections of high-impact SSCI/SCI journals from the natural sciences, social sciences, and humanities over a 40-year period (1980–2020). Overall, the humanities and social sciences exhibited a pronounced decline in PV usage, reflecting a shift toward a more active, direct style that enhances reader engagement and authorial presence. In contrast, the natural sciences demonstrated a sharp drop in PV frequency from 1980 to 2000, followed by stabilization from 2010 onward, suggesting a discipline-specific balance between objectivity and clarity in methodological descriptions. Multivariate regression analyses revealed that in the humanities, causative passives were the primary drivers of this trend, while in the natural sciences, relational passives emerged as key predictors. In the social sciences, a combination of procedural and causative passives explained the majority of the variance. These results provide robust empirical evidence that academic writing conventions have evolved in a discipline-sensitive manner, highlighting the need for tailored writing strategies that accommodate both stylistic innovation and the functional demands of research reporting.

This study contributes to the field by merging diachronic corpus analysis with a functionally-classified model of passive structures. The findings refine our understanding of PV change in academic discourse and offer practical insights for academic writing instruction. By pinpointing which types of passive constructions drive stylistic shifts, the study supports the development of discipline-sensitive writing courses. These contributions are crucial for enhancing manuscript clarity and improving pedagogical practice in academic writing.

Despite its contributions, the study has several limitations. First, our exclusion of journals launched after 1980 neglects methodological conventions in emergent fields (e.g., AI, data analytics, synthetic biology, environmental engineering), potentially underrepresenting recent shifts toward agent-oriented descriptions. Future research may (1) sample emergent-field journals to determine whether these disciplines exhibit distinct voice patterns, and (2) perform comparative analyses between legacy and newly launched journals to test whether younger titles show a stronger move toward active constructions.

Second, by focusing solely on high-impact SSCI/SCI journals, the study potentially introduces selection bias by excluding non-core, non-English, and open-access publications. This narrow focus confines our analysis to flagship Anglophone, subscription-based outlets, overlooking regional editorial norms, language-specific voice preferences, and interdisciplinary conventions, which may diverge markedly from those in top-tier journals. Future research may expand the corpus to include journals from non-English-speaking contexts, open-access venues, and lower-impact journals to test the universality of these trends.

Moreover, while the study proposes potential underlying factors for trends in PV usage and verb categories across academic branches, it does not empirically or experimentally examine authors’ voice choices, such as their in-the-moment cognitive deliberations, the trade-off between clarity and authorial presence, and how prior training and peer feedback shape their AV/PV selection. Incorporating methods such as interviews and surveys in future studies could provide deeper insights into the attitudes and decision-making processes that shape AV/PV choices.

Data availability

The extracted passive voice structures are available at Supplementary Data 1, Supplementary Data 2, and Supplementary Data 3 for HU, NS, and SS.

References

Abu Qub’a A, Yousef Abu Guba MN, Fareh SI (2024) Exploring the use of Grammarly in assessing English academic writing. Heliyon. https://doi.org/10.1016/j.heliyon.2024.e34893
American Psychological Association (APA) (1974) Publication manual of the American Psychological Association, 2nd edn. American Psychological Association
American Psychological Association (APA) (1983) Publication manual of the American Psychological Association, 3rd edn. American Psychological Association
American Psychological Association (APA) (2009) Publication manual of the American Psychological Association, 6th edn. American Psychological Association
American Psychological Association (APA) (2020) Publication manual of the American Psychological Association, 7th edn. American Psychological Association
Anthony L (2022) AntConc (Version 4.2.0) [Computer software]. Waseda University
Banks D (2017) The extent to which the passive voice is used in the scientific journal article, 1985–2015. Functional Linguistics. https://doi.org/10.1186/s40554-017-0045-5
Baratta A (2009) Revealing stance through passive voice. J Pragmat 41:1406–1421. https://doi.org/10.1016/j.pragma.2008.09.010
Becher T, Trowler PR (2001) Academic tribes and territories: Intellectual enquiry and the culture of disciplines, 2nd edn. Open University Press
Belcher WL (2019) Writing your journal article in twelve weeks: A guide to academic publishing success. University of Chicago Press
BetterEvaluation (2012) How to write in plain English. https://www.betterevaluation.org/tools-resources/how-write-plain-english?utm_source=chatgpt.com
Biber D, Clark V (2002) Historical shifts in modification patterns with complex noun phrase structures: How long can you go without a verb? In: Fanego T, López-Couso MJ, Pérez-Guerra J (eds) English historical syntax and morphology. John Benjamins, pp 43–66
Biber D, Johansson S, Leech G (1999) Longman grammar of spoken and written English. Longman
BioResources (n.d.) Writing style suggestions. https://bioresources.cnr.ncsu.edu/authors-and-reviewers/writing-style-suggestions/
Campbell N (1999) How New Zealand consumers respond to plain English. J Bus Commun (1973) 36(4):335–358. https://doi.org/10.1177/002194369903600402
Chan EY, Maglio SJ (2020) The voice of cognition: active and passive voice influence distance and construal. Personality and Social Psychology Bulletin. https://doi.org/10.1177/0146167219867784
Cooray M (1967) The English passive voice. English Language Teaching Journal. https://doi.org/10.1093/elt/XXI.3.203
Creswell JW (1998) Qualitative inquiry and research design: Choosing among five approaches. SAGE Publications
Cutts M (2020) Oxford guide to plain English, 5th edn. Oxford University Press
Dastjerdi ZS, Tan H, Ebrahimi SF (2021) Voice in rhetorical units of results and discussion chapter of master’s theses: Across science study. Brno Studies in English. https://doi.org/10.5817/BSE2021-1-5
Dean E, Nordgren L, Söderlund A (2015) An exploration of the scientific writing experience of nonnative English-speaking doctoral supervisors and students using a phenomenographic approach. J Biomed Educ 2015:542781–11 pages. https://doi.org/10.1155/2015/542781
DeRespinis F, Hayward P, Jenkins A (2012) The IBM style guide: Conventions for writers and editors. IBM Press
Djuwari D, Saputri AR, Authar NM (2022) Passive voice in the methodology sections of research journal articles. J Posit Sch Psychol 6(3):1543–1551
Dong S, Mao J, Ke Q (2024) Decoding the writing styles of disciplines: a large-scale quantitative analysis. Information Processing & Management. https://doi.org/10.1016/j.ipm.2024.103718
Dumin LM (2010) Changes in the use of the passive voice over time: a historical look at the American Journal of Botany and the changes in the use of the passive voice from 1914–2008. Doctoral dissertation, Oklahoma State University
Elsevier (n.d.) Writing style guidelines. https://legacyfileshare.elsevier.com/promis_misc/adveiwrsty051203.pdf
Farrelly M, Seoane E (2012) Democratisation. In: Nevalainen T, Traugott E (eds) The Oxford handbook of the history of English. Oxford University Press, pp 579–592
Fauziah H, Bashtomi Y (2024) “Expunge virtually all use of the passive voice”: How does style guideline affect passive voice occurrences in research articles? Journal of Language and Education. https://doi.org/10.17323/jle.2024.14403
Feld J, Lines C, Ross L (2024) Writing matters. J Econ Behav Organ 217:378–397. https://doi.org/10.1016/j.jebo.2023.11.016
Gastel B, Day RA (2022) How to write and publish a scientific paper. 9th edn. Greenwood
Grasso A (2018) Plain English and the EU: Still trying to fight the fog? In: Marino S, Biel Ł, Bajčić, M, Sosoni, V (eds) Language and Law. Springer International Publishing, pp 359–376
Gries ST (2021) Ten lectures on quantitative approaches in cognitive linguistics: Corpus-linguistic, experimental, and statistical applications. Brill
Gupta S, Jaiswal A, Paramasivam A, Kotecha J (2022) Academic writing challenges and supports: Perspectives of international doctoral students and their supervisors. Front Educ 7:891534. https://doi.org/10.3389/feduc.2022.891534
Halliday MAK (1994) An introduction to functional grammar, 2nd edn. Edward Arnold
Halliday MAK, Martin JR (1993) Writing science: Literacy and discursive power. University of Pittsburgh Press
Halliday MAK, Matthiessen CMIM (2014) Halliday’s introduction to functional grammar. Routledge
Harwood N (2005) “Nowhere has anyone attempted … In this article I aim to do just that”: a corpus-based study of self-promotional I and we in academic writing across four disciplines. Journal of Pragmatics. https://doi.org/10.1016/j.pragma.2005.01.012
Hilpert M, Gries ST (2016) Quantitative approaches to diachronic corpus linguistics. In: Hundt M, Mollin S, Pfenninger S (eds) The changing English language: Psycholinguistic perspectives. Cambridge University Press, pp 245–268
Hofmann A (2016) Scientific writing and communication: Papers, proposals, and presentations. 3rd ed. Oxford University Press
Hundt M, Röthlisberger M, Seoane E (2021) Predicting voice alternation across academic Englishes. Corpus Linguistics and Linguistic Theory. https://doi.org/10.1515/cllt-2017-0050
Hundt M, Schneider G, Seoane E (2016) The use of the be-passive in academic Englishes: Local versus global usage in an international language. Corpora. https://doi.org/10.3366/cor.2016.0084
Hunston S, Sinclair J (2000) A local grammar of evaluation. In: Hunston S, Thompson G (eds) Evaluation in text: Authorial stance and the construction of discourse. Oxford University Press, pp 74–101
Hyland K (1999) Academic attribution: Citation and the construction of disciplinary knowledge. Applied Linguistics. https://doi.org/10.1093/applin/20.3.341
Hyland K (2001) Humble servants of the discipline? Self-mention in research articles. English for Specific Purposes. https://doi.org/10.1016/S0889-4906(00)00012-0
Hyland K (2002) Authority and invisibility: authorial identity in academic writing. Journal of Pragmatics. https://doi.org/10.1016/S0378-2166(02)00035-8
Hyland K (2019) Second language writing. Cambridge University Press
Hyland K, Jiang F (2017) Is academic writing becoming more informal? English for Specific Purposes. https://doi.org/10.1016/j.esp.2016.09.001
Kim S, Phillips WR, Pinsky L (2006) A conceptual framework for developing teaching cases: A review and synthesis of the literature across disciplines. Medical Education. https://doi.org/10.1111/j.1365-2929.2006.02544.x
Kimble J (2012) Writing for Dollars, Writing to Please: The Case for Plain Language in Business, Government, and Law. Carolina Academic Press
Kucharska W (2023) Personal branding in the knowledge economy: The inter-relationship between corporate and employee brands. Routledge
Leech G, Hundt M, Mair C (2009) Change in contemporary English: a grammatical study. Cambridge University Press
Leong PA (2014) The passive voice in scientific writing: The current norm in science journals. Journal of Science Communication. https://doi.org/10.22323/2.13010203
Leong PA (2020) The passive voice in scientific writing through the ages: A diachronic study. Text & Talk. https://doi.org/10.1515/text-2020-2066
Levshina N (2015) How to do linguistics with R: Data exploration and statistical analysis. John Benjamins
Li Y (2013) Text-based plagiarism in scientific writing: What Chinese supervisors think about copying and how to reduce it in students’ writing. Sci Eng Ethics 19(2):569–583. https://doi.org/10.1007/s11948-011-9342-7
Li Z (2022) Is academic writing less passivized? Corpus-based evidence from research article abstracts in applied linguistics over the past three decades (1990–2019). Scientometrics. https://doi.org/10.1007/s11192-022-04498-0
Lim JM (2017) Writing descriptions of experimental procedures in language education: Implications for the teaching of English for academic purposes. English for Specific Purposes. https://doi.org/10.1016/j.esp.2017.05.001
Lin X, Afzaal M, Aldayel HS (2023) Syntactic complexity in legal translated texts and the use of plain English: a corpus-based study. Humanities Soc Sci Commun 10:17. https://doi.org/10.1057/s41599-022-01485-x
Lincoln YS, Guba EG (1985) Naturalistic inquiry. Sage Publications
Liu J, Zhang G, Lv X, Li J (2022) Discovering the landscape and evolution of responsible research and innovation (RRI): Science mapping based on bibliometric analysis. Sustainability 14(14):8944. https://doi.org/10.3390/su14148944
Mair C, Leech G (2006) Current change in English syntax. In: Aarts B, MacMahon A (eds) The handbook of English linguistics. Blackwell, pp 318–342
McEnery T, Wilson A (2001) Corpus Linguistics: An Introduction. 2nd edn. Edinburgh University Press
McHugh ML (2012) Interrater reliability: The kappa statistic. Biochemia Med 22(3):276–282
Millán EL (2010) “Extending this claim, we propose…” The writer’s presence in research articles from different disciplines. Ibérica 20:35–56
Millar N, Budgell BS (2019) The passive voice and comprehensibility of biomedical texts: An experimental study with 2 cohorts of chiropractic students. J Chiropr Educ 33(1):16–20. https://doi.org/10.7899/JCE-17-22
Millar N, Budgell B, Fuller K (2013) “Use the AV whenever possible”: The impact of style guidelines in medical journals. Applied Linguistics. https://doi.org/10.1093/applin/ams059
Miró-Colmenárez PJ, Durán-Alonso S, Díaz-Cruces E (2025) Enhancing university teaching through student-led review articles as a pathway to early research engagement. Education Sciences. https://doi.org/10.3390/educsci15020249
Morgan G (1983) Beyond method: Strategies for social research. Sage Publications
Mu GM, Zhang H, Cheng W (2018) Negotiating scholarly identity through an international doctoral workshop: A cosmopolitan approach to doctoral education. J Stud Int Educ 23(1):139–153. https://doi.org/10.1177/1028315318810840
Nature (2016) Scientific language is becoming more informal. Nature 539(7628):140. https://doi.org/10.1038/539140a
Nature (n.d.) How to write your paper. https://www.nature.com/nature-portfolio/for-authors/write#:~:text=Nature%20journals%20prefer%20authors%20to,more%20clearly%20if%20written%20directly
Penrose A, Katz S (2010) Writing in the sciences: Exploring conventions of scientific discourse, 3rd edn. Longman
Perelman LC, Paradis J, Barrett E (1998) The Mayfield handbook of technical & scientific writing. Mayfield Publishing
Praminatih G, Kwary D, Ardaniah V (2018) Is EFL students’ academic writing becoming more informal? Journal of World Languages. https://doi.org/10.1080/21698252.2019.1570664
Quirk R, Greenbaum S, Leech G (1985) A comprehensive grammar of the English language. Longman
Santos EAD, Peroni S, Mucheroni ML (2023) An analysis of citing and referencing habits across all scholarly disciplines: approaches and trends in bibliographic referencing and citing practices. J Document 79(7):196–224. https://doi.org/10.1108/JD-10-2022-0234
Schmid H (1994) Probabilistic part-of-speech tagging using decision trees. Proceedings of the International Conference on New Methods in Language Processing. Manchester, UK
Schober P, Boer C, Schwarte LA (2018) Correlation coefficients: Appropriate use and interpretation. Anesthesia & Analgesia. https://doi.org/10.1213/ANE.0000000000002864
Schultz DM (2009) Eloquent science: A practical guide to becoming a better writer, speaker, and atmospheric scientist. American Meteorological Society
Schuster E, Levkowitz H, Oliveira R (eds) (2014) Writing scientific research articles in English successfully: Your complete roadmap. Hyprtek.com
Schwarz S (2019) “This must be looked into”: a corpus study of the prepositional passive. Journal of English Linguistics. https://doi.org/10.1177/0075424219851837
Seoane E, Loureiro-Porto L (2005) On the colloquialization of scientific British and American English. ESP Across Cult 2:106–118
Seoane E, Hundt M (2017) Voice alternation and authorial presence: variation across disciplinary areas in academic English. J Engl Linguist 46(1):3–22. https://doi.org/10.1177/0075424217740938
Seoane E (2006) Changing styles: On the recent evolution of scientific British and American English. In: Dalton-Puffer C, Kastovsky D, Ritt N et al. (eds) Syntax, style and grammatical norms: English from 1500–2000. Peter Lang, pp 191–209
Seoane E (2013) On the conventionalisation and loss of the passive voice in Late Modern English scientific discourse. Journal of Historical Pragmatics. https://doi.org/10.1075/jhp.14.1.03seo
Seoane E, Williams C (2006) Changing the rules: a comparison of recent trends in English in academic scientific discourse and prescriptive legal discourse. In: Dossena M, Taavitsainen I (eds) Diachronic perspectives on domain-specific English. Peter Lang, pp 255–276
Sepehri AA, Markowitz DM, Mir M (2023) PassivePy: a tool to automatically identify passive voice in big text data. Journal of Consumer Psychology. https://doi.org/10.31234/osf.io/bwp3t
Supply and Services Canada (1994) Plain language: clear and simple–Trainer’s guide. https://publications.gc.ca/collections/collection_2012/ps-sp/MP95-2-1-1994-eng.pdf
Tarone EE, Dwyer S, Gillette S (1998) On the use of the passive and active voice in astrophysics journal research articles: With extensions to other languages and other fields. English for Specific Purposes. https://doi.org/10.1016/S0889-4906(97)00032-X
The Securities and Exchange Commission (1998) A plain English handbook: How to create SEC disclosure documents. The Office
The United Nations (1984) A guide to writing for the United Nations. https://digitallibrary.un.org/record/134840?ln=en&v=pdf
Toogood C (2023) Supporting students to engage with case studies: a model of engagement principles. Educational Review. https://doi.org/10.1080/00131911.2023.2281227
University of Exeter (n.d.) Style guide. https://www.exeter.ac.uk/staff/web/writingfortheweb/styleguide/?utm_source=chatgpt.com
Vizváry P, Grigas V (2025) Unravelling citation rules: a comparative analysis of referencing instruction patterns in Scopus-indexed journals. Learned Publ 38(2):e1661. https://doi.org/10.1002/leap.1661
Weller M (2011) The digital scholar: How technology is changing academic practice, 1st edn. Bloomsbury Academic
Wheeler MA, Vylomova E, McGrath MJ (2021) More confident, less formal: Stylistic changes in academic psychology writing from 1970 to 2016. Scientometrics. https://doi.org/10.1007/s11192-021-04166-9
Wiley-Blackwell (2007) Wiley-Blackwell House style guide. The Charlesworth Group
Winker MA, Bloom T, Onie S (2023) Equity, transparency, and accountability: open science for the 21st century. Lancet 402:1206–1209. https://doi.org/10.1016/S0140-6736(23)01575-1
Yan Li W, Yeh FP (2024) Reinforcing writing in the disciplines courses with collaborative instructional mode: An exploratory study. SAGE Open. https://doi.org/10.1177/21582440241237842
Yang A, Stockwell S, McDonnell L (2019) Writing in your own voice: An intervention that reduces plagiarism and common writing problems in students’ scientific writing. Biochem Mol Biol Educ 47:589–598. https://doi.org/10.1002/bmb.21282

Acknowledgements

This research was supported by the Bureau of Education of Guangzhou Municipality (Grant No. 2024312201) and the Department of Education, Fujian Province (Grant No. JSZW24048).

Author information

Authors and Affiliations

Guangzhou University, Guangzhou, China
Rurong Le
Xiamen University Tan Kah Kee College, Zhangzhou, China
Sheng Yu & Mingxing Hao

Contributions

RL: conceptualization, investigation, methodology, software, validation, formal analysis, resources, data curation, writing—original draft preparation, writing—review and editing, supervision, project administration, funding acquisition. SY: conceptualization, investigation, methodology, software, formal analysis, resources, data curation, writing—original draft preparation, writing—review and editing, project administration. MH: investigation, methodology, formal analysis, funding acquisition, writing—review and editing. RL and SY contributed equally to this work and should be considered co-first authors. All the authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Sheng Yu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

The study does not involve human participants or their data.

Informed consent

Not applicable to this study as it did not involve human participants.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary file (download DOCX )

Supplementary Data 1 (download XLSX )

Supplementary Data 2 (download XLSX )

Supplementary Data 3 (download XLSX )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Le, R., Yu, S. & Hao, M. A corpus-based study of passive voice trajectories in methods sections across three academic branches (1980–2020). Humanit Soc Sci Commun 12, 1805 (2025). https://doi.org/10.1057/s41599-025-06007-z

Received: 20 March 2025
Accepted: 18 September 2025
Published: 21 November 2025
Version of record: 21 November 2025
DOI: https://doi.org/10.1057/s41599-025-06007-z
