Introduction

The passive voice (PV) has been a staple of academic writing for years, especially in scientific discourse, where it has long been favoured for its air of objectivity and impersonality (Cooray, 1967; Banks, 2017), and sometimes, for cohesion (Baratta, 2009; Leong, 2020). Recent decades, however, have seen a significant shift in writing conventions, with a growing preference for the active voice (AV).

One major sociocultural reason for this evolving landscape is the ongoing reaction against the detached, abstract, and alienating style characteristic of academic writing (Farrelly & Seoane, 2012; Halliday & Martin, 1993; Seoane, 2013), exemplified by the Plain English Campaign. Thus, style guides (e.g. DeRespinis et al., 2012; American Psychological Association [APA], 2020; Dastjerdi et al., 2021; Fauziah & Bashtomi, 2024) and writing textbooks (e.g. Perelman et al., 1998; Schultz, 2009; Schuster et al., 2014) have increasingly advocated for the AV, some even recommending that writers expunge all uses of the PV. A second key factor is the intense competition within the scientific community. In the era of the ‘informational explosion’ (Biber & Clark, 2002: 63–64), the increasing pressure to communicate efficiently and clearly has forced writers to adapt, prioritizing clarity, brevity, and reader engagement (Penrose & Katz, 2010). Indeed, research has shown that texts in the PV increase readers’ perceived temporal, hypothetical, spatial (Praminatih et al., 2018), and psychological distance (Chan & Maglio, 2020; Sepehri et al., 2023) from activities described in the text. For the sake of reader engagement, even grammar checkers such as Grammarly (Abu Qub’a et al., 2024) now flag passive constructions, recommending active alternatives.

This trend is not merely anecdotal; corpus-based studies have documented a general decline in passive constructions across various registers, including academic prose (Mair & Leech, 2006; Leech et al., 2009; Banks, 2017; Wheeler et al., 2021; Li, 2022), though some of these (e.g. Mair & Leech, 2006; Leech et al., 2009; Hyland & Jiang, 2017) only include the academic passive as a small component of their analysis of syntactic change. Indeed, corpus researchers such as Seoane and her collaborators (Seoane & Loureiro-Porto, 2005; Seoane, 2006; Seoane & Williams, 2006) began to give special attention to the PV, recording a marked decrease in passive usage in scientific texts. More recently, Wheeler et al. (2021), in a corpus-based study of academic psychology abstracts from 1970 to 2016, found a dramatic 92% surge in the use of personal pronouns, providing strong evidence of a shift away from the traditional PV dominance in academic writing. Using a corpus of 2,707 research article abstracts in applied linguistics from 1990 to 2019, Li (2022) found a declining trend in PV usage, especially in co-authored abstracts. These findings suggest a shift towards a more informative, efficient, and reader-friendly academic writing style, part of a broader movement from the traditional image of distance and impersonality toward a more visible authorial presence, often signaled by the use of self-promotional first-person pronouns (Hyland, 2001, 2002; Harwood, 2005). Despite these insights, several critical limitations remain in the literature.

First, many studies have limited sample sizes, which constrain the generalizability of their findings. For example, Tarone et al.’s (1998) research, while trailblazing, analyzed only two research articles published in The Astrophysical Journal. Banks’ (2017) analysis of multi-authored research articles, while insightful, was based on only 32 articles, raising questions about its broader applicability. Dumin (2010) sampled 15 articles from The American Journal of Botany, five for each of three time periods: 1914–1918, 1962–1966, and 2004–2008.

Second, a significant portion of the multi-disciplinary research on PV is synchronic. Examples include Leong’s (2014) examination of the PV in 60 research articles from six science journals, Hundt et al.’s (2021) focus on the choice of PV across different varieties of academic Englishes, and Seoane & Hundt’s (2017) association of PV with authorial presence across disciplinary areas. While these synchronic, multi-disciplinary investigations provide valuable insights, they fail to capture the dynamic evolution of writing conventions over time.

Third, even among diachronic studies, most focus on a single discipline or a narrow range of fields (Leong, 2020; Wheeler et al., 2021; Li, 2022), except Hyland & Jiang’s (2017) diachronic, multi-disciplinary research on ten linguistic features commonly associated with informality.

Fourth, the majority of existing medium-to-large-sized corpus research has focused on full-length articles (e.g. Banks, 2017; Hyland & Jiang, 2017; Millán, 2010; Millar et al., 2013; Leong, 2020) or used a wide spectrum of texts (Hundt et al., 2016; Hundt et al., 2021), with only a handful of studies, such as Li’s (2022), examining a particular section of research articles. As a result, the Method(s)/Methodology section remains understudied, and this gap is significant. For one thing, this section has been shown to contain a higher frequency of passive constructions (Millar et al., 2013; Leong, 2014; Banks, 2017; Fauziah & Bashtomi, 2024), prevalent across sentence, clause, and phrase levels (Djuwari et al., 2022). For another, the Method(s)/Methodology section is often where the passive voice is most functionally appropriate, emphasizing processes over agents.

Moreover, while prior research has provided theoretical (Quirk et al., 1985) and qualitative function-based (Lim, 2017) categorizations of passives, no study to date has quantitatively investigated which functional subtypes of passive constructions significantly drive changes in PV usage over time. Thus, a functionally categorized, corpus-driven quantitative approach may provide statistical evidence to identify which passive types significantly drive PV trends, thereby offering empirically grounded guidelines for academic writing instruction.

Taken together, the existing literature has captured a trend toward a more personal, informal academic writing style, though it has failed to address the overall landscape of passive/active use and its dominant passive subtypes in high-quality academic writing, particularly the Methods/Methodology section. Building on this gap, this corpus-based study undertakes a diachronic analysis of PV usage in the Method(s)/Methodology sections of SSCI/SCI-indexed journal articles spanning the natural sciences, the social sciences, and the humanities—the three major branches of modern academia (Becher & Trowler, 2001).

More specifically, this study focuses on five benchmark years—1980, 1990, 2000, 2010, and 2020—spanning four decades for two main reasons. First, decade-long intervals are a well-established convention in corpus linguistics, striking a balance between capturing long-term trends and preserving analytical clarity (McEnery & Wilson, 2001). Second, beginning in 1980—immediately after the Plain English Campaign’s institutional launch in 1979—provides a clear temporal marker for detecting any subsequent stylistic shifts. This initial post-Campaign period also saw its influence in countries such as New Zealand (Campbell, 1999), the US (The Securities and Exchange Commission, 1998), Canada (Supply and Services Canada, 1994), and within international bodies like the UN (1984) and the European Commission (Grasso, 2018). In academic contexts, some institutional style guides (e.g., University of Exeter, n.d.; BetterEvaluation, 2012) explicitly adopt the Campaign’s definition of plain English—clear, concise, and reader-focused language. As these principles gained institutional endorsement, academic style manuals increasingly promoted features such as active voice and first-person pronouns while discouraging passive constructions. Over time, such stylistic recommendations began to permeate scholarly writing, contributing to a broader shift away from impersonal, passive formulations (Kimble, 2012; Cutts, 2020).

Overall, by examining the Methods/Methodology section of SSCI/SCI-indexed journals, this study aims to provide a diachronic, cross-disciplinary perspective on how PV usage has evolved in academic writing over the past four decades. It addresses the following research questions (RQs).

RQ 1: What are the trends of PV in high-quality academic Methods/Methodology sections in the natural sciences, the social sciences, and the humanities since 1980?

RQ 2: If there are changes in voice preference, are they statistically significant in the three academic branches?

RQ 3: Which type(s) of passive contribute(s) to PV change in the three academic disciplines?

Methodology

This section includes four parts. The first part is the process of corpus compilation and the second, our coding of passive structures in the corpora. The last two parts describe how we classified the passive structures and the statistical tests we conducted.

The corpora

This part details how we compiled our corpora. The process includes the data source and our selection criteria. We also report the difficulties of data selection and the compromises we made.

Corpus design and text selection criteria

Our corpora were compiled from the Method(s)/Methodology section of research articles selected from international peer-reviewed journals that meet the following requirements.

First, the journals were selected based on impact factor rankings in Web of Science citation reports, prioritizing high-quality publications that make significant contributions to academic discourse. We also considered the diversity of publishers to include a range of writing style guidelines.

Second, journals needed sufficient historical presence to support our diachronic analysis. We sampled research articles at decade intervals from 1980 to 2020 (1980, 1990, 2000, 2010, and 2020). The reason for choosing 1980 as the starting year was to examine the influence of the Plain English Campaign in the late 1970s, which advocated the AV over the passive. As a result, all selected journals were established before 1980 (see Appendix A for the complete journal list).

Third, each research article must contain an independent section for methods or methodology. We focus on the Method(s)/Methodology section for two reasons. First, it receives relatively less attention than sections such as the abstract, introduction, and discussion. Second, it can be seen as ideal for studying passive structures because it is where researchers foreground the instruments, processes, samples, and subjects, while self-evident or less important agents may be obscured or omitted.

Applying the above three criteria, we sampled the research articles starting from the first paper of the first issue of each sampled year. We collected five research articles per journal in each period, resulting in 450 research articles whose combined Method(s)/Methodology sections comprised 528,830 tokens in 25,989 sentences.

During the sampling, we made several compromises to collect the data we needed. First, as many early research articles, especially those published in 1980 and 1990, lack a clear or independent Method(s)/Methodology section, we resorted to the 1981/1991 and 1982/1992 issues when those from 1980/1990 did not provide us with sufficient research articles meeting our criteria. In these early research articles, an independent section titled “The data” was also treated as the Method(s)/Methodology section, since we found that the description of the data accounted for an important part of that section. Second, we selected 7–8 research articles from each history journal to compensate for the number of history research articles, because we succeeded in accessing only two history journals whose research articles describe the methodology, instead of three as in the other disciplines. Third, we included some lower-ranked journals because the top-ranked ones were established too recently to be included in this research.

Corpus compilation

Most articles were downloaded in PDF format and converted to editable Word documents. From the Word documents, we copied the Method(s)/Methodology section to three new documents named “HU” (for the humanities), “NS” (for the natural sciences), and “SS” (for the social sciences) for the five years selected. The Method(s)/Methodology sections of open-access research articles, however, were directly copied from their accessible webpages.

However, not all parts of the Method(s)/Methodology sections were put into the corpora. For some humanities and social science research articles, we deleted whole-paragraph citations and the narration from the subjects because they were not the authors’ own words.

After completing all the work, we converted the Word documents into plain-text files compatible with corpus analysis tools. Following this, AntConc 4.2.0 (Anthony, 2022) was utilized to quantify PV across the corpus. As summarized in Table 1, the final dataset comprised 15 disciplinary-temporal sub-corpora.

Table 1 The volume of each sub-corpus (number of tokens/number of sentences).

Extracting PV structures

Subsequent to corpus compilation, the passives were manually annotated, and then coded mainly through TreeTagger for part-of-speech tagging (Schmid, 1994) and AntConc 4.2.0 for PV extraction (Anthony, 2022). Following standard syntactic definitions (Biber et al., 1999; Quirk et al., 1985), we operationalized PV as be-passive constructions—clauses where the grammatical subject receives the action expressed by a verb phrase combining a form of be (am/are/is/was/were/be/been/being) with a past participle. Two categories were therefore excluded: (1) get-passives (e.g., the sample got tested), which are statistically marginal in academic writing (Biber et al., 1999: 926); (2) passive post-modifiers, as they function adjectivally rather than as core predicates.

Manually identifying all sub-corpora

Through manual annotation, we aimed to identify all possible word combinations that contribute to be-passives. Details are shown in Table 2.

Table 2 Structures of be-passives.

Table 2 shows all the possible structures that formed passives in this research, with examples cited from the corpora. The first one is the basic structure of passives, and all the others are variations where some words are inserted between the copular be and the participle -ed. With these structures, we hope to exhaust the passives needed for the research.

Extracting passives with corpus tools

Extracting the passives depends on the tagging of parts of speech (POS) of all the words in the corpora. TreeTagger accomplished this task automatically. After all the POS were tagged, we used Regex (regular expressions) to extract the passives.

According to the structures listed in Table 2, we used the following regular expressions in AntConc (Table 3).

Table 3 The regular expressions for passives in this research.

Table 3 shows all the regular expressions used to extract the passive structures. Each regular expression in the first column was combined with each in the second and with the one in the last. In this way, we extracted different passive structures. For instance, the combination [A-z]+_VBD [A-z]+_RB [A-z]+_VVN yielded the following results (see Fig. 1).

Fig. 1 A screenshot of a few search results.

However, the above regular expressions also produced some unwanted results. When [A-z]+_DT was inserted in between, we could get “As this is an ordered outcome,…” (SS-2010). Although the word ordered was tagged as a past participle, it was used as a pre-modifier and was, therefore, deleted. When [A-z]+_VBG in the first column was used, we got some overlapping results, which we also deleted.
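The search logic described above can be sketched in Python for readers who wish to reproduce it outside AntConc. This is a minimal illustration under stated assumptions, not the expressions of Table 3: the tagged sentence is written by hand in a word_TAG format, the pattern allows only adverbs between be and the participle, and the names BE_PASSIVE and find_be_passives are hypothetical.

```python
import re

# Hand-tagged illustration in "word_TAG" form (an assumption: not real TreeTagger output).
tagged = ("The_DT data_NNS were_VBD then_RB analyzed_VVN and_CC "
          "results_NNS are_VBP reported_VVN ._SENT")

# A form of be (any VB* tag), up to two optional adverbs, then a past participle.
# [A-Za-z]+ is used here because, in Python, the range A-z also matches a few
# punctuation characters that sit between "Z" and "a", such as "_" and "[".
BE_PASSIVE = re.compile(
    r"\b[A-Za-z]+_VB[DGNPZ]?(?: [A-Za-z]+_RB){0,2} [A-Za-z]+_VVN\b"
)

def find_be_passives(tagged_text):
    """Return each candidate be-passive as plain words, with the tags stripped."""
    hits = []
    for match in BE_PASSIVE.finditer(tagged_text):
        words = [token.rsplit("_", 1)[0] for token in match.group(0).split(" ")]
        hits.append(" ".join(words))
    return hits

print(find_be_passives(tagged))
```

Because this sketch does not allow a determiner between be and the participle, a pre-modifier case such as “is an ordered outcome” is never returned; a broader pattern would need the manual clean-up described above.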

Manual validation

While corpus tools provided precise extraction of passive constructions, manual validation was necessary to exclude ambiguous cases, particularly formulaic prepositional passives (Schwarz, 2019). To ensure objectivity, both authors independently reviewed all identified instances, achieving near-perfect inter-rater reliability (Cohen’s κ = 0.94). Discrepancies in classification (e.g., whether be composed of denotes an action or a state) were resolved through iterative discussions, supplemented by consultations with dictionaries and native English speakers.

This rigorous process led to the exclusion of seven high-frequency prepositional passives: be based on, be concerned with, be related to, be composed of, be correlated with, be associated with, and be made up of. Two criteria justified their removal:

  1. Functional mismatch: These constructions predominantly describe inherent states or material relationships (e.g., the sample was composed of) rather than deliberate methodological actions by authors (e.g., the data were analyzed).

  2. Formulaic bias: Their conventionalized usage in academic writing (e.g., the framework is based on) risks artificially inflating passive voice frequencies, thereby distorting diachronic comparisons between PV trends.
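The exclusion step can be illustrated with a short Python filter. It is only a sketch: the study removed these items through manual validation, whereas the function below (keep_hit, a hypothetical name) applies a simple pattern to a concordance line and allows at most one -ly adverb between be and the participle, which is an assumption of this example.

```python
import re

# Forms of be, as listed in the operational definition of the be-passive.
BE = r"(?:am|are|is|was|were|be|been|being)"

# The seven excluded prepositional passives.
EXCLUDED_RE = re.compile(
    rf"\b{BE}\s+(?:\w+ly\s+)?"
    r"(?:based on|concerned with|related to|composed of|"
    r"correlated with|associated with|made up of)\b",
    re.IGNORECASE,
)

def keep_hit(concordance_line):
    """Return True if the line does not contain one of the excluded passives."""
    return EXCLUDED_RE.search(concordance_line) is None

lines = [
    "The data were analyzed with SPSS",
    "The sample was composed of 40 students",
    "The framework is largely based on prior work",
]
print([line for line in lines if keep_hit(line)])
```

A pattern of this kind is best used to pre-sort candidates for human review, since it cannot judge whether a given instance denotes an action or a state.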

Classifying passive verbs

The above steps ensured that we obtained the intended PV data, which we then categorized into five types—procedural, reporting, relational, evaluative, and causative verbs—based primarily on Systemic Functional Grammar (SFG) (Halliday, 1994) and on SFG-informed local grammar (Hunston & Sinclair, 2000). Procedural verbs, aligning with material processes (Halliday & Matthiessen, 2014), describe processes, methods, or actions taken to carry out the research (e.g. select, collect, transcribe). Reporting verbs reference or describe previous studies or findings (Hyland, 1999) (e.g. report, indicate). Relational verbs describe relationships, comparisons, or connections between research elements (Halliday & Matthiessen, 2014) (e.g. compare, involve, represent, match). Evaluative verbs express evaluation, judgment, or justification (Hunston & Sinclair, 2000) (e.g. assess, evaluate, regard). Causative verbs express causation, enablement, or influence (Halliday & Matthiessen, 2014) (e.g. allow, require, affect).

To ensure the robustness of our categorization, we, two linguistically trained coders, independently annotated all instances, achieving a Cohen’s κ of 0.84—a reliability coefficient indicating excellent agreement beyond chance (McHugh, 2012). Discrepancies, primarily arising from ambiguous functional overlaps (e.g., causative-evaluative or procedural-causative boundaries), were resolved through iterative consensus discussions anchored in SFG principles, followed by third-party adjudication for persistent edge cases.
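For readers unfamiliar with the agreement statistic reported here, Cohen’s κ compares the observed proportion of agreement with the agreement expected by chance from each coder’s label distribution. The sketch below uses only the Python standard library; the two short label lists are invented for illustration and are not the study’s annotations.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of four passive verbs by two coders.
coder_1 = ["procedural", "procedural", "reporting", "reporting"]
coder_2 = ["procedural", "reporting", "reporting", "reporting"]
print(cohens_kappa(coder_1, coder_2))
```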

Statistical tests

After data extraction, we normalized the frequency of PV constructions as the number of occurrences per 1000 sentences (PV frequency = [PV count / total sentences] × 1000). These frequencies were computed separately for each period (1980, 1990, 2000, 2010, 2020), academic branch (the natural sciences, the social sciences, the humanities), and passive type (procedural, reporting, relational, evaluative, and causative). The normalized data were subsequently used for Spearman correlation tests and multivariate regression analysis. All statistical analyses were conducted using IBM SPSS Statistics (Version 26). Significance levels were set at p < 0.05 for all tests. Prior to analysis, we verified that our data met the necessary assumptions for each statistical test, including independence of observations and absence of multicollinearity for the regression analysis.
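The normalization formula can be expressed as a small Python function. The counts in the example are invented placeholders rather than values from the corpora, and the function name is hypothetical.

```python
def normalized_pv_frequency(pv_count, total_sentences, per=1000):
    """PV occurrences per `per` sentences: (PV count / total sentences) x 1000."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    return pv_count * per / total_sentences

# Hypothetical (PV count, sentence count) pairs keyed by (branch, year).
counts = {("SS", 1980): (250, 500), ("SS", 2020): (150, 500)}
frequencies = {
    key: normalized_pv_frequency(pv, sentences)
    for key, (pv, sentences) in counts.items()
}
print(frequencies)
```

Normalizing by sentences rather than by tokens means that a value of 500 corresponds to one be-passive for every two sentences on average.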

Spearman correlation tests

To quantify trajectories of PV usage over time, we employed Spearman’s rank correlation coefficient, a non-parametric test robust to non-normality and outliers (Schober et al., 2018). This approach aligns with the ordinal nature of time periods and accommodates potential non-linear trends in linguistic change (Gries, 2021). For each academic branch, we correlated passive structure frequency (normalized per 1000 sentences) with time to identify branch-specific trends.
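As an illustration of the test, Spearman’s coefficient is the Pearson correlation of the two rank vectors, with tied values sharing the mean of their ranks. The standard-library sketch below computes the coefficient only; it does not produce the p-values reported by SPSS, and the frequencies are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

years = [1980, 1990, 2000, 2010, 2020]
hypothetical_freqs = [600.0, 550.0, 560.0, 500.0, 450.0]  # invented, not from Table 4
print(spearman(years, hypothetical_freqs))
```

With only five sampling points per branch, the coefficient can take few distinct values, which is worth bearing in mind when comparing branches.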

Pearson correlation tests and multivariate regression analysis

To identify which particular type(s) of passives significantly contributed to these diachronic trends, we adopted a staged analytical approach: 1) Pearson correlation analysis to examine bivariate relationships among the different passive structures, and 2) multivariate regression modelling to isolate the independent effects of each passive type while controlling for inter-correlations. This approach aligns with established practices in diachronic corpus linguistics (Hilpert & Gries, 2016).

In this model, time (represented by the five sampling points: 1980, 1990, 2000, 2010, and 2020) served as the independent variable, while the normalized frequencies of the five passive types were the dependent variables. By implementing the multivariate framework—a recommended procedure in quantitative linguistics for controlling inter-variable interactions (Levshina, 2015)—we obtained regression coefficients that quantified the rate and direction of change for each passive type. These metrics thereby enabled us to determine which types were most responsive to changing writing conventions across the decades.
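To make the notion of a regression coefficient as a rate of change concrete, the following sketch fits an ordinary least-squares line to one passive type at a time, with year as the predictor. It is a deliberate simplification of the multivariate procedure run in SPSS, the frequencies are invented, and fit_line is a hypothetical helper.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)  # share of variance in y explained by x
    return slope, intercept, r2

years = [1980, 1990, 2000, 2010, 2020]
# Hypothetical normalized frequencies for one passive type (not from the study).
procedural = [100.0, 90.0, 85.0, 70.0, 60.0]

slope, intercept, r2 = fit_line(years, procedural)
print(f"change per year: {slope:.3f}, R2: {r2:.3f}")
```

A negative slope indicates a decline per year for that passive type, and comparing slopes across the five types gives a rough sense of which types change fastest.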

Results

The diachronic development of passives

The normalized frequencies of the passives and the results of Spearman correlation tests are presented in Table 4.

Table 4 Normalized frequencies and the Spearman test results.

Table 4 reveals an overall decline in the normalized frequencies of PV constructions across three major academic branches over a forty-year period from 1980 to 2020. The table also includes Spearman correlation coefficients that quantify the strength and direction of these temporal trends.

Specifically, in the humanities, PV frequency exhibited a considerable downward trajectory, decreasing from 468.52 in 1980 to 363.69 in 2010, followed by a modest recovery to 383.86 in 2020. The natural sciences demonstrated the highest initial frequency of PV usage (666.67 in 1980), aligning with the traditional scientific writing convention of emphasizing objectivity through passive constructions (Cooray, 1967). While this frequency tumbled to 503.08 by 2000, it subsequently stabilized and showed slight increases in 2010 (523.23) and 2020 (530.26). The social sciences exhibited the most pronounced and statistically robust decline in PV usage over the studied period. The frequency decreased steadily and substantially all the way from 568.90 in 1980 to 347.45 in 2020.

These declining trends were further substantiated by Spearman’s correlation tests. For the social sciences, the test indicated an exceptionally strong negative correlation (rs = −0.975) between Year and Frequency. This relationship was statistically significant at the 0.05 level (p = 0.005), and the coefficient’s proximity to −1.0 suggested a highly consistent decrease in PV frequency as the years progressed. For the humanities, the analysis revealed a strong negative correlation (rs = −0.872) between Year and Frequency. This relationship approached but did not reach conventional statistical significance (p = 0.054), falling just short of the 0.05 threshold. The coefficient suggested a substantial inverse relationship, with frequency tending to decrease as the years advanced, though with slightly less consistency than in the social sciences. The result for the natural sciences displayed a moderate negative correlation (rs = −0.462) between Year and Frequency. This relationship was not statistically significant (p = 0.434), indicating that the observed negative association could be attributed to chance. The coefficient suggested a weaker temporal pattern than in the humanities and the social sciences, that is, a less consistent relationship between year and frequency in the natural sciences.

The contributing passive categories to the diachronic trends

Humanities

Table 5 shows the Pearson correlation between each type of passive and the total for the humanities.

Table 5 The Pearson correlation coefficients in the humanities.

Notably, there was an exceptionally strong correlation between causative passives and the total (r = 0.993, p < 0.001), indicating that causative passives closely tracked overall PV frequency in the humanities. Similarly, reporting passives (r = 0.918, p = 0.014) and relational passives (r = 0.905, p = 0.017) demonstrated robust positive correlations with statistical significance. While evaluative passives showed a moderate positive correlation (r = 0.724), this relationship did not reach statistical significance (p = 0.083). Procedural passives exhibited the weakest correlation (r = 0.052, p = 0.467), suggesting that these structures varied independently of overall PV frequency in the humanities. A further multivariate regression analysis (Table 6) helped determine which type(s) of passives significantly contributed to the total.

Table 6 Summary of multivariate regression analysis for the humanities.

In Table 6, the regression analysis yielded a single-predictor model with causative passives as the only significant predictor of the overall diachronic trend. The overall model was statistically significant, F(1, 3) = 223.480, p = 0.001, and explained 98.7% of the variance in the total (R2 = 0.987, Adjusted R2 = 0.982). The causative passives strongly predicted the overall trajectory of PVs (β = 0.993, p = 0.001), with each unit increase in causative passives associated with a 7.354-unit increase in the total.

The other passives (procedural, reporting, relational, and evaluative) were excluded from the final model as they did not significantly improve prediction beyond the contribution of the causative passive (all p > 0.05). The regression diagnostics indicated that the model met the assumptions of regression analysis. The Durbin-Watson statistic (1.322) suggested an acceptable level of independence of residuals. Examination of residual statistics showed normally distributed residuals with standardized values ranging from −1.483 to 0.756, supporting the appropriateness of the model.

In summary, the regression analysis identified the causative passive as the primary predictor of PVs in the humanities, accounting for nearly all of the explainable variance. Although other passives showed strong bivariate correlations with the total, they did not contribute unique predictive value beyond that provided by the causative passives.

Natural sciences

Table 7 shows the Pearson correlation between each type of passive and the total for the natural sciences.

Table 7 The Pearson correlation coefficients in the natural sciences.

According to the statistics in Table 7, relational passives showed the strongest positive correlation with the total (r = 0.954, p = 0.006), followed closely by reporting passives (r = 0.897, p = 0.020) and procedural passives (r = 0.875, p = 0.026), all statistically significant. Interestingly, evaluative passives showed only a moderate positive correlation (r = 0.449) without statistical significance (p = 0.224). Most notably, causative passives displayed a moderate negative correlation (r = −0.483, p = 0.205), suggesting that as overall PV frequency increased in the natural sciences texts, the use of causative passives tended to decrease, though this trend was not statistically significant. A further multivariate regression analysis (Table 8) helped determine which type(s) of passives significantly contributed to the total.

Table 8 Summary of multivariate regression analysis for the natural sciences.

The results of the regression analysis confirmed the contributing role of relational passives. Table 8 shows a one-predictor model with relational passives as the only significant predictor of the overall trend, F(1, 3) = 30.525, p = 0.012. The model explained 91.1% of the variance (R2 = 0.911, Adjusted R2 = 0.881), demonstrating strong predictive power. The regression equation (Total = −30.599 + 8.892 (Relational)) indicated that each unit increase in relational passives was associated with an 8.892-unit increase in total, with the relational passives showing a strong positive effect (β = 0.954, t = 5.525, p = 0.012). The Durbin-Watson value of 2.651 and examination of residuals confirmed that the regression assumptions were adequately met. Other passives (procedural, reporting, evaluative, and causative) were excluded from the final model as they did not contribute significant additional predictive value beyond the relational passive, despite some showing significant bivariate correlations with the total.

In summary, the regression analysis demonstrated that the relational passives were the primary predictor of total in the natural sciences, accounting for a predominant proportion (91.1%) of the variance. Although other passives (particularly procedural and reporting) showed strong bivariate correlations with the total, they did not contribute unique predictive value beyond that provided by the relational passives, likely due to shared variance among these predictors.

Social sciences

Table 9 shows the Pearson correlation between each type of passive and the total for the social sciences.

Table 9 The Pearson correlation coefficients in the social sciences.

In Table 9, the total was most strongly correlated with the procedural passive (r = 0.989, p < 0.01), followed by the reporting passive (r = 0.946, p < 0.01) and the relational passive (r = 0.845, p < 0.05). The correlation with the evaluative passive approached significance (r = 0.795, p = 0.054), while the causative passive showed a non-significant relationship with the total (r = 0.317, p = 0.302). A further multivariate regression analysis (Table 10) helped determine which type(s) of passives significantly contributed to the total.

Table 10 Summary of multivariate regression analysis for the social sciences.

In Table 10, the results of the regression analysis revealed two significant models. Model 1, with procedural passives as the sole predictor, demonstrated strong predictive power (R = 0.989, R2 = 0.979, Adjusted R2 = 0.971) and statistical significance (F(1, 3) = 137.125, p = 0.001). This model revealed that procedural passives were a significant predictor (β = 0.989, t = 11.710, p = 0.001), with the regression equation total = 13.887 + 1.438 (Procedural).

However, Model 2, which added causative passives as a second predictor, achieved a perfect fit (R = 1.000, R2 = 1.000, Adjusted R2 = 1.000) with significantly improved precision (standard error reduced from 17.33 to 1.93). This model was highly significant (F(2, 2) = 5658.006, p < 0.001), with both procedural (β = 0.963, t = 100.898, p < 0.001) and causative (β = 0.148, t = 15.502, p = 0.004) passives making unique, significant contributions. The final regression equation was total = 4.672 + 1.401 (procedural) + 2.093 (causative), with appropriate independence of observations confirmed by the Durbin-Watson statistic (2.082). These findings indicated that while procedural passives accounted for most of the variance in the total PV frequency in the social sciences, causative passives made a smaller but still significant contribution to the predictive model.

Discussions and suggestions

Discussions

PV change and its relevance to manuscript acceptance

The documented decline in PV usage across academic disciplines in recent decades (Mair & Leech, 2006; Li, 2022) aligns in part with this study’s method-section-specific findings, particularly within the social sciences (rs = −0.975) and the humanities (rs = −0.872). These trajectories suggest that since the 1980s, when the Plain English Campaign and broader calls for clarity began to resonate in academia, disciplinary voice preferences have undergone gradual but uneven reorientation. There is, however, a more nuanced trend in the natural sciences, where post-2000 PV stabilization (523.23 → 530.26) mirrors Leong’s (2014) prediction of a 30% passive voice equilibrium in scientific texts. Rather than a uniform shift toward informality (Hyland & Jiang, 2017), these divergent curves point to discipline-specific recalibrations of rhetorical norms and to compliance pressures that took effect at different historical moments.

Disciplinary paradigms have themselves shifted over time. In the humanities and social sciences, this shift has been particularly evident since the 1970s–1980s, when heated methodological debates and the seminal works of Morgan (1983) and Lincoln & Guba (1985) helped to institutionalize issues of trustworthiness, methodological transparency, and pluralism in qualitative research; subsequently, the 1990s witnessed the textbook formalization and widespread adoption of mixed-methods approaches (Creswell, 1998). Against this backdrop, the humanities and social sciences have gradually evolved towards greater authorial visibility and credibility (Hyland, 2001), as their Methods/Methodology sections increasingly require researchers to rigorously justify their choice of research tools, data sources, and analytical frameworks (Baratta, 2009), as well as their motivation and assessment (Seoane & Hundt, 2017). This evolving writing paradigm naturally entails a stronger presence of the “author as agent”, allowing readers to follow the researcher’s line of reasoning clearly and enabling more effective communication. These cumulative changes from the 1990s onward help explain the pronounced PV decline in these fields. The natural sciences, however, have tended to retain the view of the Methods section as a space for procedural description, free of methodological rationale that is already familiar to peers in the profession (Hyland, 2002). This paradigm, resistant to rhetorical personalization, helps explain the stabilization of PV usage even as other fields shifted more decisively toward the AV.

Compliance pressures, whereby manuscripts that deviate significantly from the guidelines risk rejection or revision requests in the peer review process (Feld et al., 2024; Belcher, 2019), have also evolved over time. Whereas earlier style manuals permitted broader use of the PV, more recent editions have tightened their stance in pursuit of greater conciseness, readability, and vigor. For instance, the Publication Manual of the American Psychological Association (APA), whose style is mandated by many humanities journals and by over half of social sciences journals (Santos et al., 2023; Vizváry & Grigas, 2025), did not articulate an explicit preference regarding passive voice in its pre-1980 second edition (APA, 1974), but introduced such guidance in its post-1980 third edition (APA, 1983). The guidance subsequently grew firmer: the sixth edition recommended “using the active rather than the passive voice” (APA, 2009: 34), and the seventh edition made this prescriptive: “Use the active voice as much as possible” (APA, 2020: 42). This strengthening aligns with sharper PV declines in social-science and humanities journals after 1980. In the natural sciences, Nature’s author instructions, for example, began to state explicitly: “Nature journals prefer authors to write in the active voice” (Nature, n.d.). A Nature editorial further acknowledged that scientific language was “becoming more informal and direct” (Nature, 2016), confirming that actual publishing practice was catching up with prescriptive guidance. However, some style guides still accept, prefer (Gastel & Day, 2022), and even conventionalize (Hofmann, 2016) PV usage, particularly in the Methods section. These evolving and somewhat contradictory guides for writing Methods sections may confuse academic writers, especially those lacking access to the latest versions, and may help explain why the natural sciences exhibit greater resistance to change. These shifts in official policies and editorial culture coincide with the observed discipline-specific PV trajectories: earlier and sharper declines in the humanities and social sciences versus later stabilization in the natural sciences.

Taken together, these observations suggest that PV decline reflects not only cross-sectional stylistic differences but also the temporal evolution of paradigmatic orientations and compliance pressures. Importantly, the turning points in prescriptive discourse, namely APA’s strengthened active-voice mandates (APA, 2009, 2020) and Nature’s active-voice preference by the 2000s (Nature, 2016, n.d.), closely align with our observed disciplinary PV trends in the Methods section. Compliance pressures have therefore not been static but have evolved alongside disciplinary paradigms, shaping both acceptance practices and stylistic norms.

Disciplinary divergences for categories of passive verbs

Our findings expose striking disciplinary divergences. In the humanities, the decline in causative passive usage, which accounts for 98.7% of the variance in humanities PV trajectories (β = 0.993, p = 0.001), primarily drives the overall reduction in PV. This trend indicates a strategic shift toward the AV in explaining causality and describing influence or enablement, for instance, from “The… was designed as …” (HU-1990) to “Researchers designed …” (HU-2010). This shift may stem from broader social developments such as the rise of the academic accountability system (Winker et al., 2023) in recent decades, which promotes clearer identification of agents. For example, the active voice (“Funding constraints affected our sampling”) demonstrates causality more clearly than the passive voice (“Our sampling was affected by funding constraints”), as it foregrounds the influencing factor. The shift may also relate to the growing influence of the knowledge economy and personal branding (Kucharska, 2023), where first-person statements like “we conclude” allow researchers to claim ownership and highlight their contributions. Based on this shift, instructors in the humanities (for example, in jurisprudence) may encourage academic writers to trim causative passives by using more first-person pronouns, particularly in such countries as China, where plain English should feature more prominently (Lin et al., 2023). To achieve this, they may provide examples contrasting passive and active causative constructions, and design exercises that help students practice rephrasing causative passive sentences into clearer, more personal-branding-oriented AV alternatives. Such pedagogical efforts could enhance clarity and persuasiveness, addressing the frequent challenge of overly formal or impersonal phrasing.

In the social sciences, the decline in PV is driven by two passive categories: procedural and causative. The factors underlying the decline in causative passives may be consistent with those observed above in the humanities. For procedural passives, a potential factor is the rise of participatory scholarly identity. The use of active constructions with “we” reflects not only this identity shift but also a broader methodological move from positivist to constructivist paradigms, emphasizing researcher-participant and writer-reader interaction (Harwood, 2005) throughout the procedural stages of experimental or empirical studies. The recent rise of the digital scholar (Weller, 2011) or networked participatory (Mu et al., 2018) identity may have further driven the decline of procedural passives. We conjecture that, driven by Open Science and Responsible Research and Innovation (RRI) (Liu et al., 2022), this shift is likely to persist in the foreseeable future. Building on this trend, social sciences instructors may encourage academic writers, especially those from contexts less exposed to plain English, to confidently adopt procedural and causative actives. Instructors can cite authentic published examples such as “First, we…; second, we…; third, we…” and “we clarify…” rather than relying rigidly on locally published textbooks. Such instruction not only enhances rhetorical clarity and coherence but also helps novice writers internalize genre-appropriate authorial positioning in empirical research writing.

In the natural sciences, relational passives emerge as the key predictor of PV trajectories (β = 0.954, p = 0.012), even though procedural passives remain numerically dominant. This statistical primacy helps explain the field’s unique PV pattern: an initial sharp decline followed by post-2000 stabilization. A potential reason lies in the essential role of relational passives in cross-material comparisons, which results in an equilibrium. Notably, the overall insignificance of the PV trend (rs = −0.462, p = 0.434) underscores a critical tension: although relational passives drive directional changes, their partial retention, alongside stable procedural passive usage, prevents a significant overall decline. This may result from the discipline’s effort to balance evolving stylistic preferences against core epistemological needs. It can also arise from a lack of (Gupta et al., 2022) or insufficient (Dean et al., 2015) systematic discipline-specific writing training; as a result, many natural sciences researchers resort to expert modeling (Yang et al., 2019), and some even to “googling+imitation” (Li, 2013), leading to technical parroting. Where conditions permit, therefore, academic writing instructors in the natural sciences may encourage students to critically examine contexts where relational passives serve key epistemological functions, particularly in comparing materials, while favoring the active voice in more agentive, procedural contexts. Such nuanced instruction can not only avoid technical parroting but also foster stylistic awareness without undermining the communicative precision required in scientific reporting.

The discipline-specific decline in PV usage underscores the need for tailored pedagogical strategies that balance evolving stylistic norms with disciplinary epistemological demands.

Recommendations for discipline-specific PV usage in academic writing

To equip academic writers with adaptive skills, we propose three recommendations, each addressing critical gaps in current practices.

Consulting journal-specific or publisher-specific style guides

Despite the general decline in PV, adherence to specific guidelines is critical for academic writers navigating divergent disciplinary norms. For instance, while Bioresource Technology follows Elsevier’s (n.d.) guidelines against PV, BioResources, a journal of similar aims and scope published under the auspices of North Carolina State University, recommends the use of PV (BioResources, n.d.). Interestingly, Biofuels, Bioproducts and Biorefining, another journal in this field, follows the Wiley-Blackwell House Style guide (2007), which allows authors to choose their preferred voice, serving as a compromise in the AV-vs-PV dilemma. These divergent policies reflect a broader negotiation between evolving stylistic trends (e.g., AV for clarity) and entrenched disciplinary conventions (e.g., PV for objectivity). By mandating compliance, journals implicitly train authors to dynamically reconcile these forces, preserving traditional methodological rigor while adopting communicative innovations. Such guidelines have proven effective: studies indicate that explicit AV guidelines reduce PV usage by 8.07% (Leong, 2014), with recent analyses reporting even higher reductions of 11% (Fauziah & Bashtomi, 2024). By aligning syntactic choices with target journal policies, writers ensure compliance with editorial standards, a prerequisite for manuscript acceptance (Fauziah & Bashtomi, 2024).

Prioritizing discipline-sensitive writing education

Given the nuanced cross-disciplinary differences empirically evidenced by Dong et al. (2024), research programs should offer discipline-sensitive writing education instead of generic writing courses. For instance, the social sciences benefit from AV training that accentuates researcher agency, aligning with Hyland’s (2001) authorial visibility framework. Conversely, the natural sciences require modules that preserve relational passives while cautioning some scientists against overusing the PV. This pedagogical differentiation actively balances stylistic evolution with disciplinary integrity: it safeguards field-specific conventions while accommodating the growing preference for authorial visibility through targeted AV adoption.

We further recommend a collaborative teaching model for academic writing courses, where experienced, discipline-savvy writers co-design and co-teach with a linguist familiar with the subject matter. This approach, recently proven to be effective (Yan et al., 2024), enables tactical voice adjustments. For example, a teacher of engineering paper writing may collaborate with an engineering-informed linguist to convert some non-core relational or procedural passives in the Methods section into the active. Such conversion retains the PV for apparatus-centric steps (e.g., “the alloy was cooled”) to uphold methodological rigor, while using the AV for human interventions (e.g., “we compared the specimens”). Such calibrated conversions can improve readability, reduce syntactic complexity, and clarify authorial contributions. Overall, this strategy enables writers to meet editorial standards for clarity without compromising disciplinary conventions.

Using hands-on authentic case materials

When dissecting passive voice trade-offs, instructors should use hands-on authentic case materials, as these can enhance engagement (Toogood, 2023), aid in knowledge building (Kim et al., 2006) on the PV, and foster genre awareness (Hyland, 2019). Critically, these materials serve as concrete negotiation spaces where writers learn to reconcile stylistic evolution (e.g., clarity-driven AV adoption) with discipline-specific rhetorical needs (e.g., PV for procedural focus). For instance, teachers who are also manuscript polishers may, provided that no rights are infringed, use recent bi-version materials to illustrate how strategically converting non-essential procedural passives (e.g. “To investigate the acoustic properties…, …were carried out”) into the active (“To investigate the acoustic properties…, we conducted…”; Liu et al., 2022) helps eliminate dangling modifiers, instead of advocating uniform use of procedural passives. This targeted approach avoids blanket prescriptions, instead training writers to make context-sensitive choices that honor both communicative clarity and methodological conventions. If such materials are unavailable, instructors may resort to student researchers’ draft manuscripts (Miró-Colmenárez et al., 2025), guiding learners to identify where the PV sustains disciplinary integrity and where the AV advances contemporary readability metrics, including voice use as experimentally found by Millar & Budgell (2019).

To sum up, the choice between PV and AV in the Methods/Methodology section of academic writing represents a dynamic negotiation between evolving stylistic norms and entrenched disciplinary conventions. Our three recommendations emphasize that writers must proactively reconcile two competing forces through metalinguistic awareness: the emerging clarity-driven paradigm and established disciplinary methodologies. Such adaptation transcends technical rhetoric, constituting a cognitive process of engaging with disciplinary discourse communities.

Conclusion

In this study, we conducted a diachronic analysis of passive voice (PV) usage in the Method(s)/Methodology sections of high-impact SSCI/SCI journals from the natural sciences, social sciences, and humanities over a 40-year period (1980–2020). Overall, the humanities and social sciences exhibited a pronounced decline in PV usage, reflecting a shift toward a more active, direct style that enhances reader engagement and authorial presence. In contrast, the natural sciences demonstrated a sharp drop in PV frequency from 1980 to 2000, followed by stabilization from 2010 onward, suggesting a discipline-specific balance between objectivity and clarity in methodological descriptions. Multiple regression analyses revealed that in the humanities, causative passives were the primary drivers of this trend, while in the natural sciences, relational passives emerged as key predictors. In the social sciences, a combination of procedural and causative passives explained the majority of the variance. These results provide empirical evidence that academic writing conventions have evolved in a discipline-sensitive manner, highlighting the need for tailored writing strategies that accommodate both stylistic innovation and the functional demands of research reporting.

This study contributes to the field by combining diachronic corpus analysis with a functionally classified model of passive structures. The findings refine our understanding of PV change in academic discourse and offer practical insights for academic writing instruction. By pinpointing which types of passive constructions drive stylistic shifts, the study supports the development of discipline-sensitive writing courses. These contributions can help enhance manuscript clarity and improve pedagogical practice in academic writing.

Despite its contributions, the study has several limitations. First, our exclusion of journals launched after 1980 neglects methodological conventions in emergent fields (e.g. AI, data analytics, synthetic biology, environmental engineering), potentially underrepresenting recent shifts toward agent-oriented descriptions. Future research may (1) sample emergent-field journals to determine whether these disciplines exhibit distinct voice patterns, and (2) perform comparative analyses between legacy and newly launched journals to test whether younger titles show a stronger move toward active constructions.

Second, by focusing solely on high-impact SSCI/SCI journals, the study potentially introduces selection bias by excluding non-core, non-English, and open-access publications. This narrow focus confines our analysis to flagship Anglophone, subscription-based outlets, overlooking regional editorial norms, language-specific voice preferences, and interdisciplinary conventions, which may diverge markedly from those in top-tier journals. Future research may expand the corpus to include journals from non-English-speaking contexts, open-access venues, and lower-impact journals to test the universality of these trends.

Moreover, while the study identifies potential underlying factors for trends in PV usage and verb categories across academic branches, it does not empirically or experimentally examine authors’ voice choices, such as their in-the-moment cognitive deliberations, the trade-off between clarity and authorial presence, and how prior training and peer feedback shape their AV/PV selection. Incorporating methods such as interviews and surveys in future studies could provide deeper insights into the attitudes and decision-making processes that shape AV/PV choices.