Introduction

Journal lists play an important role in journal evaluation (Y. Huang et al., 2021), publication reference (Pölönen et al., 2021; Sasvári and Urbanovics, 2019), research assessment (Piazza et al., 2017), and position promotion (Bales et al., 2019). Current journal lists are often created through expert peer review and bibliometrics (Mikhailenko and Goncharov, 2017; Smith, 2010a, 2010b) (Web of Science and Scopus can also be regarded as a special category of journal lists). However, many journal lists fail to achieve broad consensus (Brezis and Birukou, 2020; Jiang and Liu, 2022b; Stachura et al., 2024). In practical application, some researchers hold complex attitudes towards journal lists (Fassin, 2021; Serenko and Bontis, 2024), questioning their inclusiveness (Li et al., 2019), fairness (Grossmann et al., 2019), interdisciplinarity (Kushkowski and Shrader, 2013), institutional applicability (Beets et al., 2015; Jaafar et al., 2021), and scientific validity (Chavarro et al., 2018; Moosa, 2016). They have also reflected on the side effects (Pons-Novell and Tirado-Fabregat, 2010) and broader impact (Walker et al., 2019) of journal lists. Related studies have shown that broad journal lists can be misleading in scientific assessment (Borrego, 2021; J. Li et al., 2023) and fail to adequately highlight innovative outcomes (Jiang et al., 2025a). Nevertheless, journal lists continue to be created (Wang et al., 2023) and applied (Baskerville, 2008; George, 2019) to meet different needs in the practice of research evaluation.

At present, the majority of journal lists are published by research organizations or professional associations; few are proposed by academic publishers. As a product line and brand extension of the Nature journal series (Khelfaoui and Gingras, 2022), the Nature Index journals list (‘Introducing the index’, 2014) has gained attention and been widely used in global research management and research evaluation (Lin et al., 2015), including but not limited to: national research assessment (Jiao and Yu, 2020; Lin et al., 2015; Silva, 2016), disciplinary evaluation (Lin et al., 2017; Yang et al., 2019), institutional evaluation (Chen, 2018; Y. Liu et al., 2018), journal evaluation (Li and Wang, 2016; Hayatdavoudi et al., 2023), and research model analysis (Cai and Han, 2020; Zhu et al., 2021). At the same time, the evaluation effectiveness (Bornmann and Haunschild, 2017; Haunschild and Bornmann, 2015b), scientific soundness of the standards (Campbell and Grayson, 2015; Haunschild and Bornmann, 2015a), and evaluation orientation (Waltman and Traag, 2017) of the Nature Index have been questioned to some extent.

How, then, can we effectively assess the academic quality of journal lists? Traditional scientific assessment relies heavily on the binary system of bibliometrics and peer review, but both have significant limitations. The validity of bibliometric assessment at the individual level has been questioned by academics (Giovanni, 2017), and some scholars have even adopted the critical stance of ‘bibliometric denialism’. Relying exclusively on peer review faces structural problems such as reviewer subjectivity and resource constraints (Ferguson, 2020). NOR-CAM’s theory of hierarchical assessment offers a new perspective: scientometrics is well suited to analyses at the subject-area (meso) level, a level of need that peer-review mechanisms find difficult to cover. In terms of evaluation design, the selection mechanism of the Nature Index systematically integrates expert review and impact indicators, but lacks an assessment of the innovation level of research papers. Although academic impact and peer recognition can indirectly reflect the value of research, they remain essentially different from innovativeness itself. This study therefore introduces the evaluation perspective of disruptive innovation to better assess the academic quality of journal lists.

Since Wu et al. (2019) published the article ‘Large teams develop and small teams disrupt science and technology’ and proposed the Disruption index, evaluation methods for disruptive innovation based on citation networks have received extensive attention from scholars in various fields. In previous research we proposed a method of calculating the Disruption index based on open citation data (Jiang and Liu, 2024b), and have successively carried out evaluation research at both the literature and journal levels (Jiang and Liu, 2022a; Jiang, 2023b; Jiang et al., 2023; Jiang et al., 2024a; Zixuan et al., 2024c; Jiang et al., 2025a, 2025b; Liu and Jiang, 2025) as well as domain analysis practice (Jiang and Liu, 2023a), so that disruptive innovation evaluation at these levels has gradually matured.

Considering the practical role that journal lists currently play, this study takes the Nature Index as its research object and conducts a scientometric study by linking multiple data sources. The study explores the coverage of research topics in the Nature Index, assesses the innovation level of Nature Index journal articles across topics, and analyzes the research topics not yet covered by Nature Index journal articles, so as to select high-quality journals in those topics. The results can serve as a valuable reference for the Nature Index team, as well as for relevant management departments and research institutes in their academic evaluation work.

Method

Research object

In this study, the 145 journals in the latest Nature Index collection, together with all other journal papers indexed by OpenAlex in the same publication year (2020), were selected for analysis. The specific list can be obtained from https://www.nature.com/nature-index/faq. All citation data are subject to the common restrictions of the OpenAlex and COCI databases, with the citation window ending at the end of 2023. Specifically, a total of 92,708 research articles from Nature Index journals and 1,553,210 research articles from non-Nature Index journals were included. OpenAlex classifies each article under multiple topics but assigns a primary topic based on the confidence level of the classification; the disciplinary classification of research articles in this paper is based on this primary topic only.

Data sources

The data to be acquired for this study include journal literature information and citation relationship information. In the actual research process, they were obtained through OpenAlex and COCI, respectively.

OpenAlex is an open scholarly infrastructure built by OurResearch. It provides the academic community with a free and open alternative to Scopus and Web of Science (Scheidsteger and Haunschild, 2023), offers API-based data access (Harder, 2024), and has a mature utility toolkit, openalexR (Aria et al., 2024). OpenAlex is now widely used in the field of scientometrics (Foderaro and Gunnarsson Lorentzen, 2024; Okamura, 2023; Ortega and Delgado-Quirós, 2024; Sued, 2024; Tu Le et al., 2024; Xu et al., 2024; Yan et al., 2024). Compared with similar paid services, OpenAlex has significant advantages in inclusiveness, affordability, and usability.
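For illustration, the following is a minimal sketch (not the exact script used in this study) of how metadata for journal articles published in 2020 could be retrieved through the OpenAlex API using the pyalex client mentioned in the data processing steps below; the contact e-mail address is a hypothetical placeholder.

```python
# Minimal sketch: fetch 2020 research-article metadata from OpenAlex via pyalex.
from pyalex import Works, config

config.email = "you@example.org"  # polite-pool identification (placeholder address)

records = []
pager = Works().filter(publication_year=2020, type="article") \
               .paginate(per_page=200, n_max=1000)  # n_max kept small for this sketch
for page in pager:
    for work in page:
        records.append({
            "openalex_id": work["id"],
            "doi": work.get("doi"),  # needed later to map DOIs to OMIDs
            "primary_topic": (work.get("primary_topic") or {}).get("display_name"),
        })
print(len(records), "works retrieved")
```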

COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations (Heibi et al., 2019), is the largest open citation dataset released to date and has since been integrated into the single unified OpenCitations Index. It contains over 116 million bibliographic resources and over 2,013 million citation links (as of July 2024), and aims to provide a disruptive alternative to traditional proprietary citation indexes (Peroni and Shotton, 2020). As a public digital infrastructure, OpenCitations has become one of the primary sources of scholarly data for publishers, authors, librarians, funders, and researchers (Hendricks et al., 2020). Its metadata is expanding at an average rate of 11% per year, the functionality and support of its APIs have been enhanced and expanded (Lammey, 2016), and it is likewise widely used in real-world research (Bologna et al., 2022; Borrego et al., 2023; Heibi and Peroni, 2022; Jiang and Liu, 2023a; Spinaci et al., 2022; Y. Zhu et al., 2020). The latest version of the dataset maps PIDs such as DOIs and PMIDs onto the OpenCitations Meta Identifier (OMID), which enhances the inclusiveness of different citation data sources (Massari et al., 2024). This also lays a solid foundation for linking the OpenAlex and COCI data sources in this study.
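For small-scale exploration (as opposed to the bulk figshare dumps used in this study), citation links can also be pulled per DOI from the OpenCitations REST API. A minimal sketch follows, assuming the classic COCI v1 endpoint is still available; the DOI is an arbitrary example, and current routes should be checked against the OpenCitations documentation.

```python
# Minimal sketch: fetch the citations of one DOI from the OpenCitations COCI
# REST API (v1 endpoint assumed; see opencitations.net for current routes).
import requests

doi = "10.1038/s41586-020-2649-2"  # example DOI, chosen only for illustration
url = f"https://opencitations.net/index/coci/api/v1/citations/{doi}"
resp = requests.get(url, timeout=60)
resp.raise_for_status()
for link in resp.json()[:5]:       # each record is one citing -> cited pair
    print(link["citing"], "->", link["cited"], "| created:", link["creation"])
```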

Despite known problems with OpenAlex’s metadata and COCI’s citation links at the institutional (L. Zhang et al., 2024), linguistic (Céspedes et al., 2025) and coverage (Martín-Martín et al., 2021; Ortega and Delgado-Quirós, 2024) levels, they remain the best available options for scientometrics researchers conducting large-scale studies once accessibility, availability, and data quality are weighed together.

Data processing

Since this study connects multiple data sources to conduct a relatively large-scale scientometric study, this section details the specific data processing and usage workflow (as shown in Fig. 1):

(1) Based on the API provided by OpenAlex and the pyalex library on GitHub (https://github.com/J535D165/pyalex), we used Python scripts to obtain metadata from OpenAlex for all research papers published in journals in 2020.

(2) Obtain the latest list of Nature Index journals from the official Nature Index website.

(3) Download all the dump data provided by OpenCitations (the PID-OMID relationship dataset and the OMID-OMID citation relationship dataset) from figshare.

(4) Import these three datasets into a local SQLite3 database to complete the data acquisition work.

(5) After the data import, based on the practical needs of this study, slice the OMID-OMID citation relationship dataset and add appropriate indexes to all the datasets to improve the efficiency of data use.

(6) Based on the PID-OMID relationship table in the local database, extract the corresponding OMIDs for the papers obtained from OpenAlex (papers for which no corresponding OMID could be obtained were excluded from the subsequent study). A hedged code sketch of steps (4)–(6) is given after Fig. 1 below.

(7) Based on the extracted OMID data table, calculate the impact and the disruptive innovation level of the selected research papers, respectively. The method for calculating the disruptive innovation level of papers from the COCI dataset is described in ‘A new method of calculating the disruption index based on open citation data’, published in the Journal of Information Science (Jiang and Liu, 2024b).

(8) Based on the obtained calculations and the Nature Index journals list, carry out a multi-level data analysis.

Fig. 1 Detailed data processing and usage flow in this study.
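To make steps (4)–(6) concrete, the following is a minimal sketch rather than our production pipeline: it assumes the OpenCitations dumps have been unpacked into CSV files with id/omid and citing/cited columns, and all table, column, and file names are our own illustrative choices.

```python
# Sketch of steps (4)-(6); table/column/file names are illustrative, and the
# CSV layouts are assumptions about the unpacked OpenCitations dumps.
import csv
import sqlite3

con = sqlite3.connect("citations.db")
cur = con.cursor()

# (4) Import the PID-OMID mapping and the OMID-OMID citation edges.
cur.execute("CREATE TABLE IF NOT EXISTS pid_omid (pid TEXT, omid TEXT)")
cur.execute("CREATE TABLE IF NOT EXISTS cites (citing TEXT, cited TEXT)")
with open("pid_omid.csv", newline="") as f:
    cur.executemany("INSERT INTO pid_omid VALUES (?, ?)",
                    ((r["id"], r["omid"]) for r in csv.DictReader(f)))
with open("omid_citations.csv", newline="") as f:
    cur.executemany("INSERT INTO cites VALUES (?, ?)",
                    ((r["citing"], r["cited"]) for r in csv.DictReader(f)))

# (5) Index both directions of the citation table so that forward and backward
# lookups during the disruption-index calculation stay fast.
cur.execute("CREATE INDEX IF NOT EXISTS idx_pid ON pid_omid(pid)")
cur.execute("CREATE INDEX IF NOT EXISTS idx_citing ON cites(citing)")
cur.execute("CREATE INDEX IF NOT EXISTS idx_cited ON cites(cited)")
con.commit()

# (6) Map a DOI harvested from OpenAlex to its OMID; papers without a match
# are dropped from the subsequent analysis.
def doi_to_omid(doi: str):
    row = cur.execute("SELECT omid FROM pid_omid WHERE pid = ?",
                      (f"doi:{doi}",)).fetchone()  # 'doi:' prefix is an assumption
    return row[0] if row else None
```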

Evaluation indicators

The evaluation indicators involved in this study fall into two main categories, disruptive innovation indicators and impact indicators, which are used together to assess the academic quality of journal lists. We believe that if the research papers covered by a journal list rank better than those not covered in terms of both academic impact and level of disruptive innovation, this will, to a certain extent, reflect the academic quality of the journal list.

In the actual study, the disruptive innovation indicator is the absolute disruption index Dz (Eq. 1) and the impact indicator is the cumulative citation count of a paper. Both indicators are used to calculate the average/median impact ranking and disruptive innovation ranking within a topic. Because the disruption index is computationally demanding on large datasets (Song et al., 2022) and we do not have access to massive commercial citation data resources, we used open citation data and struck a balance between the time window and the computational workload (uniformly choosing the 2020–2023 window). Since the focal papers in this study are all from the same publication year, there is no problem of large differences in citation time windows between papers.

This study uses the absolute disruption index Dz rather than the original D index because related studies have shown that the original D index has limited effectiveness in evaluating disruptive innovation (Bornmann et al., 2020), whereas the absolute disruption index outperforms it (Jiang and Liu, 2024) and resolves its inconsistency problem (X. Liu et al., 2020). Although the optimal variant of the D index is still being explored by the scientometrics community, Dz is currently the better choice.

$${D}_{Z}=\frac{2{{N}_{F}}^{2}}{2{N}_{F}+2{N}_{B}+{N}_{R}}$$
(1)

In Eq. 1, NF refers to papers that cite only the focal paper; NB refers to papers that cite both the focal paper and its references; NR refers to papers that cite only the references of the focal paper without citing the focal paper itself.
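As a worked sketch of Eq. 1 (with our own variable names and a toy set of citation edges rather than the real COCI data), the following computes NF, NB, NR and Dz for a focal paper:

```python
# Toy illustration of Eq. 1; identifiers and edges are invented for the example.
def disruption_dz(focal, references, citations):
    """citations: set of (citing, cited) edges restricted to the time window."""
    refs = set(references)
    cites_focal = {c for (c, d) in citations if d == focal}   # papers citing the focal paper
    cites_refs = {c for (c, d) in citations                   # papers citing its references
                  if d in refs and c != focal}                # (excluding the focal paper itself)
    n_f = len(cites_focal - cites_refs)   # cite only the focal paper
    n_b = len(cites_focal & cites_refs)   # cite both the focal paper and its references
    n_r = len(cites_refs - cites_focal)   # cite only the references
    denom = 2 * n_f + 2 * n_b + n_r
    return 0.0 if denom == 0 else 2 * n_f**2 / denom

edges = {("p1", "F"), ("p2", "F"), ("p2", "r1"), ("p3", "r1"), ("F", "r1"), ("F", "r2")}
print(disruption_dz("F", ["r1", "r2"], edges))  # NF=1, NB=1, NR=1 -> Dz = 2/5 = 0.4
```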

Results

Assessment of topic coverage in Nature Index journals

In order to better assess the topic coverage of Nature Index journals, this study comparatively analyzed the extent to which Nature Index journals cover research topics at different levels; the results are shown in Figs. 2 and 3.

Fig. 2 Extent to which Nature Index journals cover different research topics in different domains.

Fig. 3 Extent to which Nature Index journals cover research topics in different disciplinary categories.

From Fig. 2, we can see that Nature Index journals cover 2217 (73.19%) of the 3029 research topics in the scientific fields defined by OpenAlex. In health sciences, physical sciences, and life sciences, Nature Index journals cover 91.58%, 60.21%, and 81.11% of the research topics, respectively.

From Fig. 3, we can see that the coverage of Nature Index journals varies across the 20 specific disciplinary categories. Coverage is greatest in chemical engineering, reaching 100%, and lowest in computer science, at only 33.11%.

Assessment of academic quality of Nature Index journals in different topics

In order to better assess the academic quality of the Nature Index journals list, this study comparatively analyzed the academic impact and disruptive innovation levels of papers published in Nature Index journals at the level of research topics. Considering that topics with only a small number of published papers may distort the results of the analysis (Li, 2025), this section only analyses topics with at least 10 published research papers (>1% of the Nature Index papers included in this research). In addition, because most low-quality papers have a disruptive innovation level of zero, raw means and medians do not compare Nature Index and non-Nature Index papers well; we therefore used the mean and median rankings of the absolute disruption index and the cumulative citation count of research papers to assess the academic quality of Nature Index journals in different topics. The results are shown in Figs. 4 and 5 and Tables 1–4.
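As an illustration of how such per-topic rankings could be computed, the following is a sketch with toy data and our own column names, not the exact analysis code used in this study:

```python
# Sketch of the per-topic ranking comparison; 'df' columns and rows are illustrative.
import pandas as pd

df = pd.DataFrame({  # toy rows: one per paper
    "topic":           ["T1"] * 4 + ["T2"] * 4,
    "in_nature_index": [True, True, False, False] * 2,
    "citations":       [30, 12, 8, 5, 9, 4, 15, 2],
    "dz":              [0.8, 0.1, 0.0, 0.2, 0.5, 0.0, 0.3, 0.0],
})

# Rank papers within each topic (rank 1 = best) for each indicator.
for col in ["citations", "dz"]:
    df[f"{col}_rank"] = df.groupby("topic")[col].rank(ascending=False, method="min")

# Compare mean and median ranks of Nature Index vs. other papers per topic.
summary = (df.groupby(["topic", "in_nature_index"])[["citations_rank", "dz_rank"]]
             .agg(["mean", "median"]))
print(summary)
```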

Fig. 4 Average rankings of academic impact and disruptive innovation for Nature Index journal papers in different research topics.

Fig. 5 Median rankings of academic impact and disruptive innovation for Nature Index journal papers in different research topics.

Table 1 Average rankings of academic impact and disruptive innovation of Nature Index journal papers in different research topics.
Table 2 Median rankings of academic impact and disruptive innovation of Nature Index journal papers in different research topics.
Table 3 Number and percentage of dominant topics of Nature Index journals in different disciplinary categories (based on average rankings).
Table 4 Number and percentage of dominant topics of Nature Index journals in different disciplinary categories (based on median rankings).

From Table 1, we can find that: (1) in terms of average impact level, Nature Index journal papers rank higher than non-Nature Index journal papers in 1124 (99.73%) research topics; (2) in terms of average disruptive innovation level, Nature Index journal papers rank higher in 1010 (89.62%) research topics; (3) considering average impact and average disruptive innovation level together, Nature Index journal papers rank higher in 1009 (89.53%) research topics and lower in 2 (0.18%).

From Table 2, we can find that: (1) in terms of median impact level, Nature Index journal papers rank higher than non-Nature Index journal papers in 1124 (99.73%) research topics; (2) in terms of median disruptive innovation level, Nature Index journal papers rank higher in 1038 (92.10%) research topics; (3) considering median impact and median disruptive innovation level together, Nature Index journal papers rank higher in 1038 (92.10%) research topics and lower in 3 (0.27%).

From Tables 3 and 4, we can find that Dentistry and Veterinary Science are not main research disciplines of current Nature Index journals; Nature Index journals exhibit academic quality strengths in all 18 of the other research disciplines. The average proportion of dominant research topics for Nature Index journals across disciplines is around 90%, although their advantage in academic impact is greater than their advantage in disruptive innovation.

An assessment of high-quality journals in areas not covered by Nature Index journals

In order to provide references and suggestions to relevant research institutions and the Nature Index management team, this study conducted an in-depth analysis of the 812 research topics not covered by the Nature Index, exploring the average academic impact and disruptive innovation level of the journals active in these topics. On this basis, the Web of Science collection and the latest JCI quartiles were used as references for the compliance and long-term standing of journals, yielding the selection results shown in Table 5.

Table 5 Selection of journals.

Discussion

The Nature Index journals list is quite complete, but there is still room for adjustment

From the above results, we can see that although the Nature Index covers most research topics, coverage varies considerably across specific disciplinary categories. In some disciplines, Nature Index journals cover only 50–60% of the research topics. Therefore, for research institutions and researchers focusing on these topics, using the Nature Index for S&T evaluation may not be sufficiently comprehensive.

Nature Index journals have a high academic level in most covered research topics, but focus more on academic impact

Nature Index journals show a high academic level in most of the research topics covered, both in terms of academic impact and academic innovation. On closer inspection, however, Nature Index journals better represent high-impact research. In a previous study of mega-journals, we found that, owing to country bias in the peer review process (Thelwall et al., 2021), specific journals have developed a distinct preference for the country of origin of their authors (Zhu, 2021). Moreover, given the pre-eminence of developed countries in science and technology, these countries have a pool of senior scholars in various fields, and the results of senior scholars are often recognized beyond their intrinsic quality in contemporary peer review and similar processes (Kardam et al., 2016). Similarly, in a study of virology papers we found that expert peer review was more likely to identify high-impact papers than highly innovative ones (Jiang and Liu, 2023c).

Increasing the number of representative journals in uncovered research areas will enhance the application value of Nature Index in academic evaluation

In this study, two journals were selected as additions to the latest Nature Index journals list, with the level of disruptive innovation as the core indicator and with reference to the academic influence and long-term development of journals in the relevant research fields. Adding these two journals may enhance the rationality and comprehensiveness of the Nature Index list to a certain extent.

Conclusion

Based on the OpenAlex and COCI databases, this study evaluates the coverage of research topics in the Nature Index journals list and the level of disruptive innovation of Nature Index journal research articles within the covered topics. In addition, the study analyzed the research topics not covered by Nature Index journals and mined representative journals, providing a reference for the Nature Index team and other related institutions in optimizing journal lists and subsequent assessment models. The results show that: (1) the Nature Index journals list needs further improvement and adjustment; (2) Nature Index journals have a high academic level in most of the covered research topics, but focus more on academic impact; (3) increasing the number of representative journals in uncovered research areas would enhance the application value of the Nature Index in academic evaluation.

Considering that this study involves processing massive amounts of data, open bibliographic and open citation databases were chosen for its practical implementation. Although the methodological validity of disruptive innovation evaluation based on open citation data has been verified in previous research (Jiang and Liu, 2024), the quality of open bibliographic databases still needs improvement: addressing the information loss that arises when integrating different sources (Cioffi et al., 2022; Delgado-Quirós and Ortega, 2024), promoting data connectivity across dimensions (Jiao et al., 2023), and opening data at the national level (Moretti et al., 2024) would all improve the robustness of research evaluation based on open citation data (C.-K. (Karl) Huang et al., 2020). In fact, these problems also exist to some extent in commercial citation databases such as Web of Science and Scopus (Chinchilla-Rodríguez et al., 2024; Kramer and de Jonge, 2022; Samanta and Rath, 2023), and further improving data quality will require the joint efforts of bibliometricians and the providers of the relevant data sources.

In addition, since the Nature Index by design does not cover the humanities and social sciences, we did not include the humanities and social sciences fields of OpenAlex in our study. The findings of this study are therefore not directly applicable to those fields.