replying to Y. Wang et al. Communications Biology https://doi.org/10.1038/s42003-025-09324-w (2025)

Rhaphidophoridae, commonly known as cave crickets, is a wingless and ancient lineage of Tettigoniidea. Kim et al.1 (= KIM24) studied the phylogeny and biogeography of Rhaphidophoridae using a multi-gene dataset analyzed under a partition model. Wang et al. (= WEA) reanalyzed the KIM24 dataset, questioned the reproducibility of the results, and showed that their tree topology under a mixture model differed significantly from that of KIM24. However, we identified several issues in their study and demonstrate here that some of the arguments presented by WEA rest on weak grounds.

WEA reexamined the partitioned maximum likelihood (ML) analysis of KIM24’s MrBayes dataset with IQ-TREE and argued that the reproducibility of the KIM24 dataset is substantially low, as replicated runs often differed significantly in tree topology. To check whether the tree topology of KIM24’s study is reproducible, we generated 30 replicate ML trees for each of the IQ-TREE and MrBayes datasets of KIM24, using the same partitioning scheme and substitution models as in the original study. Analyses were run in IQ-TREE v. 2.2.2.62 under the partition model3, with branch support evaluated by 2000 ultrafast bootstrap (UFBoot) replicates4,5 and 1000 replicates of the SH-aLRT test6. In contrast to WEA, the subfamily-level topology of all 30 replicates from both datasets was identical, with only minimal intra-subfamily rearrangements (Fig. 1, refer to supplementary files).
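As an illustration of the replicate design described above, the following minimal sketch assembles one IQ-TREE 2 command per replicate. It is not the script used for the analyses: the executable name `iqtree2`, the file names, the output layout, the use of the replicate index as the random seed, and the choice of `-p` (edge-linked proportional partition model) are all assumptions, and the partition file is assumed to already carry the substitution models of the original study.

```python
def build_replicate_commands(alignment, partitions, n_replicates=30,
                             ufboot=2000, alrt=1000, out_dir="replicates"):
    """Return one IQ-TREE 2 command (as an argument list) per replicate.

    All file names and the output layout are hypothetical placeholders.
    """
    commands = []
    for rep in range(1, n_replicates + 1):
        commands.append([
            "iqtree2",                      # assumed executable name
            "-s", alignment,                # concatenated alignment
            "-p", partitions,               # partition file (linkage is an assumption)
            "-B", str(ufboot),              # ultrafast bootstrap replicates
            "-alrt", str(alrt),             # SH-aLRT replicates
            "--seed", str(rep),             # distinct seed per replicate (assumption)
            "--prefix", f"{out_dir}/rep{rep:02d}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; each list could be passed
    # to subprocess.run(cmd, check=True) once the placeholder paths are real.
    for cmd in build_replicate_commands("alignment.phy", "partitions.nex"):
        print(" ".join(cmd))
```

Running each replicate with its own seed and output prefix keeps the resulting tree files separate, so their subfamily-level topologies can be compared afterwards.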

Fig. 1: Phylogeny of Rhaphidophoridae.

The topology is derived from a maximum likelihood tree based on the MrBayes dataset (112 taxa with 3 taxa for outgroup) from KIM24 under partitioning. The node support values of the ML trees for the IQ-TREE dataset and the MrBayes dataset are shown as colored squares. A. Anoplophilinae, Aemo. Aemodogryllinae, Ceu. Ceuthophilinae, G. Gammarotettiginae, Macro. Macropathinae, Rh. Rhaphidophorinae, T. Tropidischinae.

As in WEA, we observed low branch support values in some parts of the tree obtained from the MrBayes dataset (Fig. 1). In the IQ-TREE dataset, however, UFBoot support values for the major clade groupings were high (≥95 in the case of the tree with the best log-likelihood), except for the (Dolichopodainae+Troglophilinae) and (Anoplophilinae+Gammarotettiginae) pairs, exactly matching the bootstrap support patterns of KIM24. The lower support for the first grouping is likely caused by the dense sampling of closely related species7,8, whereas the lower support for the second grouping deserves more attention, as it is likely driven by the low data coverage of Gammarotettix genitalis. Despite the high proportion of missing data, the phylogenetic placement of Gammarotettix was highly consistent across analyses, with moderate support. The major difference between the MrBayes and IQ-TREE datasets of KIM24 is the outgroup sampling scheme: the MrBayes dataset included only a single outgroup species, whereas the IQ-TREE dataset included three outgroup species that formed a sister clade to Rhaphidophoridae. As shown by a previous phylogenomic study9, the divergence between these two clades is relatively old. When only a single distant outgroup is used, as in the current case, the long stem branch arising from the outgroup can result in a random rooting of the phylogeny, yielding lower branch supports at ingroup nodes10,11,12. Such a bias can be mitigated by using multiple outgroup species that could break distant relationships13. This explains why the branch supports obtained by WEA and in the current analysis based on the MrBayes dataset are considerably lower than those from the IQ-TREE dataset.

WEA also suggested a different subfamily-level topology based on the “better fitting” CAT-GTR mixture model14. It is widely known that the evolutionary process can be heterogeneous across genes and sites, and that failing to account for this can bias the estimation of phylogenetic parameters, including the tree topology15,16,17,18. Both partition models and mixture models have been shown to account for such heterogeneity relatively well across a wide range of datasets19,20. However, both models have their pros and cons19,21, and thus claims about the superiority of one model over the other must rely on sound statistical justification.

WEA justified the superiority of their CAT-GTR tree topology using relative model fit tests, namely Bayesian leave-one-out cross-validation (LOO-CV) and the widely applicable information criterion (wAIC)22,23, together with an absolute model fit test based on posterior predictive checking24,25 of the CAT-GTR and GTR substitution models. Their results show that, by all of the above criteria, the CAT-GTR model fits much better. However, this series of model tests has limited statistical value, as one of the hypotheses considers a heterogeneous substitution process (CAT-GTR) while the other (GTR) does not. The results therefore merely show that a heterogeneous substitution process is present in the dataset, an issue that was also pointed out by Boudinot et al.21. To claim the superiority of the CAT-GTR model over the partition model used in KIM24’s analyses, one must directly compare the relative fit of the mixture model and the partition model. However, comparing the relative fit of mixture and partition models with traditional model selection approaches can be problematic, as these approaches are biased toward partition models26,27. Despite the availability of some potentially useful model selection criteria28,29, there seems to be no readily applicable methodology that can accurately compare the relative fit of mixture models and partition models27. Thus, it may be premature to make strong claims about the superiority of either model until further theoretical and methodological advances allow a more thorough comparison of model fit between partition and mixture models.
We also note that complex infinite mixture models, such as CAT-GTR, might not be the best way to model substitution heterogeneity of datasets like ours, as their computational complexity can far exceed the marginal benefit unless the divergence is exceedingly old, or the evolutionary process was extremely heterogeneous30.
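For readers less familiar with the relative fit criteria discussed above, the sketch below computes wAIC from a matrix of pointwise log-likelihoods using its standard textbook definition (log pointwise predictive density minus a variance-based penalty, reported here on the deviance scale). It is an illustration only: the function names and the input layout are assumptions, it is not the procedure of any particular phylogenetic software, and individual programs may report the score on a different scale.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic(pointwise_loglik):
    """wAIC on the deviance scale.

    pointwise_loglik[s][i] is the log-likelihood of site i under posterior
    sample s (hypothetical input layout; at least two samples are required).
    """
    n_sites = len(pointwise_loglik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance-based penalty)
    for i in range(n_sites):
        column = [sample[i] for sample in pointwise_loglik]
        lppd += log_mean_exp(column)
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)
```

A lower score indicates better expected predictive fit, but a difference between two models only ranks those two models. Comparing CAT-GTR against a homogeneous GTR model in this way therefore says nothing about how CAT-GTR fares against a partition model.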

We further note that the CAT-GTR model has been shown to be sensitive to missing data, especially when they are concentrated in a small number of taxa30,31,32, whereas partition models remain relatively robust to incomplete taxa33,34. In this regard, the first-diverging position of Gammarotettiginae in the CAT-GTR tree of WEA must be viewed critically, as Gammarotettix genitalis is represented with a relatively high proportion of missing data. WEA also recovered the first-diverging position of Gammarotettiginae in their unpartitioned GTR tree, but this cannot be used to justify the topology, as failing to account for substitution heterogeneity is known to severely affect topology inference17. Collectively, WEA’s CAT-GTR topology cannot be regarded as a superior hypothesis to KIM24’s topology. As shown by past studies and by our concerns regarding Gammarotettiginae, the CAT-GTR model may not perform well under certain conditions30,31,32. Therefore, we argue that CAT-GTR is not a “magic bullet” that can solve all phylogenetic problems.

It is true that the dataset used in KIM24 is far from perfect. Indeed, the positions of several subfamilies remain uncertain, and genome-scale data will be needed to resolve the phylogeny of Rhaphidophoridae more robustly. We are therefore conducting a follow-up study that combines a morphology-based cladistic approach with a phylogenomic analysis of data generated using the Orthoptera-specific target enrichment (OR-TE) probes35. Through this genome-scale analysis, we expect to obtain a more robust phylogeny of Rhaphidophoridae and to elucidate their enigmatic evolutionary history.