Abstract
Protein function prediction is a cornerstone of bioinformatics, providing critical insights into biological processes and disease mechanisms. Despite significant advances, challenges persist due to data sparsity and functional ambiguity. We introduce GOHPro (GO Similarity-based Heterogeneous Network Propagation), a novel method that constructs a heterogeneous network by integrating protein functional similarity (derived from domain profiles and modular complexes) with GO semantic relationships. The method applies a network propagation algorithm to prioritize annotations based on multi-omics context. When evaluated on yeast and human datasets, GOHPro outperformed six state-of-the-art methods. Specifically, it achieved Fmax improvements ranging from 6.8% to 47.5% over methods like exp2GO across the Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) ontologies in both yeast and human species. Rigorous case studies on proteins with shared domains, such as AAA+ ATPases, demonstrated GOHPro’s ability to resolve functional ambiguity by leveraging contextual interactions and modular complexes. Further validation on the CAFA3 benchmark confirmed its generalizability, with Fmax gains exceeding 62% compared to baseline approaches in human species. Our analysis revealed that homology and network connectivity critically influence prediction robustness, with the modular similarity network compensating for evolutionary gaps in dark proteins. The framework’s extensibility to de novo structural predictions highlights its potential to bridge the annotation gap in uncharacterized proteomes.
Introduction
Protein function prediction is a critical area of research in bioinformatics, playing a pivotal role in illuminating biological processes, unraveling disease mechanisms, and accelerating drug development. The accurate prediction of protein functions can dramatically speed up scientific discovery and assist in the development of novel therapeutic strategies. The importance of this field lies in its potential to bridge the widening gap between the ever-growing number of sequenced genomes and the functional annotation of the proteins they encode1. Traditionally, protein functions have been determined through experimental methods, including gene knockout studies, mutagenesis experiments, and biochemical assays. While these methods yield invaluable data, they are often time-consuming, labor-intensive, and impractical for large-scale analysis2. Moreover, they may fail to fully capture all potential protein functions, especially for proteins with multiple or complex functionalities.
The research landscape of protein function prediction has undergone significant transformation in recent years, fueled by the integration of diverse data sources and advanced computational techniques. Initially, protein function prediction heavily relied on the analysis of protein-protein interaction (PPI) networks, leveraging the principle that proteins with similar functions tend to interact with similar partners3,4. However, these early methods were hampered by the sparsity and noise inherent in PPI networks. To address these limitations, researchers have incorporated graph theory and complex network analysis into protein function prediction frameworks5,6,7. These approaches tap into the topological properties of PPI networks, such as node centrality, clustering coefficients, and community structures, to infer protein functions. For instance, Liang and colleagues proposed a function prediction method using protein overlap networks, demonstrating the effectiveness of graph-based methods in capturing functional relationships8.
More recently, there has been a surge of methods that combine multi-omics data to enhance the accuracy and robustness of protein function prediction. By merging information from various sources, such as gene expression profiles, protein domains, and evolutionary conservation, these methods aim to provide a more holistic understanding of protein functions. For example, Peng et al. improved protein function prediction by incorporating domain and protein complex information into PPI networks9. Sun and colleagues further extended this concept by utilizing function associations within PPI networks to predict protein functions10. Advancements in machine learning and deep learning have also revolutionized this field. Algorithms such as K-nearest neighbors, voting systems, and semantic similarity measures have been employed to refine protein function predictions11,12,13. Deep learning models, in particular, have shown promising results in capturing the intricate relationships between protein sequences, structures, and functions14,15.
Despite considerable progress, existing methods continue to face numerous challenges. The sparsity of protein-protein interaction (PPI) networks and the heterogeneity of multi-omics data pose significant hurdles by limiting the amount of information available for prediction. In our previous research, we successfully integrated PPI networks with multiple data sources to construct a heterogeneous biological network and developed effective methods for identifying essential proteins16 and predicting protein functions17,18 based on network propagation algorithms. These findings reveal that the fusion of heterogeneous networks with propagation algorithms not only effectively captures complementary information from diverse data sources but also substantially mitigates the negative impact of sparsity on prediction outcomes. On the other hand, protein annotation adheres to the true path rule19, whereby a protein associated with a Gene Ontology (GO) category is annotated with all parent nodes of that GO term. Taking the human species as an example, we conducted a statistical analysis of the relationships between protein function annotations. The results indicate that, among the three GO categories of Biological Process (BP), Molecular Function (MF), and Cellular Component (CC), the proportions of proteins associated with GO terms having “part_of” or “is_a” relationships are 96%, 91%, and 94%, respectively. Inspired by these findings, the present study takes into account the hierarchical structure of GO and integrates PPI networks with multiple data sources, such as protein-domain association data and protein complex data, to construct a more comprehensive heterogeneous network. Subsequently, we employ a network propagation algorithm to disseminate known functional information within this heterogeneous network and, based on this, devise a novel protein function prediction method named GOHPro.
Methods
The overall architecture of GOHPro is illustrated in Fig. 1. Firstly, a domain structural similarity network is constructed, leveraging both the original PPI (Protein-Protein Interaction) network and the protein domain profiles sourced from the Pfam database. Concurrently, a module similarity network is established by utilizing protein complex information. Subsequently, these two similarity networks are linearly merged to form a protein functional similarity network, as illustrated in Fig. 1(a). Following this, a GO (Gene Ontology) semantic similarity network is generated, based on the hierarchical structural relationships among GO Terms within the Gene Ontology framework, as depicted in Fig. 1(b). Then, the protein functional similarity network, the GO similarity network, and the GO Annotation are integrated to form a heterogeneous network, as shown in Fig. 1(c). Lastly, a propagation algorithm is applied to this heterogeneous network, and a list of GO terms for proteins of unknown function is produced. These GO terms are ranked in order of decreasing annotation probability, generated through a global diffusion of functional information. Further details regarding these modules are elaborated in the subsequent sections.
The flowchart of the GOHPro method: constructing protein functional similarity network based on domain structural and modular similarity networks, and integrating GO semantic similarity network for functional prediction. (a) Domain Structural & Modular Similarity Network Processing. Green nodes/network: Constructed from PPI Network and Protein Domain Profile, calculating domain similarity via Eq. (4). Blue nodes/network: Derived from Protein Complex analysis, calculating modular similarity via Eq. (6). Red nodes/network: Generated by linear merging (Eq. (7)) of the green (domain structural) and blue (modular) networks, with weight parameter γ. (b) GO Semantic Similarity Network Construction. Yellow triangles: Represent GO terms. The network is built by calculating GO semantic similarity based on Gene Ontology, using Eq. (8). (c) Heterogeneous Network Integration. Integrates the green (domain structural) network, blue (modular) network, red (protein functional) network, and yellow triangular (GO semantic) network into a Heterogeneous Network, linking with GO Annotations. (d) Heterogeneous Network Propagation. Employing Eq. (12) for network propagation: Visualizes the iterative process of predicting functions for unknown proteins (black nodes with “?”), leveraging connections in the heterogeneous network.
Heterogeneous network construction
A heterogeneous network typically consists of diverse networks and the links among nodes within these networks20. In this study, a heterogeneous network is constructed by connecting the protein functional similarity network with the GO semantic similarity network. Thus, the network construction is described as follows.
Let GP=(VP, EP, WP) denote a protein functional similarity network. It comprises the set of protein nodes (VP), the set of edges between proteins (EP), and the set of edge weights (WP), which signify the functional similarity values of the interactions between proteins. Similarly, GG=(VG, EG, WG) represents a GO semantic similarity network, where VG denotes the set of GO terms, EG denotes the set of hierarchical relationships between GO terms, and WG denotes the corresponding edge weights. Finally, the heterogeneous network for our protein-GO association prioritization is represented as a graph:
HG = {{VP, VG}, {EP, EPG, EG}, {WP, WPG, WG}} (1)
The construction of EPG and WPG is detailed in the ‘Protein-GO association network GPG’ section below. In contrast to our previous research work, this study constructs a two-layer heterogeneous network21 model, as illustrated in Fig. 1(c), which consists of the protein functional similarity network and the GO semantic similarity network. Note that GPG is an incomplete graph lacking association edges between proteins of unknown function and GO Terms. To infer more association edges, we will apply a network propagation algorithm, which will be explained in the next section.
Protein functional similarity network GP
To address the complexity of protein functions, we have comprehensively considered both the domain structural and modular characteristics of proteins. Our aim is to reconstruct a protein functional similarity network that overcomes the limitations of relying solely on interaction data. Such data can be affected by factors like data noise and incompleteness, which impede the full revelation of protein functional attributes. The specific approach is as follows: Firstly, a domain structural similarity network is constructed based on the topological structure of the protein interaction network and the composition of protein domains. Secondly, a modular similarity network is established based on the information and interaction relationships of functional protein complexes. Lastly, the aforementioned two similarity networks are linearly integrated to form a comprehensive protein functional similarity network.
Our measurement of the domain structural similarity between proteins consists of two parts: the contextual similarity, which indicates the domain-based structural similarity of the proteins’ level-1 neighbours, and the compositional similarity, which indicates the proteins’ internal domain-based structural similarity. Given two proteins Pi and Pj, their contextual similarity is defined as follows:
DCi and DCj represent sets of distinct domain types in the neighboring proteins of Pi and Pj, respectively, which are referred to as their domain contexts. Similarly to DSim_context, the domain compositional similarity can be defined as follows:
where Di and Dj denote the set of different domain types of the proteins of Pi and Pj themselves, respectively. The domain structural similarity between two proteins is a linear combination of their contextual similarity and compositional similarity, which can be formally described as follows:
We tested β values (0.1–0.9) and validated β = 0.1 (from Peng et al.9) as optimal for balancing contextual and compositional similarities.
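As a rough illustration of how the contextual and compositional scores combine into the domain structural similarity, consider the Python sketch below. The exact set-overlap formula and which term β multiplies are not stated in the text above, so the Jaccard index and the β·context + (1 − β)·composition form used here are assumptions, as are the helper names and the toy protein and domain labels.

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def domain_context(protein, ppi, domains):
    """Distinct domain types found in the level-1 neighbours of `protein`."""
    context = set()
    for neighbour in ppi.get(protein, ()):
        context |= domains.get(neighbour, set())
    return context


def domain_similarity(pi, pj, ppi, domains, beta=0.1):
    """Linear combination of contextual and compositional similarity.

    Assumption: beta weights the contextual term; the paper only states
    that the two parts are combined linearly with beta = 0.1.
    """
    context = jaccard(domain_context(pi, ppi, domains),
                      domain_context(pj, ppi, domains))
    composition = jaccard(domains.get(pi, set()), domains.get(pj, set()))
    return beta * context + (1 - beta) * composition


# Toy data (hypothetical proteins and domain labels).
ppi = {"P1": {"P3"}, "P2": {"P3", "P4"}, "P3": {"P1", "P2"}, "P4": {"P2"}}
domains = {"P1": {"d1"}, "P2": {"d1", "d2"}, "P3": {"d3"}, "P4": {"d4"}}

score = domain_similarity("P1", "P2", ppi, domains)
print(round(score, 3))
```

In this toy case both the contextual and the compositional overlap equal 0.5, so the combined score is 0.5 regardless of β; with real data the choice of β shifts the balance between neighbourhood and intrinsic domain evidence.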
To calculate the modular similarity of a pair of proteins, we obtain a protein complex dataset from the Complex Portal22. The Complex Portal is a manually curated, encyclopedic resource of macromolecular complexes derived from physical molecular interaction evidence extracted and cross-referenced from the literature. Given a complex Ci in the benchmark set, its functional score can be obtained as follows:
The functional score S(Ci) is calculated using the hypergeometric distribution, which quantifies the probability of observing j successes (functionally characterized proteins) in n draws (proteins in complex Ci) via non-replacement sampling from a finite population. In this context, N represents the total number of proteins in the dataset, and M denotes the total number of functionally characterized proteins. Additionally, n indicates the size of complex Ci, while k signifies the number of functionally characterized proteins within Ci. This formula assesses whether the presence of k functionally characterized proteins in Ci is statistically significant (i.e., not a result of random chance). If S(Ci) is extremely low, it indicates that the complex Ci is enriched for functionally characterized proteins. Biologically, this suggests the complex likely participates in specific biological processes/pathways, as the co-occurrence of functional proteins is unlikely to be random. For a pair of proteins Pi and Pj, their modular similarity is defined as follows:
The modular similarity between proteins Pi and Pj is computed as the maximum overlap between complexes containing Pi and Pj. The notation Ci∩Cj denotes the set of protein complexes that concurrently include both protein Pi and Pj. The modular similarity score and domain structural similarity score are normalized to obtain MSim_normal and DSim_normal, respectively. These normalized scores are linearly combined to obtain the functional similarity value of a pair of proteins, which is defined as follows:
The tuning of γ is discussed in the ‘Parameter Analysis’ subsection.
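The hypergeometric functional score S(Ci) described above can be illustrated with a short Python sketch. It assumes that S(Ci) is the upper-tail probability of drawing at least k functionally characterized proteins, which is consistent with low scores indicating enrichment but is not confirmed by the text; the function name and the toy numbers are hypothetical.

```python
from math import comb


def complex_score(N, M, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: total proteins in the dataset
    M: functionally characterized proteins in the dataset
    n: size of the complex
    k: functionally characterized proteins inside the complex
    """
    total = comb(N, n)
    upper = min(n, M)
    tail = sum(comb(M, j) * comb(N - M, n - j) for j in range(k, upper + 1))
    return tail / total


# Toy numbers: 10 proteins, 5 characterized, a complex of 2 with both characterized.
print(round(complex_score(10, 5, 2, 2), 4))  # 10/45
```

A small value means that so many characterized members would rarely co-occur by chance, which is the sense in which the complex is called enriched.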
Protein-GO association network GPG
The association between proteins and GO terms is established directly from GO annotations. If protein pi is annotated with GO term gj, then pi is connected to gj by an edge in the protein-GO association network and WPG (i, j) = 1; otherwise, there is no edge between them and WPG (i, j) = 0. For proteins whose functions are unknown, the corresponding rows of the protein-GO association matrix contain missing values. These missing values are filled in by a network propagation algorithm, and the filled values represent the probability that a protein is annotated with a particular GO term.
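A minimal sketch of this binary association matrix is given below. The function name, the input layout (a dictionary from protein to annotated terms), and the toy identifiers are assumptions made for illustration; rows for proteins of unknown function are simply left at zero here.

```python
def build_association_matrix(proteins, go_terms, annotations):
    """Return W_PG as a list of rows: 1 if the protein is annotated with the term, else 0."""
    column = {term: j for j, term in enumerate(go_terms)}
    matrix = [[0] * len(go_terms) for _ in proteins]
    for i, protein in enumerate(proteins):
        for term in annotations.get(protein, ()):
            if term in column:  # ignore terms outside the chosen vocabulary
                matrix[i][column[term]] = 1
    return matrix


# Toy input (hypothetical identifiers); "P3" has no known function.
proteins = ["P1", "P2", "P3"]
go_terms = ["g1", "g2"]
annotations = {"P1": {"g1"}, "P2": {"g1", "g2"}}

W_PG = build_association_matrix(proteins, go_terms, annotations)
print(W_PG)
```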
GO-GO semantic similarity network GG
Given a GO term gi, gi and its ancestors within the Gene Ontology database can be represented as DAGi = (Ti, Ei), where Ti is the set of GO terms in DAGi, including gi and all of its ancestor GO terms, and Ei is the set of semantic relations connecting the GO terms in DAGi. While the Gene Ontology hierarchy includes diverse relationships, our analysis focuses solely on ‘is_a’ and ‘part_of’ relations, which collectively form a directed acyclic graph (DAG). We evaluated methods considering all DAG terms (as in ref. 23), but found that direct ancestor-child relationships provided higher robustness, likely due to reduced noise from distant terms. Therefore, we define the semantic similarity value between two given GO terms gi and gj as follows:
The semantic contribution factor sc quantifies the weight of hierarchical relationships between GO terms. Specifically, sc assigns higher weights to ‘is_a’ relations (sc = 0.4) compared to ‘part_of’ relations (sc = 0.3), reflecting their stronger functional inheritance in the GO hierarchy23. These values are empirically validated in prior studies23, where ‘is_a’ edges were shown to propagate functional annotations more directly than ‘part_of’ edges. In Eq. (8), sc directly scales the contribution of each parent-child relationship to the semantic similarity between GO terms.
Heterogeneous network propagation
The heterogeneous network created in the previous section can be represented as a graph HG= {{VP, VG}, {EP, EPG, EG}, {WP, WPG, WG}}. The objective is to capture hidden interactions between proteins and GO terms based on the protein-protein functional similarities, GO-GO semantic similarities, and known protein-GO interactions. Based on this graph, the problem of annotating functions to proteins with unknown functions is transformed into the problem of predicting new protein-GO edges in the graph. First, we normalize the edge weight matrices WP and WG according to the following formula. This normalization step is crucial, as it ensures the convergence of the network propagation algorithm.
Then, based on the guilt-by-association assumption, we calculate the initial weights of unconnected protein-GO pairs using the following equation.
To capture global functional information, we run a propagation algorithm on the heterogeneous network based on Eq. (12), iteratively updating the weights of protein-GO pairs.
In the above formula, \(W_{PG}^{0}=W_{PG}\) represents the initial association between protein-GO pairs, and α is the decay factor with a value ranging from 0 to 1. In each iteration, the original protein-GO associations contribute to the newly constructed associations, and this contribution is controlled by the proportion factor 1-α. In this study, the value of α is fixed at 0.1.
When the iteration reaches a stable state, the values in the corresponding row of the weight matrix WPG for a protein with unknown function will be regarded as the probabilities of that protein being annotated with GO terms. After sorting these annotation probabilities in descending order, we can obtain a predicted function list for the target protein u. Ultimately, we select the top L functions from this list as the candidate functions for the target protein u. In this study, L is determined by the number of GO terms of the known-function protein that has the highest functional similarity value with u in the functional similarity matrix WP.
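The propagation and ranking steps can be sketched in a few lines of Python. Because the normalization and update formulas are not reproduced in the text above, the sketch assumes a symmetric degree normalization and the update W(t+1) = α·WP·W(t)·WG + (1 − α)·W(0), a common bi-directional propagation form; the guilt-by-association initialization step is omitted, so the unknown protein starts from an all-zero row. All function names and the toy matrices are hypothetical.

```python
from math import sqrt


def matmul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalize(W):
    """Symmetric normalization W[i][j] / sqrt(d_i * d_j) (assumed form)."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
             for j in range(n)] for i in range(n)]


def propagate(WP, WG, WPG0, alpha=0.1, tol=1e-9, max_iter=1000):
    """Iterate the assumed update until the largest entry change is below tol."""
    WP_n, WG_n = normalize(WP), normalize(WG)
    W = [row[:] for row in WPG0]
    for _ in range(max_iter):
        spread = matmul(matmul(WP_n, W), WG_n)
        new = [[alpha * s + (1 - alpha) * w0 for s, w0 in zip(s_row, w0_row)]
               for s_row, w0_row in zip(spread, WPG0)]
        delta = max(abs(a - b) for r1, r2 in zip(new, W) for a, b in zip(r1, r2))
        W = new
        if delta < tol:
            break
    return W


def top_terms(score_row, go_terms, L):
    """GO terms sorted by decreasing score, truncated to the top L."""
    ranked = sorted(zip(go_terms, score_row), key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in ranked[:L]]


# Toy network: protein index 2 has unknown function and is most similar to protein 0.
go_terms = ["g1", "g2"]
WP = [[1.0, 0.2, 0.9], [0.2, 1.0, 0.1], [0.9, 0.1, 1.0]]  # protein similarity
WG = [[1.0, 0.4], [0.4, 1.0]]                             # GO semantic similarity
WPG0 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]               # known annotations

scores = propagate(WP, WG, WPG0)
# L = number of terms of the most similar known protein (protein 0 has one term).
print(top_terms(scores[2], go_terms, L=1))
```

In this toy example the unknown protein inherits a higher score for g1 than for g2 because its strongest functional similarity is to the protein annotated with g1.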
Results
Datasets and evaluation metrics
We evaluated our proposed method, GOHPro, alongside six state-of-the-art protein function prediction methods: exp2GO14, GrAPFI-GO13, PHN18, DCS9, NC3, and DeepGO-SE24. This evaluation was conducted on two species, Saccharomyces cerevisiae and Homo sapiens, which were chosen due to their extensive and comprehensive GO annotations covering the majority of their proteins. The protein-protein interaction (PPI) networks are derived from the Biological General Repository for Interaction Networks (BioGRID) datasets25, released on October 1, 2024, with duplicate and self-interactions removed. Proteins are named according to the UniProt Knowledgebase (UniProtKB)26 format. Table 1 presents a comparison of the PPI networks’ size and topological structure between Saccharomyces cerevisiae and Homo sapiens, highlighting significant differences. The yeast network is more compact and efficient, whereas the human network is larger in scale but relatively sparse. These differences are likely to impact the performance of protein function prediction methods that rely on these networks.
The Gene Ontology (GO) and GO annotations were downloaded from the GO Consortium27, with data versioned on September 8, 2024. Only annotations with the following evidence codes were retained: IDA, IPI, EXP, IGI, IMP, IEP, IC, or TAS. These annotations were further propagated upwards based on the ‘is_a’ and ‘part_of’ relationships in the hierarchical structure of GO, with root GO terms excluded. The protein-domain association data was downloaded from the Pfam28 database, version 37.0, released on May 28, 2024. We utilized the Pfam-A classification owing to its superior quality and accuracy. The protein complex data was obtained from the Complex Portal22 database, with the version selected being that released on November 30, 2024. The species IDs for Yeast and Human are 559292 and 9606, respectively. Table 2 outlines the number of domain types, complexes, and functional annotations (GO annotations) for Yeast (Saccharomyces cerevisiae) and Human (Homo sapiens), respectively.
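The upward propagation of annotations along ‘is_a’ and ‘part_of’ edges can be sketched as follows. The `parents` mapping is assumed to contain only those two relation types (already filtered), and the term labels are hypothetical placeholders instead of real GO identifiers.

```python
def ancestors(term, parents):
    """All terms reachable from `term` by following parent edges."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate_annotations(annotations, parents, roots):
    """Add every ancestor of each annotated term, then drop the root terms."""
    closed = {}
    for protein, terms in annotations.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        closed[protein] = full - roots
    return closed


# Toy ontology: C -> B -> A -> root (edges stand for is_a or part_of).
parents = {"C": {"B"}, "B": {"A"}, "A": {"root"}}
closed = propagate_annotations({"P1": {"C"}}, parents, roots={"root"})
print(sorted(closed["P1"]))
```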
We evaluate GOHPro and the competing methods using four evaluation metrics: AUC (Area Under the ROC Curve), AUPR (Area Under the Precision-Recall Curve), Smin29, and Fmax. The first two metrics are commonly used for evaluating multi-label classification tasks. Fmax, recommended by CAFA (Critical Assessment of Function Annotation)30, is the maximum protein-centric F-measure computed over all prediction thresholds.
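For readers unfamiliar with the protein-centric Fmax, the sketch below follows the usual CAFA convention: at each threshold, precision is averaged over proteins with at least one prediction at or above the threshold, recall is averaged over all benchmark proteins, and the best F-measure over thresholds is reported. The input layout, the threshold grid, and the toy scores are assumptions for illustration.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric Fmax.

    predictions: {protein: {term: score in [0, 1]}}
    truth:       {protein: non-empty set of true terms}
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            predicted = {g for g, s in predictions.get(protein, {}).items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:  # precision only counts proteins with predictions
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Toy benchmark with two proteins (hypothetical terms and scores).
predictions = {"P1": {"a": 0.9, "b": 0.4}, "P2": {"a": 0.6}}
truth = {"P1": {"a"}, "P2": {"a", "c"}}
print(round(fmax(predictions, truth), 3))
```

Here the best trade-off occurs for thresholds between 0.41 and 0.60, where precision is 1.0 and recall is 0.75.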
Overall performance of GOHPro
The overall performance of GOHPro was evaluated using the GO annotation dataset from the GO Consortium, comparing it with six other state-of-the-art methods through the Leave-One-Out methodology31. Each protein in the dataset was iteratively designated as a test sample, while the remaining proteins served as the training set. This process was repeated for all proteins, and the prediction outcomes were aggregated to compute average performance metrics, including Fmax, AUPR, AUC, and Smin. For the Yeast dataset, the total sample counts for Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) were 2690, 1645, and 1866, respectively. For the Human dataset, these counts were 2382, 1873, and 1950, respectively. Figures 2(a) and 2(b) illustrate the comparison of Fmax scores achieved by GOHPro and the six rival methods on the Yeast and Human datasets.
The predicted results of various methods using leave-one-out cross validation. The figure shows the protein function prediction results obtained by seven methods using leave-one-out cross-validation, with GO Annotation as the benchmark. The numbers on the bars represent the Fmax (top row) or Smin (bottom row) values for different methods across BP, MF, and CC categories. (a) Yeast (Saccharomyces cerevisiae) dataset (b) Human (Homo sapiens) dataset.
As shown in Fig. 2, GOHPro demonstrated superior performance across all categories for both species. In Yeast, GOHPro achieved the highest Fmax scores in BP, MF, and CC. Compared to exp2GO, GOHPro showed significant improvements: approximately 6.8% in BP (from 0.421 to 0.45), 15.9% in MF (from 0.473 to 0.548), and 7.2% in CC (from 0.541 to 0.58). In Human, despite the more extensive and sparser PPI network, GOHPro retained the top Fmax in all three categories. Compared to exp2GO, improvements were 21.4% in BP (from 0.215 to 0.261), 47.5% in MF (from 0.303 to 0.447), and a modest increase in CC (from 0.345 to 0.354). Relative to DeepGO-SE, which relies on semantic entailment, GOHPro’s integration of modular complexes and domain context provided over 6% improvements in Fmax for Human annotations.
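The relative improvements quoted above follow from the usual formula (new − old) / old. The short Python check below recomputes them from the rounded Fmax values given in the text; because those inputs are rounded, the result can differ from the quoted percentage in the last digit (for example, yeast BP comes out at about 6.9%). The helper name is hypothetical.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100 * (improved - baseline) / baseline


# (exp2GO Fmax, GOHPro Fmax) as reported in the text.
reported = {
    ("Yeast", "BP"): (0.421, 0.45),
    ("Yeast", "MF"): (0.473, 0.548),
    ("Yeast", "CC"): (0.541, 0.58),
    ("Human", "BP"): (0.215, 0.261),
    ("Human", "MF"): (0.303, 0.447),
    ("Human", "CC"): (0.345, 0.354),
}

gains = {key: round(relative_gain(*pair), 1) for key, pair in reported.items()}
for (species, ontology), gain in gains.items():
    print(f"{species} {ontology}: +{gain}%")
```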
Regarding the Smin metric, which measures the minimum semantic distance between predicted and true annotations over all prediction thresholds, lower values indicate better performance. In the Yeast dataset, GOHPro achieved Smin values of 3.27 for BP, 1.151 for MF, and 1.369 for CC, outperforming all other methods. In the Human dataset, GOHPro’s Smin values were 3.339 for BP, 2.744 for MF, and 2.285 for CC, again showing superior performance compared to other methods.
To provide a more comprehensive comparison, we plotted Precision-Recall (PR) curves and Receiver Operating Characteristic (ROC) curves for all methods on both datasets. The Area Under the PR Curve (AUPR) and the Area Under the ROC Curve (AUC) are labeled in Figs. 3 and 4, respectively. In the BP category, on the Yeast dataset, GOHPro’s AUPR was significantly higher than that of the other methods. On the Human dataset, GOHPro’s AUPR improvements over other methods were even more pronounced. Additionally, GOHPro achieved competitive AUC scores, ranking top in BP, MF, and CC.
Comparison of precision-recall performance among different methods across GO terms in various ontologies. (a) Precision-recall curves illustrate the performance of different methods in predicting functions for proteins in the Yeast dataset. The numbers attached to the labels represent the AUPR values of each method. (b) The PR curves for seven methods applied to the Human dataset, with the corresponding AUPR values displayed on the labels.
Comparison of ROC curves among different methods across GO terms in various ontologies. (a) ROC curves depict the effectiveness of various methods in forecasting protein functions within the Yeast dataset. The numerical values associated with the labels indicate the AUC scores for each method. (b) For the Human dataset, ROC curves are presented for seven different methods, with the respective AUC values clearly marked on the labels.
The introduction of DeepGO-SE as a new comparison method and the addition of the Smin evaluation metric further validated GOHPro’s robustness and accuracy. These results collectively indicate that GOHPro is a superior method for protein function prediction, offering significant advantages over existing approaches.
Functional validation and sensitivity analysis
Validation of function prediction robustness and specificity
To rigorously evaluate GOHPro’s ability to resolve functional ambiguity and capture critical determinants of protein activity, we conducted two complementary experiments: (1) cross-domain functional validation and (2) function sensitivity analysis. These experiments address key challenges in protein function prediction, such as distinguishing proteins with shared domains but divergent functions and identifying essential domains and interactions that underpin specific annotations.
(1) Cross-Domain Functional Validation.
Proteins with shared structural domains often diverge in function due to contextual differences in interactions or subcellular localization32. To evaluate GOHPro’s ability to resolve such ambiguity, we analyzed four proteins harboring the AAA+ ATPase domain (IPR003959) but annotated with distinct biological roles (Table 3). Human peroxisomal ATPase PEX6 (Q13608) and human transitional ER ATPase VCP (P55072) possess the AAA+ ATPase domain but operate in distinct subcellular compartments. Yeast origin recognition complex subunit ORC1 (P54784) and yeast m-AAA protease YTA12 (P40341) share AAA+ domains but participate in DNA replication and mitochondrial proteostasis, respectively.
GOHPro’s predictions aligned with manually curated GO annotations (Table S1) and literature evidence. Table 4 lists the prediction results of our GOHPro method for validation proteins, including correctly matched functions, precision and recall. For PEX6, the model correctly prioritized peroxisome organization (GO:0007031, precision = 1.0) and ATP hydrolysis (GO:0016887, recall = 1.0), consistent with its role in peroxisomal protein import via interactions with PEX5 and the retrograde translocation channel33,34. VCP’s predictions emphasized positive regulation of protein catabolic process (GO:0045732, precision = 0.33) and ATP hydrolysis activity (GO:0016887, recall = 1.0), reflecting its ATP-dependent mechanical force collaboration with the ubiquitination system in cellular processes like DNA repair and cell division35,36. ORC1’s DNA replication initiation (GO:0006270, recall = 0.75) and chromatin binding (GO:0003688, precision = 1.0) matched its dual function in chromosome replication and mating-type transcription silencing through binding to autonomously replicating sequence (ACS) at replication origins37. YTA12’s mitochondrial inner membrane (GO:0005743, precision = 1.0) and signal peptide processing (GO:0006465, recall = 1.0) validated its role as a catalytic component of m-AAA protease in mitochondrial inner membrane protein homeostasis, combining ATPase and protease activities38,39. Despite shared domains, GOHPro leverages contextual features (e.g., interaction partners, protein complexes) to assign distinct functions. For example, PEX6’s peroxisomal interactions and VCP’s ATP-related partners drove divergent predictions, demonstrating the model’s ability to integrate multi-omics data for functional disambiguation.
(2) Function Sensitivity Analysis.
To assess GOHPro’s reliance on critical functional determinants, we systematically perturbed input data for these proteins (Tables 5–6). For PEX6 (Q13608), removing the peroxisomal receptor export module interactions (deletion of the peroxisomal receptor export module complex) abolished “peroxisome organization” predictions in the BP category, resulting in a complete precision drop (−100%) and a 50% recall decline (Table 5, Q13608). However, deleting its AAA+ domain (e.g., IPR003960) showed no impact on ATP hydrolysis activity annotations in the MF category, with both precision and recall remaining unchanged (Table 5, Q13608). This highlights that while interaction context is critical for peroxisomal organization predictions, the AAA+ domain alone is not the sole determinant of ATP hydrolysis annotations. For VCP (P55072), the deletion of domains IPR003959 (AAA+ ATPase), IPR041569, IPR004201, and IPR003339 resulted in a 50% precision drop and a complete recall loss (−100%) for Molecular Function (MF) annotations (Table 5, P55072). Specifically, this perturbation caused “ATP hydrolysis activity” (GO:0016887) predictions to fail entirely (no functions matched), highlighting the critical role of these domains in maintaining both the specificity (precision) and coverage (recall) of enzymatic function predictions. In ORC1 (P54784), disrupting DNA replication initiation via deletion of the nuclear origin recognition complex decreased the precision of “DNA replication initiation” (GO:0006270) by 66.7% and recall by 25% in the BP category (Table 6, P54784). This emphasizes the dependency on replication-specific network integrity for accurate functional annotation. For YTA12 (P40341), removing interactions with the m-AAA protease complex degraded “signal peptide processing” predictions in the BP category, with a 33.3% precision drop and a 50% recall decline (Table 6, P40341). This underscores the necessity of complex membership for context-specific functional predictions.
Interestingly, for VCP (P55072), deleting the AAA ATPase complex marginally improved precision (+33.3%) and recall (+12.5%) for CC annotations. This unexpected improvement arises from the release of the previously masked ‘proteasome complex’ (GO:0000502) annotation. While VCP interacts transiently with ubiquitinated substrates targeted for proteasomal degradation, it is not a structural component of the proteasome itself (Supplementary Table 1). Thus, this annotation was non-specifically associated due to indirect interactions rather than direct complex membership. This highlights the challenge of distinguishing functional context from transient interactions in protein function prediction.
These perturbations reveal that GOHPro prioritizes function-critical features over generic domain composition. For example, removing the AAA+ ATPase domain together with the accompanying domains of VCP abolished its ATP-related predictions, while interaction masking disrupted pathway-specific annotations. This aligns with biological principles where catalytic residues or interaction partners are indispensable for function34,39.
Impact of protein homology and network connectivity on prediction performance
To systematically evaluate the influence of protein homology and network connectivity on function prediction, we conducted a comparative analysis across three homology groups: high-homology (strong module/domain structural similarity), mid-homology (moderate similarity), and dark proteins (low homology). Performance metrics (Precision, Recall, F-value) and robustness tests (network removal experiments) were analyzed for the Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) ontologies. Functional similarity for homology grouping was calculated by combining domain compositional similarity (Eq. 4) and modular complex overlap (Eq. 6), with thresholds of ≥ 70% (high), 40–70% (mid), and < 40% (dark) used to stratify proteins by multi-omic contextual similarity. In this paper, ModuleNet denotes the modular similarity network derived from protein complex modularity (Eq. 6), and DomainNet denotes the domain structural similarity network based on Pfam domain profiles (Eq. 4). Δ denotes the mean difference between a pair of groups; a negative value indicates lower performance in the first group.
(1) Homology-Driven Performance Differences.
To evaluate the impact of homology on functional prediction performance (Precision/Recall), a one-way analysis of variance (ANOVA)40 was employed to compare the means of the high-homology, mid-homology, and low-homology (dark) groups. The mathematical model of one-way ANOVA is:
\({y_{ij}}=\mu +{a_i}+{\varepsilon _{ij}}\)
In the model, yij denotes the j-th observation (precision or recall) in the i-th group (high-, mid-, or low-homology group); µ represents the overall mean; ai is the effect of the i-th group (between-group difference); and εij is the random error term, which is normally distributed with a mean of 0 and variance of \({\sigma ^2}\). The null hypothesis H0 states that all group means are equal, while the alternative hypothesis H1 posits that at least two group means differ. Tables 7 and 8 present the results of one-way ANOVA for DomainNet and ModuleNet across the Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) ontologies in yeast and humans, respectively. As shown in Tables 7 and 8, significant differences (p < 0.05) were observed across Gene Ontology (GO) categories and species. In BP, ModuleNet precision diverged strongly among groups (Yeast: F = 65.50, p < 0.001; Human: F = 41.07, p < 0.001), with high-homology groups outperforming mid-homology and dark groups. DomainNet precision showed no significant difference in BP (Yeast: p = 0.7423), suggesting that domain structural similarity alone fails to capture complex biological pathways, which rely more on modular interactions. In CC, DomainNet recall differed significantly in both species (Yeast: F = 6.33, p = 0.002; Human: F = 4.83, p = 0.0087), highlighting the critical role of domain structural features in predicting protein function. However, ModuleNet still dominated CC precision (Yeast: F = 63.44, p < 0.001; Human: F = 50.44, p < 0.001), indicating that modular similarity (MSim) is the primary driver of cellular function annotation. For MF, ModuleNet precision showed species-consistent superiority (Yeast: F = 8.66, p = 0.0002; Human: F = 29.75, p < 0.001), while DomainNet exhibited limited significance (Human precision: p = 0.2389).
These results demonstrate that ModuleNet’s performance is robust across species and ontologies, whereas domain structural similarity (DomainNet) contributes marginally except in specific contexts.
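As an illustration of the one-way ANOVA used here, the following minimal sketch computes the F statistic as the ratio of the between-group to the within-group mean square. The precision values are hypothetical toy numbers, not data from Tables 7 and 8, and the function name `one_way_anova_f` is an illustrative assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-protein precision values for three homology groups.
high = [0.8, 0.7, 0.9]
mid = [0.5, 0.6, 0.4]
dark = [0.2, 0.3, 0.1]

f_stat, df_b, df_w = one_way_anova_f([high, mid, dark])
print(f_stat, df_b, df_w)
```

For these toy groups the between-group mean square is 0.27 and the within-group mean square is 0.01, so F is 27 with (2, 6) degrees of freedom. The p-value would then be read from the F distribution, for example with a statistics library.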
Because the ANOVA results were significant (p < 0.05), post-hoc Tukey’s Honestly Significant Difference (HSD) test was selected over the Bonferroni correction due to its higher statistical power in balanced designs and its robust control of the family-wise error rate (FWER) in pairwise comparisons41. The test statistic for Tukey’s HSD test is based on the studentized range distribution. The test statistic, often denoted as q, is calculated as follows:
\(q=\frac{{\overline {{{y_i}}} - \overline {{{y_j}}} }}{{\sqrt {MSE/n} }}\)
where \(\overline {{{y_i}}}\) and \(\overline {{{y_j}}}\) are the sample means of groups i and j, MSE is the mean square error obtained from the ANOVA, and n is the number of observations per group. The critical value qcritical is derived from the studentized range distribution with degrees of freedom df = N – k (N: total observations, k: number of groups). If |q| > qcritical, the null hypothesis of no difference between the two groups is rejected. The confidence interval for mean differences is:
\(\left( {\overline {{{y_i}}} - \overline {{{y_j}}} } \right) \pm {q_{critical}}\sqrt {\frac{{MSE}}{2}\left( {\frac{1}{{{n_i}}}+\frac{1}{{{n_j}}}} \right)}\)
In this formula, ni and nj are the sample sizes of the two groups. Following significant ANOVA results (p < 0.05), post-hoc Tukey’s HSD tests were performed to identify pairwise differences between High-homology, Mid-homology, and Dark protein groups (Tables S2-S3). These tests revealed significant differences (p < 0.05) in precision and recall across homology groups (Fig. 5). For example, in the Biological Process (BP) category, ModuleNet exhibited significantly lower precision in the high-homology group compared to the low-homology group (Δ=−0.455, p < 0.001 for Yeast; Δ=−0.444, p < 0.001 for Human), where Δ represents the mean difference (High – Low). This indicates reduced efficacy of modular similarity in predicting functions for evolutionarily conserved proteins. Conversely, DomainNet showed no significant differences in BP precision across homology groups (p ≥ 0.7736 for Yeast; p ≥ 0.089 for Human), suggesting limited utility of domain structural similarity for complex pathway annotations.
Fig. 5 Performance comparison of homology groups on the yeast and human datasets. (a) Precision and recall of high-homology (blue), mid-homology (green), and dark (orange) groups in yeast protein function prediction (BP/MF/CC combined). Horizontal bars denote significant differences (Tukey’s HSD, p < 0.05). (b) Homology group performance on the human dataset. Boxplots display the median (center line) and interquartile range (boxes).
In Molecular Function (MF), high-homology proteins outperformed dark proteins in ModuleNet (Δ=−0.209, p = 0.0044 for yeast; Δ=−0.42, p < 0.001 for human), emphasizing the critical role of modular interactions in predicting enzymatic activities. However, DomainNet exhibited counterintuitive trends: high-homology groups had lower recall than dark proteins (Δ=−0.115, p = 0.0083 for Yeast; Δ=−0.062, p = 0.1232 for Human), likely due to shared domains masking functional divergence. For Cellular Component (CC), ModuleNet precision declined significantly for conserved proteins (Δ=−0.454, p < 0.001 in Yeast; Δ=−0.469, p < 0.001 in Human), whereas DomainNet improved predictions for dark proteins (Δ=−0.128, p = 0.0097 in Yeast; Δ=−0.088, p = 0.0126 in Human). These results highlight the complementary roles of DomainNet (domain structural conservation) and ModuleNet (modular context) in addressing annotation challenges across homology groups.
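The pairwise comparisons can be sketched as follows. This is a minimal illustration of the studentized range statistic q and of the standard Tukey-Kramer confidence interval for a mean difference. The group means, MSE, and sample sizes are hypothetical toy values, and the critical value 4.34 is an approximate tabulated value for k = 3 groups, df = 6, and α = 0.05 that should be treated as an assumption and looked up for real data.

```python
import math


def tukey_q(mean_i, mean_j, mse, n):
    """Studentized range statistic for two groups of equal size n."""
    return (mean_i - mean_j) / math.sqrt(mse / n)


def tukey_kramer_interval(mean_i, mean_j, mse, n_i, n_j, q_critical):
    """Confidence interval for the mean difference (group i minus group j)."""
    half_width = q_critical * math.sqrt(mse / 2.0 * (1.0 / n_i + 1.0 / n_j))
    diff = mean_i - mean_j
    return diff - half_width, diff + half_width


# Hypothetical toy values: group means, ANOVA mean square error, group size.
mean_high, mean_dark = 0.8, 0.2
mse, n = 0.01, 3
q_critical = 4.34  # assumed tabulated value (k = 3, df = 6, alpha = 0.05)

q = tukey_q(mean_high, mean_dark, mse, n)
low, high = tukey_kramer_interval(mean_high, mean_dark, mse, n, n, q_critical)
significant = abs(q) > q_critical
print(q, (low, high), significant)
```

With equal group sizes the half-width reduces to qcritical multiplied by the square root of MSE/n, so the interval excludes zero exactly when |q| exceeds qcritical.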
(2) Network Connectivity and Robustness.
To evaluate how network sparsity impacts prediction robustness, we systematically removed edges from DomainNet, ModuleNet, and their combination (DomainNet + ModuleNet) in 10% increments (from 10 to 90%) for both High-homology and Dark protein groups. While the homology analysis included three groups (High, Mid, and Dark) to capture performance gradients, the network connectivity experiments focused on High vs. Dark groups to highlight the most divergent dependencies. Mid-homology proteins were excluded from network removal benchmarks due to their transitional behavior, which requires dedicated analysis beyond the scope of this work. To analyze the results, three visualization methods were employed: Fmax decay curves (Fig. 6) were used to track the decline in performance as the proportion of edges removed increased, illustrating the robustness of the model under different experimental conditions. Performance decline heatmaps (Fig. 7) quantified the percentage loss in performance after 90% edge removal, providing a visual summary of how edge removal affects prediction accuracy. Decay slope comparisons (Fig. 8) measured the rate of performance decay by calculating the slope of Fmax relative to the removal ratio, offering insights into the model’s resilience under edge perturbations.
Fig. 6 Fmax decay curves across network removal conditions. This figure illustrates the robustness of GOHPro in protein function prediction under varying levels of network connectivity loss. Lines represent the decline in Fmax for DomainNet removal, ModuleNet removal, and their combination (DomainNet + ModuleNet removal) across different homology groups. (a) Yeast dataset. (b) Human dataset.
Fig. 7 Performance decline across network removal ratios. This figure illustrates the percentage decline in Fmax across different Gene Ontology (GO) categories under varying network removal conditions. The heatmaps compare the performance loss for DomainNet (green) removal, ModuleNet (purple) removal, and DomainNet + ModuleNet (red) removal. (a) Yeast dataset. (b) Human dataset.
Fig. 8 Performance decay slope comparison with network removal ratio. This figure compares the rate of performance decay (slope = change in Fmax/removal ratio) under three experimental conditions (DomainNet removal, ModuleNet removal, and DomainNet + ModuleNet removal) across three ontologies (BP, MF, CC). (a) Yeast dataset. (b) Human dataset.
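The edge-removal protocol described above can be sketched as follows. This is a minimal illustration that assumes edges are removed uniformly at random with a fixed seed; the text does not specify the sampling scheme, and the edge list, the weights, and the function name `remove_edges` are hypothetical.

```python
import random


def remove_edges(edges, fraction, seed=0):
    """Return a copy of `edges` with a random `fraction` of them removed."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the perturbation reproducible
    edges = list(edges)
    n_remove = round(fraction * len(edges))
    removed = set(rng.sample(range(len(edges)), n_remove))
    return [edge for idx, edge in enumerate(edges) if idx not in removed]


# Hypothetical weighted similarity edges: (protein_u, protein_v, weight).
toy_edges = [(f"P{i}", f"P{i + 1}", 0.5) for i in range(20)]

# Remove 10%, 20%, ..., 90% of the edges, as in the robustness experiments.
remaining = {
    ratio: len(remove_edges(toy_edges, ratio / 100))
    for ratio in range(10, 100, 10)
}
print(remaining)
```

In a full experiment, the prediction method would be re-run on each thinned network and Fmax recorded for every removal ratio.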
Our results revealed stark disparities between homology groups: Dark proteins exhibited significantly steeper performance decay across all ontologies (Fig. 6). For instance, in Yeast Biological Process (BP), 90% removal of edges in DomainNet + ModuleNet reduced Fmax by 76% (from 0.191 to 0.045) for Dark proteins, whereas High-homology groups retained 64% of their original performance (from 0.59 to 0.376). Similarly, in Human Cellular Component (CC), Dark groups experienced a 62% decline (from 0.091 to 0.035), contrasting the 33% drop (from 0.429 to 0.286) in High-homology proteins. These findings underscore the heightened dependency of Dark proteins on dense network connectivity for functional inference, likely due to their lack of conserved interaction patterns.
The impact of connectivity loss varied markedly across Gene Ontology (GO) categories. BP emerged as the most sensitive ontology, reflecting its hierarchical and context-dependent annotation structure. For Dark yeast proteins, DomainNet removal alone triggered a 48.5% decline in BP Fmax, surpassing the losses in MF (34.1%) and CC (19.9%) (Fig. 7). Conversely, CC showed the least decay in High-homology groups (≤ 16% loss), likely due to the strong utility of structural similarity in protein function predictions. This ontology-specific divergence highlights the need for tailored network design to address the unique annotation challenges of each GO category.
The contributions of DomainNet and ModuleNet to prediction robustness diverged significantly. DomainNet removal disproportionately impacted Dark groups. For example, Yeast CC Fmax dropped 76% in Dark proteins versus 33% in High-homology groups. This aligns with post-hoc Tukey’s HSD results, where DomainNet exhibited recall bias for conserved proteins. In contrast, ModuleNet removal had negligible effects (≤ 0.7% decline), validating its homology-agnostic robustness. When both networks were ablated, synergistic degradation occurred (Fig. 6), emphasizing their complementary roles: DomainNet captures structural conservation, while ModuleNet enables de novo functional inference through modular interactions.
To further quantify the rate of performance decay, we compared the decay slopes across different experimental conditions (Fig. 8). The decay slope, calculated as the ratio of Fmax loss to the removal ratio, provided a measure of how rapidly performance declined with increasing network sparsity. For Dark proteins, the decay slopes were consistently steeper across all ontologies and experimental conditions, indicating a higher sensitivity to network connectivity loss. For instance, in Yeast BP, the decay slope for Dark proteins was −1.9 per 10% removal in DomainNet + ModuleNet removal, compared to −0.7 for High-homology groups. Similarly, in Human CC, the decay slope for Dark proteins was −1.4, while High-homology groups showed a slope of −0.5. These results further support the notion that Dark proteins are more reliant on dense network connectivity for accurate functional inference.
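The percentage declines quoted in the text can be reproduced from the reported Fmax endpoints with a short sketch. The helper names are hypothetical, and `endpoint_slope` is only one possible reading of the decay slope (change in Fmax divided by change in removal ratio). Because the units used in Fig. 8 are not restated here, its output is not expected to match the slopes quoted above.

```python
def percent_decline(f_start, f_end):
    """Percentage of the starting Fmax that is lost."""
    return 100.0 * (f_start - f_end) / f_start


def endpoint_slope(ratio_start, ratio_end, f_start, f_end):
    """Change in Fmax per unit of removal ratio between two endpoints."""
    return (f_end - f_start) / (ratio_end - ratio_start)


# Fmax endpoints quoted in the text for 90% removal of DomainNet + ModuleNet edges.
yeast_bp_dark = percent_decline(0.191, 0.045)
yeast_bp_high_retained = 100.0 * 0.376 / 0.59
human_cc_dark = percent_decline(0.091, 0.035)
human_cc_high = percent_decline(0.429, 0.286)

# Illustrative slope under the assumption that the first value is the 0% baseline.
slope_yeast_bp_dark = endpoint_slope(0.0, 0.9, 0.191, 0.045)

print(round(yeast_bp_dark), round(yeast_bp_high_retained),
      round(human_cc_dark), round(human_cc_high))
```

The rounded values are 76, 64, 62, and 33, in agreement with the percentages reported for Yeast BP and Human CC.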
The vulnerability of Dark proteins to connectivity loss stems from their reliance on dense networks to propagate functional information in the absence of evolutionary conservation. High-homology proteins, conversely, leverage shared domain profiles and conserved interaction patterns to compensate for sparse networks. The hierarchical complexity of BP annotations, which requires context-rich modular interactions, explains their heightened sensitivity to edge removal. These insights advocate for multi-layered network integration, combining evolutionary signals (DomainNet) and modular context (ModuleNet), to address the annotation challenges of both conserved and poorly characterized proteins.
Our results demonstrate that network connectivity critically shapes prediction robustness, particularly for Dark proteins. While ModuleNet’s design mitigates dependency on sparse data, DomainNet exacerbates performance decay in conserved proteins due to structural redundancy. These findings underscore the necessity of ontology-specific network architectures and hybrid data integration strategies to advance functional annotation, especially for targets lacking evolutionary signatures. Future work will extend this framework to incorporate de novo structural predictions, bridging the gap between sequence and function for unannotated proteins.
Parameter analysis
When constructing the protein functional similarity network, we introduce a parameter γ to balance the weights of the domain structural similarity and the modular similarity. To assess the sensitivity of GOHPro to γ, Fig. 9 shows the Fmax score under varying values of γ for the BP, MF, and CC branches in both the Yeast and Human datasets. For the BP branch, Fmax climbs steadily as γ increases from 0 to 0.8, reaching peak values of 0.45 and 0.261 for the Yeast and Human datasets, respectively; it then fluctuates slightly before declining. In the MF branch, the Fmax for the Yeast dataset reaches a relatively high level of 0.548 at γ = 0.6, and the Fmax for the Human dataset peaks at 0.447 at the same value, followed by an overall downward trend. In the CC branch, Fmax rises consistently with γ, reaching its highest values of 0.58 for the Yeast dataset and 0.354 for the Human dataset at γ = 0.9. Based on these observations, we set γ to 0.8, 0.6, and 0.9 for the BP, MF, and CC branches, respectively. These values reflect the differing contributions of domain structural and modular similarities across ontologies: BP, which depends on dynamic complex interactions, requires a higher modular similarity weight (γ = 0.8), while MF balances domain (catalytic activity) and modular (partner interactions) contributions (γ = 0.6), in line with its dual reliance on structural and interactive features.
Fig. 9 Analysis of the parameter γ. The figure shows how the Fmax of GOHPro varies with γ for the BP, MF, and CC branches on the Yeast and Human datasets.
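The role of γ can be illustrated with a minimal sketch. It assumes, purely for illustration, that the combined functional similarity is a convex combination in which γ weights the modular similarity and 1 − γ weights the domain structural similarity; the exact formula used by GOHPro is defined with its network construction and may differ. The Fmax curve below is hypothetical and does not reproduce Fig. 9.

```python
def combine_similarity(domain_sim, module_sim, gamma):
    """Assumed convex combination of two similarity scores in [0, 1]."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    return gamma * module_sim + (1.0 - gamma) * domain_sim


def best_gamma(fmax_by_gamma):
    """Return the gamma value with the highest Fmax in a sweep."""
    return max(fmax_by_gamma, key=fmax_by_gamma.get)


# One protein pair with hypothetical similarity scores.
combined = combine_similarity(domain_sim=0.2, module_sim=0.6, gamma=0.8)

# Hypothetical sweep results (gamma -> Fmax), standing in for one curve of a sweep.
sweep = {0.0: 0.30, 0.2: 0.34, 0.4: 0.38, 0.6: 0.41, 0.8: 0.44, 1.0: 0.42}
chosen = best_gamma(sweep)
print(combined, chosen)
```

Selecting γ separately for each ontology, as done in the text, amounts to running such a sweep once per branch and keeping the maximizer.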
Testing on targets of the third CAFA challenge (CAFA3)
To assess whether the varying performances of the methods under evaluation, and the superiority of GOHPro, were intrinsic to the algorithms or attributed to differences in training sets, we re-executed our method alongside other competing methods and undertook retraining using the third Critical Assessment of Protein Function Annotation (CAFA3) dataset. In this experiment, we utilized proteins with functional annotations from the CAFA3 dataset as our training set, and proteins from the benchmark dataset used previously (excluding those in the CAFA3 dataset) as our test set. We then re-ran seven function prediction methods and compared their performances.
Table 9 presents the sizes of the training and test sets for Yeast and Human species across the three branches: BP, MF, and CC. A quick calculation based on Table 9 reveals that for the Yeast species, the ratios of the test set to the training set in the BP, MF, and CC branches are 0.08:1, 0.46:1, and 0.27:1, respectively. For the Human species, these ratios are 0.04:1, 0.18:1, and 0.22:1, respectively. Across species, the test set ratios for the Yeast species are higher than those for the Human species. Across branches, the proportion of the test set in BP does not exceed 0.1.
Table 10 compares the functional prediction performance of the seven methods on the CAFA3 dataset using four metrics: Fmax, Smin, AUPR, and AUC. For the three branches of the Yeast species (BP, MF, and CC), GOHPro achieved the highest Fmax values among all methods, reaching 0.473, 0.601, and 0.481, respectively. The improvement ratios over the exp2GO method surpassed 105% in all categories. For the Human species, GOHPro’s Fmax values in the BP, MF, and CC branches were also higher than those of all competing methods, attaining 0.174, 0.43, and 0.294, respectively. Although the improvement ratios over the exp2GO method were not as high as those for the Yeast species, they still exceeded 83%.
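For readers unfamiliar with the headline metric, the following minimal sketch computes a protein-centric Fmax in the usual CAFA style: precision is averaged over proteins with at least one prediction at a threshold, recall over all benchmark proteins, and the best harmonic mean over thresholds is kept. It also shows the improvement ratio as a relative gain. The scores, term labels, and baseline value are hypothetical and are not taken from Table 10.

```python
def fmax(predictions, truths, thresholds=None):
    """predictions: {protein: {term: score}}; truths: {protein: set of terms}."""
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truths.items():
            if not true_terms:
                continue  # proteins without annotations are not evaluated
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions or not recalls:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


def improvement_ratio(f_new, f_base):
    """Relative Fmax gain of a method over a baseline, in percent."""
    return 100.0 * (f_new - f_base) / f_base


# Hypothetical toy predictions and ground truth.
toy_predictions = {"P1": {"A": 0.9, "B": 0.4}, "P2": {"A": 0.6, "C": 0.3}}
toy_truths = {"P1": {"A"}, "P2": {"A", "C"}}

toy_fmax = fmax(toy_predictions, toy_truths)
toy_gain = improvement_ratio(0.6, 0.3)  # hypothetical method vs. baseline Fmax
print(toy_fmax, toy_gain)
```

Under this definition, an improvement ratio above 100% means that the method more than doubles the baseline Fmax.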
Furthermore, by comparing the experimental results of the leave-one-out method from the previous section with those of the current experiment, we found that due to alterations in the training and test sets, as well as noise and other factors, the predictive performance of the seven methods diminished to varying degrees. To investigate this further, we compared the sizes of the training sets used in the two experiments, as shown in Table 11. For the Yeast species, there was minimal change in the size of the training set between the two experiments; however, for the Human species, the sizes of the training sets in the CAFA3 dataset increased by 116.42%, 66.08%, and 46.64% for the BP, MF, and CC branches, respectively. We believe this is one of the reasons why the performance decline of all methods was relatively smaller for the Human species. Among all methods, the GOHPro method exhibited the least performance decline on the CAFA3 dataset, indicating its robust nature.
Discussion
Proteins interact with a vast range of molecules, playing a crucial role in almost every vital biological process. Understanding protein function is essential for unraveling the molecular basis of health, disease, evolution, and organismal functionality. However, combining protein functional attributes with multi-omics data remains a critical challenge in computational function prediction. Here, we present GOHPro, a heterogeneous network propagation method that integrates GO semantic hierarchies, protein-protein interactions (PPIs), domain profiles, and modular complexes. Evaluated across yeast and human datasets, GOHPro outperformed six state-of-the-art methods, with Fmax improvements in Molecular Function (MF) of at least 11% for yeast and 19% for human. Its robustness was validated on the CAFA3 benchmark, with Fmax gains exceeding 62% in human species, showcasing exceptional accuracy and generalizability.
The functional validation experiments underscore GOHPro’s capacity to disentangle functional ambiguity in proteins with shared domains. For instance, despite harboring identical AAA + ATPase domains, PEX6 and VCP were accurately annotated to peroxisomal organization and ATP hydrolysis activity, respectively, by leveraging context-specific interactions (e.g., PEX6-PEX5 complexes). This aligns with prior studies showing that functional divergence often stems from modular context rather than domain composition alone. Furthermore, the sensitivity analysis revealed critical dependencies: removing domain or interaction features (e.g., PEX6’s peroxisomal receptor module) abolished specific annotations, emphasizing the necessity of multi-omics integration. Notably, transient interactions (e.g., VCP-proteasome associations) occasionally introduced noise, highlighting challenges in distinguishing direct functional roles from indirect linkages, a limitation partially mitigated by our modular similarity weighting (Eq. 6).
The homology-driven performance disparities further illuminate GOHPro’s strengths. While ModuleNet excelled in annotating conserved proteins (e.g., ORC1’s DNA replication role), DomainNet enhanced predictions for dark proteins by compensating for sparse evolutionary signals. This synergy suggests that future methods should adopt hybrid strategies, combining evolutionary conservation with de novo interaction networks, especially for understudied proteomes. However, GOHPro’s reliance on dense networks poses challenges for proteins with sparse PPIs, such as dark proteins in the CAFA3 test set (human BP Fmax decreased from 0.261 to 0.174). Future work will address this by integrating sequence embeddings (e.g., transformer-based protein language models) and de novo structural predictions (e.g., AlphaFold 3D models), enabling function inference for unannotated sequences. The modular design of GOHPro’s heterogeneous network framework ensures extensibility, positioning it to bridge the annotation gap for proteomes lacking evolutionary or interaction data.
In addition to its superior performance, GOHPro demonstrates significant advantages in resolving functional ambiguity and capturing critical determinants of protein activity. For example, in the case of PEX6 and VCP, GOHPro accurately distinguishes their distinct biological roles by integrating contextual features such as interaction partners and protein complexes. This capability is crucial for understanding the diverse functions of proteins with shared structural domains, which is a common challenge in protein function prediction. The analysis of homology groups reveals that high-homology proteins benefit from conserved interaction patterns, while dark proteins rely more on dense network connectivity for accurate functional inference. This finding highlights the importance of multi-layered network integration, combining evolutionary signals (DomainNet) and modular context (ModuleNet), to address the annotation challenges of both conserved and poorly characterized proteins.
In summary, GOHPro represents a significant advancement in protein function prediction by effectively integrating multi-omics data within a heterogeneous network framework. Its ability to leverage contextual interactions and modular complexes makes it particularly effective in resolving functional ambiguity and improving prediction accuracy across diverse species and ontologies. Future extensions of this work will focus on incorporating advanced sequence and structural information to further enhance its performance, especially for proteins with limited annotation data.
Data availability
Data and code are available at https://github.com/husaiccsu/GOHPro.
References
Ruepp, A. et al. The funcat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Res. 32 (18), 5539–5545 (2004).
Radivojac, P. et al. A large-scale evaluation of computational protein function prediction. Nat. Methods. 10 (3), 221–227 (2013).
Schwikowski, B., Uetz, P. & Fields, S. A network of protein–protein interactions in yeast. Nat. Biotechnol. 18 (12), 1257–1261 (2000).
Bogdanov, P. & Singh, A. K. Molecular function prediction using neighborhood features. IEEE/ACM Trans. Comput. Biol. Bioinf. 7 (2), 208–217 (2009).
Friedberg, I. Automated protein function prediction—the genomic challenge. Brief. Bioinform. 7 (3), 225–242 (2006).
Rentzsch, R. & Orengo, C. A. Protein function prediction–the power of multiplicity. Trends Biotechnol. 27 (4), 210–219 (2009).
Jang, Y. J. et al. Accurate prediction of protein function using statistics-informed graph networks. Nat. Commun. 15 (1), 6601 (2024).
Liang, S. et al. A novel function prediction approach using protein overlap networks. BMC Syst. Biol. 7 (1), 61 (2013).
Peng, W. et al. Improving protein function prediction using domain and protein complexes in PPI networks. BMC Syst. Biol. 8 (1), 35 (2014).
Sun, P. et al. Protein function prediction using function associations in protein–protein interaction network. IEEE Access. 6, 30892–30902 (2018).
Liu, Y. W. et al. GODoc: high-throughput protein function prediction using novel k-nearest-neighbor and voting algorithms. BMC Bioinform. 21, 1–16 (2020).
Kulmanov, M., Zhapa-Camacho, F. & Hoehndorf, R. DeepGOWeb: fast and accurate protein function prediction on the (Semantic) web. Nucleic Acids Res. 49 (W1), W140–W146 (2021).
Sarker, B. et al. Improving automatic GO annotation with semantic similarity. BMC Bioinform. 23 (Suppl 2), 433 (2022).
Di Persia, L. et al. exp2GO: improving prediction of functions in the gene ontology with expression data. IEEE/ACM Trans. Comput. Biol. Bioinf. 20 (2), 999–1008 (2022).
Ibtehaz, N., Kagaya, Y. & Kihara, D. Domain-PFP allows protein function prediction using function-aware domain embedding representations. Commun. Biology. 6 (1), 1103 (2023).
Zhao, B. et al. An iteration method for identifying yeast essential proteins from heterogeneous network. BMC Bioinform. 20, 1–13 (2019).
Zhao, B. et al. NPF: network propagation for protein function prediction. BMC Bioinform. 21, 1–21 (2020).
Hu, S. et al. Protein function annotation based on heterogeneous biological networks. BMC Bioinform. 23 (1), 493 (2022).
Valentini, G. True path rule hierarchical ensembles for genome-wide gene function prediction. IEEE/ACM Trans. Comput. Biol. Bioinf. 8 (3), 832–847 (2010).
Yang, C. et al. Heterogeneous network representation learning: A unified framework with survey and benchmark. IEEE Trans. Knowl. Data Eng. 34 (10), 4854–4873 (2020).
Wang, Y. et al. Information propagation dynamics on heterogeneous-homogeneous coupling bi-layer networks. Sci. Rep. 14 (1), 30766 (2024).
Meldal, B. H. M. et al. Complex Portal 2022: new curation frontiers. Nucleic Acids Res. 50 (D1), D578–D586 (2022).
Zhao, C. & Wang, Z. GOGO: an improved algorithm to measure the semantic similarity between gene ontology terms. Sci. Rep. 8 (1), 15107 (2018).
Kulmanov, M. et al. DeepGO-SE: Protein function prediction as approximate semantic entailment. bioRxiv 2023.09.26.559473 (2023).
Oughtred, R. et al. The biogrid interaction database: 2019 update. Nucleic Acids Res. 47 (D1), D529–D541 (2019).
UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47 (D1), D506–D515 (2019).
Gene Ontology Consortium. The gene ontology resource: 20 years and still going strong. Nucleic Acids Res. 47 (D1), D330–D338 (2019).
Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49 (D1), D412–D419 (2021).
Tawfiq, R. et al. DeepGOMeta for functional insights into microbial communities using deep learning-based protein function prediction. Sci. Rep. 14 (1), 31813 (2024).
Zhou, N. et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol. 20, 1–23 (2019).
Wong, T. T. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recogn. 48 (9), 2839–2846 (2015).
Koehler Leman, J. et al. Sequence-structure-function relationships in the microbial protein universe. Nat. Commun. 14 (1), 2351 (2023).
Pedrosa, A. G. et al. Peroxisomal monoubiquitinated PEX5 interacts with the AAA ATPases PEX1 and PEX6 and is unfolded during its dislocation into the cytosol. J. Biol. Chem. 293 (29), 11553–11563 (2018).
Fujiki, Y. et al. Dynamic and functional assembly of the AAA peroxins, Pex1p and Pex6p, and their membrane receptor Pex26p involved in shuttling of the PTS1 receptor Pex5p in peroxisome biogenesis. Biochem. Soc. Trans. 36 (1), 109–113 (2008).
Fielden, J. et al. TEX264 coordinates p97-and SPRTN-mediated resolution of topoisomerase 1-DNA adducts. Nat. Commun. 11 (1), 1274 (2020).
Krastev, D. B. et al. The ubiquitin-dependent ATPase p97 removes cytotoxic trapped PARP1 from chromatin. Nat. Cell Biol. 24 (1), 62–73 (2022).
Asano, T. et al. Interaction between ORC and Cdt1p of Saccharomyces cerevisiae. FEMS Yeast Res. 7 (8), 1256–1262 (2007).
Nolden, M. et al. The m-AAA protease defective in hereditary spastic paraplegia controls ribosome assembly in mitochondria. Cell 123 (2), 277–289 (2005).
Arlt, H. et al. The YTA10–12 complex, an AAA protease with chaperone-like activity in the inner membrane of mitochondria. Cell 85 (6), 875–885 (1996).
McHugh, M. L. Multiple comparison analysis testing in ANOVA. Biochemia Med. 21 (3), 203–209 (2011).
Kim, H. Y. Statistical notes for clinical researchers: post-hoc multiple comparisons. Restor. Dentistry Endodontics. 40 (2), 172–176 (2015).
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China (No. 61772089), the Natural Science Foundation of Hunan Province (No. 2023JJ30071), the National Scientific Research Foundation of Hunan Province (No. 22A0591), and the Hunan Provincial Key Laboratory of Industrial Internet Technology and Security (No. 2019TP1011).
Author information
Contributions
BHZ obtained the protein-protein interaction data, protein domain profiles, Gene Ontology and GO annotations, and the protein complex information. SH designed the new method, GOHPro, and analysed the results. SH and BHZ drafted the manuscript together. All authors have read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hu, S., Zhao, B. Protein function prediction using GO similarity-based heterogeneous network propagation. Sci Rep 15, 19131 (2025). https://doi.org/10.1038/s41598-025-04933-1
DOI: https://doi.org/10.1038/s41598-025-04933-1