  • Analysis

A gene ontology inferred from molecular networks

Abstract

Ontologies have proven very useful for capturing knowledge as a hierarchy of terms and their interrelationships. In biology a major challenge has been to construct ontologies of gene function given incomplete biological knowledge and inconsistencies in how this knowledge is manually curated. Here we show that large networks of gene and protein interactions in Saccharomyces cerevisiae can be used to infer an ontology whose coverage and power are equivalent to those of the manually curated Gene Ontology (GO). The network-extracted ontology (NeXO) contains 4,123 biological terms and 5,766 term-term relations, capturing 58% of known cellular components. We also explore robust NeXO terms and term relations that were initially not cataloged in GO, a number of which have now been added based on our analysis. Using quantitative genetic interaction profiling and chemogenomics, we find further support for many of the uncharacterized terms identified by NeXO, including multisubunit structures related to protein trafficking or mitochondrial function. This work enables a shift from using ontologies to evaluate data to using data to construct and evaluate ontologies.
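The idea of deriving a hierarchy of terms from interaction data can be illustrated with a minimal sketch. It is not the procedure used to build NeXO: it assumes a plain average-linkage agglomerative clustering over a toy weighted network, and the gene names, weights, term identifiers and the function name `infer_ontology` are all hypothetical. It also yields a strict tree, whereas an ontology such as GO allows a term to have several parents.

```python
from itertools import combinations


def infer_ontology(genes, weights):
    """Build a tree of terms by average-linkage agglomerative clustering.

    genes   : list of gene names (hypothetical placeholders here)
    weights : dict mapping frozenset({a, b}) -> interaction similarity;
              missing pairs are treated as similarity 0.0
    Returns (terms, relations), where terms maps a term id to its gene set
    and relations is a list of (child, parent) pairs.
    """
    # Each gene starts as its own leaf cluster, keyed by its name.
    clusters = {g: frozenset([g]) for g in genes}
    terms, relations = {}, []
    next_id = 1

    def linkage(a, b):
        # Mean similarity over all gene pairs spanning the two clusters.
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(weights.get(frozenset(p), 0.0) for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the highest average similarity.
        a, b = max(combinations(sorted(clusters), 2),
                   key=lambda ab: linkage(*ab))
        term_id = f"T{next_id}"
        next_id += 1
        merged = clusters.pop(a) | clusters.pop(b)
        clusters[term_id] = merged
        terms[term_id] = merged
        relations.extend([(a, term_id), (b, term_id)])
    return terms, relations


# Toy network (invented values, for illustration only).
genes = ["geneA", "geneB", "geneC", "geneD"]
weights = {
    frozenset({"geneA", "geneB"}): 0.9,
    frozenset({"geneC", "geneD"}): 0.8,
    frozenset({"geneA", "geneC"}): 0.2,
    frozenset({"geneB", "geneD"}): 0.1,
}

terms, relations = infer_ontology(genes, weights)
for term_id, members in terms.items():
    print(term_id, sorted(members))
```

Each merge plays the role of a term, and each (child, parent) pair plays the role of a term-term relation. Comparing such inferred terms against a curated ontology would require a separate alignment step, which this sketch does not attempt.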

Figure 1: Automated assembly and alignment of gene ontologies.
Figure 2: The NeXO ontology.
Figure 3: Validation.
Figure 4: Evaluation of protein trafficking terms using genetic interaction profiling.
Figure 5: Updating GO with additional terms and term relations.

Acknowledgements

We are grateful to J. Bader and Y. Park for assistance with the HAC-ML software and to I. Lee and E. Marcotte for advice on YeastNet. We would also like to thank D. Botstein, C. Boone, R. Sharan and R. Shamir for constructive advice on the manuscript. This work was generously supported by US National Institutes of Health grants P41-GM103504 and P50-GM085764 (T.I.) and R01-GM084448 and P50-GM081879 (N.J.K.). N.J.K. is a Searle Scholar and Keck Young Investigator.

Author information

Contributions

J.D. and T.I. conceived and designed the analysis. J.D. performed initial data analysis, constructed the NeXO ontology and performed all computational experiments. M.K. designed and implemented the ontology alignment procedure with guidance from J.D. M.S. and N.J.K. performed the quantitative genetic interaction profiling and interpreted the data. R.B. and J.M.C. investigated and curated the new ontology terms and relations. J.D. and T.I. wrote the manuscript. All authors contributed to the manuscript and approved its final version.

Corresponding authors

Correspondence to Janusz Dutkowski or Trey Ideker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Table 1, Supplementary Note and Supplementary Figs. 1–10 (PDF 3024 kb)

Supplementary Table 2

New terms in NeXO (XLSX 189 kb)

Supplementary Table 3

New term-term relations (XLSX 15 kb)

Supplementary Table 4

Genetic interaction profiles for 73 query genes (XLSX 531 kb)

Supplementary Table 5

New NeXO terms enriched within chemogenomic profiles (XLSX 53 kb)

Supplementary Table 6

Integrated high-confidence interaction network (XLSX 1745 kb)

Supplementary File 1

NeXO in Cytoscape Format (File NeXO.cys provided separately) Direct download: http://chianti.ucsd.edu/~janusz/nexo/NeXO.zip (ZIP 7119 kb)

Supplementary File 2

NeXO in Open Biomedical Ontology (OBO) Format (File NeXO.obo provided separately) (ZIP 422 kb)

About this article

Cite this article

Dutkowski, J., Kramer, M., Surma, M. et al. A gene ontology inferred from molecular networks. Nat Biotechnol 31, 38–45 (2013). https://doi.org/10.1038/nbt.2463

  • DOI: https://doi.org/10.1038/nbt.2463
