  • Review Article
Structural genomics and its importance for gene function analysis

Abstract

Structural genomics projects aim to solve the experimental structures of all possible protein folds. Such projects entail a conceptual shift from traditional structural biology in which structural information is obtained on known proteins to one in which the structure of a protein is determined first and the function assigned only later. Whereas the goal of converting protein structure into function can be accomplished by traditional sequence motif-based approaches, recent studies have shown that assignment of a protein's biochemical function can also be achieved by scanning its structure for a match to the geometry and chemical identity of a known active site. Importantly, this approach can use low-resolution structures provided by contemporary structure prediction methods. When applied to genomes, structural information (either experimental or predicted) is likely to play an important role in high-throughput function assignment.
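The idea of scanning a structure for a match to the geometry and chemical identity of a known active site can be illustrated with a minimal sketch. This is not the authors' method. It assumes a hypothetical template made of residue types plus target Cα–Cα distances, and accepts any set of residues whose types match and whose distances fall within a tolerance. The template values, the tolerance, the function name `find_site_matches`, and the toy coordinates are all invented for illustration. A generous tolerance stands in for the coordinate error expected in a low-resolution or predicted model.

```python
from itertools import permutations
from math import dist

# Hypothetical active-site template (illustrative values, not from the article):
# residue types in a fixed order, and target C-alpha distances in angstroms
# between template positions.
TEMPLATE_RESIDUES = ["SER", "HIS", "ASP"]
TEMPLATE_DISTANCES = {(0, 1): 6.0, (1, 2): 8.0, (0, 2): 10.0}


def find_site_matches(residues,
                      template_residues=TEMPLATE_RESIDUES,
                      template_distances=TEMPLATE_DISTANCES,
                      tolerance=1.0):
    """Return index tuples of residues that match the template.

    residues: list of (residue_name, (x, y, z)) pairs, one per residue,
    using C-alpha coordinates only.
    A tuple matches when every residue has the required type (chemical
    identity) and every listed pair distance is within `tolerance`
    of its target (geometry).
    """
    matches = []
    n = len(template_residues)
    # Brute-force search over ordered residue tuples; fine for a toy example,
    # too slow for real proteins without pruning.
    for combo in permutations(range(len(residues)), n):
        if any(residues[i][0] != template_residues[k]
               for k, i in enumerate(combo)):
            continue
        if all(abs(dist(residues[combo[a]][1], residues[combo[b]][1]) - d)
               <= tolerance
               for (a, b), d in template_distances.items()):
            matches.append(combo)
    return matches


# Toy "structure" with slightly perturbed coordinates and two decoy residues.
toy_structure = [
    ("GLY", (3.0, 0.0, 0.0)),
    ("SER", (0.0, 0.0, 0.0)),
    ("HIS", (6.4, 0.3, 0.0)),
    ("SER", (50.0, 50.0, 50.0)),
    ("ASP", (6.0, 8.0, 0.0)),
]

if __name__ == "__main__":
    print(find_site_matches(toy_structure))
```

Tightening `tolerance` makes the search reject the perturbed toy site, which mirrors the trade-off in the text: a template that tolerates coordinate error can be applied to approximate models, at the cost of admitting more false positives.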

Author information

Corresponding author

Correspondence to Jeffrey Skolnick.

Cite this article

Skolnick, J., Fetrow, J. & Kolinski, A. Structural genomics and its importance for gene function analysis. Nat Biotechnol 18, 283–287 (2000). https://doi.org/10.1038/73723

  • DOI: https://doi.org/10.1038/73723
