Multi-algorithm and multi-model based drug target prediction and web server

Liu, Ying-tao; Li, Yi; Huang, Zi-fu; Xu, Zhi-jian; Yang, Zhuo; Chen, Zhu-xi; Chen, Kai-xian; Shi, Ji-ye; Zhu, Wei-liang

doi:10.1038/aps.2013.153

Original Article
Published: 03 February 2014

Acta Pharmacologica Sinica volume 35, pages 419–431 (2014)

Abstract

Aim:

To develop a reliable computational approach for predicting potential drug targets based solely on protein sequence.

Methods:

Drug target and non-target datasets were prepared, and a multi-algorithm, multi-model strategy based on 3 classification algorithms (Support Vector Machine, Neural Network and Decision Tree) was employed to construct models for predicting potential drug targets.
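As a rough illustration of what a multi-algorithm, multi-model strategy can look like, the sketch below combines the votes of several models per algorithm into a single call for one protein. The abstract does not state how the individual models are combined, so the voting rule (majority vote within each algorithm, then unanimous agreement across algorithms), the label encoding, and the names `majority`, `consensus` and `example` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical encoding: 1 = model predicts "target", 0 = "non-target".
# Keys are algorithm names; values are the votes of that algorithm's models.
Votes = Dict[str, List[int]]


def majority(labels: List[int]) -> int:
    """Return 1 if strictly more models vote 1 than 0; ties fall to 0."""
    counts = Counter(labels)
    return 1 if counts[1] > counts[0] else 0


def consensus(votes: Votes) -> str:
    """Combine per-model votes into one call (assumed rule, not the paper's)."""
    if not votes:
        raise ValueError("at least one algorithm is required")
    # Step 1: one decision per algorithm, by majority vote over its models.
    per_algorithm = {name: majority(labels) for name, labels in votes.items()}
    # Step 2: compare the per-algorithm decisions across algorithms.
    agreeing = sum(per_algorithm.values())
    if agreeing == len(per_algorithm):
        return "target"
    if agreeing == 0:
        return "non-target"
    return "uncertain"


# Made-up votes for a single protein: three models per algorithm.
example = {
    "svm": [1, 1, 0],
    "neural_network": [1, 1, 1],
    "decision_tree": [1, 0, 1],
}
print(consensus(example))
```

Requiring agreement across algorithms trades coverage for confidence: a protein is only called a target when every algorithm's models lean the same way, and disagreements are surfaced as "uncertain" rather than forced into either class.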

Results:

Twenty-one prediction models for each of the 3 algorithms were successfully developed. Our evaluation showed that ∼30% of human proteins were potential drug targets, and that ∼40% of putative targets for drugs undergoing phase II clinical trials were probably non-targets. A public web server named D3TPredictor (http://www.d3pharma.com/d3tpredictor) was constructed to provide easy access.

Conclusion:

Reliable and robust drug target prediction based on protein sequences is achieved using the multi-algorithm and multi-model strategy.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (81273435 and 21021063) and National Science & Technology Projects (2012ZX09301001-004, 2012AA01A305, and 2013ZX09103001-001). Computational resources were provided by the supercomputer TianHe-I in Tianjin and the Shanghai Supercomputing Center (SCC). The authors thank the developers of free and/or open source software for academic use, including SignalP-3.0, netOglyc-3.1d, netNglyc-1.0, tmhmm-2.0c and EMBOSS-6.0.1.

Author information

Ying-tao Liu, Yi Li and Zi-fu Huang: The first three authors contributed equally to this work.

Authors and Affiliations

Drug Discovery and Design Center, Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
Ying-tao Liu, Yi Li, Zi-fu Huang, Zhi-jian Xu, Zhuo Yang, Zhu-xi Chen, Kai-xian Chen & Wei-liang Zhu
Informatics Department, UCB Pharma, 216 Bath Road, Slough, SL1 4EN, UK
Ji-ye Shi

Corresponding authors

Correspondence to Ji-ye Shi or Wei-liang Zhu.

Additional information

Dataset S1. Classification of predicted targets from 20 025 proteins in the human proteome using the multi-algorithm and multi-model strategy presented in this work (EXCEL).

Tables S1–S8. Quantitative results, in table form, for Figures 2, 3, 4, 5, 6, 7, and 9 (DOC).

Supplementary information is available at Acta Pharmacologica Sinica's website.

Supplementary information

Tables S1–S8: quantitative results, in table form, for Figures 2, 3, 4, 5, 6, 7 and 9 (DOC, 254 kb)

Dataset S1: the first sheet, Full Targets, lists the classification of predicted full targets; the second sheet, Quasi Targets, lists the classification of predicted quasi targets (XLS, 818 kb)

About this article

Cite this article

Liu, Yt., Li, Y., Huang, Zf. et al. Multi-algorithm and multi-model based drug target prediction and web server. Acta Pharmacol Sin 35, 419–431 (2014). https://doi.org/10.1038/aps.2013.153

Received: 20 July 2013
Accepted: 23 September 2013
Published: 03 February 2014
Issue date: March 2014
DOI: https://doi.org/10.1038/aps.2013.153
