Abstract
Precise diagnosis of cancer type from microarray data is both important and challenging. We have devised a novel pattern recognition procedure based on independent component analysis (ICA). Unlike conventional cancer classification methods, whose clinical applicability to cancer diagnosis is limited, our method uses the ICA algorithm to explicitly extract a set of specific diagnostic patterns of normal and tumor tissues, corresponding to a set of biomarkers for clinical use. We validated the procedure on colon and prostate cancer data sets and achieved good diagnostic performance (>90%) on the data sets studied here. The technique is also suitable for identifying diagnostic expression patterns in other human cancers, and it demonstrates the feasibility of simple and accurate molecular cancer diagnostics for clinical implementation.
Acknowledgements
We thank the Hong Kong Innovation and Technology Fund BIOSUPPORT program for supporting the present research.
Additional information
Supplementary Information accompanies the paper on the European Journal of Human Genetics website (http://www.nature.com/ejhg)
About this article
Cite this article
Zhang, X., Yap, Y., Wei, D. et al. Molecular diagnosis of human cancer type by gene expression profiles and independent component analysis. Eur J Hum Genet 13, 1303–1311 (2005). https://doi.org/10.1038/sj.ejhg.5201495
DOI: https://doi.org/10.1038/sj.ejhg.5201495