Abstract
Aim:
ΦC31 integrase mediates site-specific recombination between two short sequences, attP and attB, in the phage and bacterial genomes, respectively. It is a promising tool for gene regulation-based therapy because its zinc finger structure is probably the DNA-recognizing domain, which could be engineered further. The aim of this study was to screen potential pseudo att sites of ΦC31 integrase in the human genome and to evaluate the risks of its application in human gene therapy.
Methods:
Transcription factor binding sites (TFBS) were identified on the basis of reported pseudo att sites using multiple motif-finding tools, including AlignACE, BioProspector, Consensus, MEME, and Weeder. The human genome was then scanned with the proposed motif to find potential pseudo att sites of ΦC31 integrase.
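As a rough illustration of the scanning step, the sketch below builds a log-odds position weight matrix from a few aligned sites and scores every window of a sequence on both strands. It is not the pipeline used in the study: the toy sites, the uniform background, the pseudocount, and the score threshold are all assumptions made for the example, and the function names are hypothetical.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds matrix from equal-length aligned sites (uniform background assumed)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (position, strand, score) for windows scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if not set(window) <= set(BASES):
            continue  # skip windows containing N or other ambiguity codes
        for strand, word in (("+", window), ("-", revcomp(window))):
            score = sum(column[base] for column, base in zip(pwm, word))
            if score >= threshold:
                hits.append((start, strand, score))
    return hits


# Toy input: three made-up aligned sites, not real pseudo att sites.
toy_pwm = build_pwm(["GATTC", "GATTC", "GATAC"])
forward_hits = scan("TTGATTCAA", toy_pwm, threshold=5.0)
reverse_hits = scan(revcomp("TTGATTCAA"), toy_pwm, threshold=5.0)
```

On a whole genome the same idea would be applied chromosome by chromosome, with the threshold chosen from the score distribution of the known sites.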
Results:
A possible recognition motif of ΦC31 integrase was identified. It is composed of two co-occurring conserved elements that are reverse complements of each other and flank the core sequence TTG. In the human genome, a total of 27924 potential pseudo att sites of ΦC31 integrase were found. They were distributed across every human chromosome, with high-risk specificity values in chromosomes 16, 17, and 19. When the risks of the sites were evaluated more rigorously, 53 hits were discovered, some of which coincided with vital functional genes or regulatory regions, such as ACYP2, AKR1B1, and DUSP4.
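The motif architecture described above, two reverse-complementary elements flanking a TTG core, can be sketched as a simple search. The abstract does not give the element length, the element sequence, or the tolerated number of mismatches, so `arm` and `max_mismatches` below are hypothetical parameters and the test sequence is invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeat_sites(seq, core="TTG", arm=4, max_mismatches=0):
    """Return (start, left_arm, right_arm) for each core whose flanks are
    reverse complements of each other, allowing up to max_mismatches."""
    seq = seq.upper()
    hits = []
    # Start at index `arm` so a full left arm always exists.
    pos = seq.find(core, arm)
    while pos != -1:
        left = seq[pos - arm:pos]
        right = seq[pos + len(core):pos + len(core) + arm]
        if len(right) == arm:  # ignore cores too close to the sequence end
            mismatches = sum(a != b for a, b in zip(right, revcomp(left)))
            if mismatches <= max_mismatches:
                hits.append((pos - arm, left, right))
        pos = seq.find(core, pos + 1)
    return hits


# Invented example: GGCA + TTG + TGCC, where TGCC is the reverse complement of GGCA.
perfect = find_inverted_repeat_sites("AAGGCATTGTGCCTT")
# Same site with one substitution in the right arm (TGCC -> TGCA).
strict = find_inverted_repeat_sites("AAGGCATTGTGCATT")
relaxed = find_inverted_repeat_sites("AAGGCATTGTGCATT", max_mismatches=1)
```

Allowing mismatches matters here because pseudo sites are by definition imperfect copies of the native site; an exact-match search would miss most of them.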
Conclusion:
The results provide clues for more comprehensive evaluation of the risks of using ΦC31 integrase in human gene therapy and for drug discovery.
Acknowledgements
We thank Fei WANG from the Intelligent Information Processing Lab, Department of Computer Science of Fudan University, for kindly providing related materials. We also thank Rodolf FLEISCHER, also from the Department of Computer Science of Fudan University, for improving the manuscript.
Cite this article
Hu, Zp., Chen, Ls., Jia, Cy. et al. Screening of potential pseudo att sites of Streptomyces phage ΦC31 integrase in the human genome. Acta Pharmacol Sin 34, 561–569 (2013). https://doi.org/10.1038/aps.2012.173
DOI: https://doi.org/10.1038/aps.2012.173


