Figure 1
From: Analysis of plant diversity with retrotransposon-based molecular markers

Organization of the LTR retrotransposon genome. The order of coding domains differs between the Copia and Gypsy superfamilies as shown. Retrotransposons are bounded by long terminal repeats (LTRs), which contain the transcriptional promoter and terminator (indicated diagrammatically by a bent arrow and a stop sign, respectively). The resulting transcript is indicated as a hatched box between the Gypsy and Copia diagrams. Each LTR has short inverted repeats at its ends (shown as filled triangles). Reverse transcription is primed at the primer binding site (PBS) for the (−) strand and at the polypurine tract (PPT) for the (+) strand of the complementary DNA (cDNA). The internal region of the retrotransposon codes for the proteins necessary for the retrotransposon life cycle and is generally divided into two open reading frames: GAG, encoding the capsid protein, which packages the transcript into a virus-like particle, and POL, encoding the other proteins. POL encodes aspartic proteinase (AP), which cleaves the polyprotein; integrase (IN), which inserts the cDNA copy into the genome; and reverse transcriptase (RT) and RNaseH (RH), which together copy the transcript into cDNA. An additional open reading frame for the envelope protein (ENV), found in some groups of Gypsy elements, is indicated. The LTRs are generally well conserved within families and can serve for the design of primers to generate DNA footprints (Figure 2). Direct repeats in the flanking genomic DNA are generated upon retrotransposon integration; these are depicted as short, hatched arrows. The flanking genomic DNA is shown as a wavy line. The juxtaposition of a long element bearing conserved sequences with genomic DNA of random sequence is the basis for retrotransposon marker methods.
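
The layout described in this caption can be sketched in a few lines of Python. This is an illustrative model only. The domain orders given for Copia and Gypsy are the commonly cited ones and are not stated in the caption text itself, which defers to the diagram. The names `DOMAIN_ORDER`, `element_layout` and `integrate` are hypothetical, as is the default direct-repeat length of 5 bp.

```python
# Assumed 5'->3' order of coding domains (commonly cited; the caption only
# says that the order differs between the two superfamilies).
DOMAIN_ORDER = {
    "Copia": ["GAG", "AP", "IN", "RT", "RH"],
    "Gypsy": ["GAG", "AP", "RT", "RH", "IN"],
}


def element_layout(superfamily, with_env=False):
    """Return the ordered features of an integrated element, flanks included."""
    if superfamily not in DOMAIN_ORDER:
        raise ValueError(f"unknown superfamily: {superfamily}")
    if with_env and superfamily != "Gypsy":
        raise ValueError("ENV is described only for some Gypsy groups")
    # PBS follows the 5' LTR; PPT precedes the 3' LTR.
    internal = ["PBS"] + DOMAIN_ORDER[superfamily]
    if with_env:
        internal.append("ENV")
    internal.append("PPT")
    # "TSD" marks the direct repeats created in the flanking genomic DNA.
    return ["TSD", "5'LTR"] + internal + ["3'LTR", "TSD"]


def integrate(genome, pos, element, tsd_len=5):
    """Insert `element` at `pos`, duplicating `tsd_len` bases of the target.

    tsd_len is an illustrative assumption; real lengths vary by element.
    """
    if pos < 0 or pos + tsd_len > len(genome):
        raise ValueError("target site does not fit in genome")
    tsd = genome[pos:pos + tsd_len]
    # The target site ends up on both sides of the element as a direct repeat.
    return genome[:pos] + tsd + element + tsd + genome[pos + tsd_len:]


if __name__ == "__main__":
    print(" - ".join(element_layout("Copia")))
    print(" - ".join(element_layout("Gypsy", with_env=True)))
    print(integrate("AAACCCGGGTTT", 3, "[element]", tsd_len=3))
```

In the toy `integrate` call, the target site `CCC` appears on both sides of the inserted element, mirroring the direct repeats drawn as short hatched arrows. The conserved element sequence next to arbitrary flanking sequence is the arrangement that marker methods exploit.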