Figure 1: Assembly and variation of 18 genomes of A. thaliana. | Nature

From: Multiple reference genomes and transcriptomes for Arabidopsis thaliana

a, Classification of sequence, SNPs and indels based on the Col-0 genome. b, Assembly accuracy (y axis; base substitution errors per 10 kb) measured relative to four validation data sets at each of eight stages in the IMR/DENOM assembly pipeline (x axis). Bur-0 survey (blue line): 1,442 survey sequences (about 417 bp each) in predominantly genic regions (ref. 19); Bur-0 divergent (red line): 188 sequences (each about 254 bp) highly divergent from Col-0 (ref. 3); Ler-0 nonrepetitive (orange line): a predominantly single-copy 175-kb Ler-0 sequence on chromosome 5; Ler-0 repetitive (purple line): a highly repetitive 339-kb Ler-0 locus on chromosome 3 (ref. 18; Supplementary Information section 4). Iter, iteration. c, Genome-wide distribution of the minimum clade size for all pairs of accessions (excluding Po-0). Each pair is represented by a grey line, the mean over all pairs by the black line and the random distribution by the green line. d, Decay in linkage disequilibrium with distance (Po-0 excluded). The black line shows r² between SNPs; the red line shows phylogenetic r² (Supplementary Information section 6).
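
For readers unfamiliar with the r² statistic plotted in panel d, the following is a minimal sketch of the standard squared-correlation measure of linkage disequilibrium between two biallelic SNPs. It assumes each accession is inbred and can be coded as 0 (reference allele) or 1 (alternative allele) at each site. The function name `ld_r2` and the input encoding are illustrative assumptions, not the authors' pipeline, and the phylogenetic r² shown by the red line is a different statistic that this sketch does not cover.

```python
def ld_r2(snp_a, snp_b):
    """Return r^2 between two biallelic SNPs.

    snp_a and snp_b are equal-length sequences of 0/1 allele codes,
    one entry per accession (hypothetical encoding for illustration).
    """
    if len(snp_a) != len(snp_b) or len(snp_a) == 0:
        raise ValueError("SNP vectors must be non-empty and of equal length")

    n = len(snp_a)
    p_a = sum(snp_a) / n                                   # frequency of allele 1 at site A
    p_b = sum(snp_b) / n                                   # frequency of allele 1 at site B
    p_ab = sum(a * b for a, b in zip(snp_a, snp_b)) / n    # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                                # undefined if either site is monomorphic
    return d * d / denom


# Two perfectly associated sites, then two independent sites.
linked = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
unlinked = ld_r2([0, 1, 0, 1], [0, 0, 1, 1])
```

A decay curve like the one in panel d would then be obtained by averaging such values over SNP pairs binned by physical distance; the binning scheme is left unspecified here because the caption does not state it.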
