Fig. 1: Features of the 3691 long insertions (TMMINSs).
From: Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing

a Distribution of the 3691 insertions across the chromosomes. The red lines indicate the locations of the insertions on each chromosome, and the gray bands indicate the cytobands. b Distribution of the lengths of TMMINSs. The two prominent peaks correspond to Alus and LINEs. Left inner box: distribution of GC ratios, accompanied by entropy information. Right inner box: distribution of entropy, accompanied by GC ratio information. TMMINSs with high entropy tended to show medium GC ratios of ~0.5, and the peaks in the high-entropy region indicate that many TMMINSs had high complexity. c Repeat motif enrichment analysis of TMMINSs. The boxplot indicates the background distribution of the total number of motif classes. Each box represents the 25th and 75th percentiles of the total number of each motif. The notches represent the 1.5 × interquartile range. The red dots outside the notches indicate the motif classes enriched in TMMINSs. The other black dots show outliers. d The relative frequencies of nonreference alleles of TMMINSs, SNVs, and short indels are indicated as green, red, and blue lines, respectively. The nonreference allele frequency of each variant was calculated from the genotypes of 1070 individuals, and only variants found in JPN00001 were used. e Repeat categories and allele frequencies of TMMINSs in 1KJPN. The horizontal axis shows the allele frequencies of TMMINSs in 1KJPN, and the vertical axis shows the occupancy ratio of repeat motifs in TMMINSs.
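
The quantities plotted in panels b and d can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the caption does not state how entropy was defined, so single-base Shannon entropy is assumed here, and the function names and the genotype encoding (nonreference allele count per diploid individual) are hypothetical.

```python
import math
from collections import Counter


def gc_ratio(seq):
    """Fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def shannon_entropy(seq):
    """Single-base Shannon entropy in bits (0 to 2 for DNA).

    Assumption: the caption does not specify the entropy definition;
    a k-mer based measure would be an equally plausible choice.
    """
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    n = len(bases)
    counts = Counter(bases)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def nonref_allele_frequency(genotypes):
    """Nonreference allele frequency from diploid genotypes.

    genotypes: nonreference allele count per individual (0, 1, or 2);
    None marks a missing call and is excluded from the denominator.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return float("nan")
    return sum(called) / (2 * len(called))
```

Under these assumptions, a sequence with equal base composition has a GC ratio of 0.5 and the maximum entropy of 2 bits, while a homopolymer has an entropy of 0. This matches the caption's observation that high-entropy insertions cluster around medium GC ratios.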