Fig. 2: Inference of the origin of the inserted sequence under two hypotheses.

a Structural differences between the reference (top) and observed (bottom) sequences. Light green and purple indicate the reference and inserted sequences, respectively. The unresolved mismatch (G allele) near the PTENα breakpoint is highlighted by an orange box. b Hypothesis 1 for the origin of the unresolved G allele. Hypothesis 1 assumes a 46-bp deletion and a 67-bp insertion, in which the G allele arises from an alteration of the PTENα sequence by an SNP (rs1007956565, A/C). This SNP allows two possible sequences at the PTENα breakpoint; however, neither can account for the unresolved mismatch (G allele). c Hypothesis 2 for the origin of the unresolved G allele. Hypothesis 2 assumes a 47-bp deletion and a 68-bp insertion, in which the G allele arises from an alteration within the inserted sequence (purple). The two candidate sources of the inserted sequence are (i) the reverse complement of a region (chr1:569503–569570) within a nuclear mitochondrial sequence (chr1:564465–570304) and (ii) the reverse complement of part of the mitochondrial genome (chrM:8955–9022). These two candidate regions have identical sequences; however, the nuclear mitochondrial sequence on chromosome 1 carries an SNP (rs1198320487, A/G on the reverse strand). One of the two possible sequences at this SNP can account for the unresolved mismatch (G allele). The source of the inserted sequence is therefore likely to be the reverse complement of the region (chr1:569503–569570) within the nuclear mitochondrial sequence (chr1:564465–570304), carrying the alternative G allele at the rs1198320487 site.
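
The interval arithmetic behind the two hypotheses can be checked with a minimal sketch. It assumes the coordinates in the caption are 1-based and inclusive, which the caption does not state; the helper names `interval_length` and `reverse_complement` are illustrative and not part of the original analysis. Under that assumption, both candidate source regions span 68 bp, matching the insertion size in Hypothesis 2, and both hypotheses imply the same net length change of 21 bp.

```python
# Assumption: coordinates are 1-based and inclusive (not stated in the caption).
def interval_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    return end - start + 1


# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Candidate source regions named in panel c.
numt_len = interval_length(569503, 569570)  # region on chr1
chrm_len = interval_length(8955, 9022)      # region on chrM

# Net change in sequence length (insertion minus deletion) per hypothesis.
net_h1 = 67 - 46  # Hypothesis 1
net_h2 = 68 - 47  # Hypothesis 2

print(numt_len, chrm_len, net_h1, net_h2)
```

Because the two hypotheses give the same net length change, they cannot be separated by length alone; they differ only in where the breakpoint is placed, and therefore in whether the G allele is attributed to the PTENα side or to the inserted sequence.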