Extended Data Fig. 1: Relationship between 5′-flanking sequences and FGF14 GAA repeat lengths.
From: A common flanking variant is associated with enhanced stability of the FGF14-SCA27B repeat locus

a, Schematic representation of the FGF14 gene (isoform 1b), showing the location of the (GAA)n·(TTC)n repeat locus in the first intron. The sequences of the reference 5′-flanking sequence (5′-RFS), the common 5′-flanking variant (5′-CFV; C4 variant), and the C1, C2, C3, C5, and short flanking variant sequences are shown. The sequences of the C1 through C5 variants are highlighted in blue. All sequences are presented relative to the positive strand (genomic context). b, Swarm plot, related to Fig. 1c, showing repeat lengths as estimated by PacBio HiFi sequencing for 4,382 alleles (from the Genomic Answers for Kids, Care4Rare-SOLVE, and All of Us cohorts), including alleles carrying each of the C1 through C5 sequence variants, separated into subgroups based on the presence of a single terminal adenine (A) or dual terminal adenines (AA). No alleles with C2AA or C5A 5′-flanking variants were found. Three C4 alleles with a single terminal adenine were observed; these were not counted as part of the 5′-CFV group. The y-axis of this plot is extended to show the two alleles with more than 800 repeat units carrying the 5′-RFS, which were not plotted in Fig. 1c for visual clarity. The color of each data point corresponds to the GAA repeat motif purity (a color legend is shown in the top right corner of the plot). The green dashed horizontal line indicates 30 GAA repeats. Abbreviations: 5′-CFV, common 5′-flanking variant; 5′-RFS, reference 5′-flanking sequence.