Figure 4

(a) Complete assembly of the risk and non-risk MUC1-VNTR alleles from families F1–9, which tested positive. For each family, the structure of the risk MUC1-VNTR allele, the exact position of the single mutant repeat unit (with the exception of family 3) and its sequence context were determined. The VNTR is shown as a representative assembly of 60mer units covering hg19 chr1 positions 155,160,963 to 155,162,030 (inclusive), oriented relative to the MUC1 coding strand (human GRCh37/hg19; negative strand). Repeat units shown in red contain the insertion of an additional C into the seven-C stretch at relative positions 53–59 of a single repeat unit. Non-risk alleles are shown for all individuals. Uniform pseudo-repeat units 1–5 and 6–9 flanking the variable repeat region are underlined. The assembly of the hypervariable VNTR is not arbitrary but follows uniform patterns: certain stretches of units are conserved in all individuals across families, regardless of VNTR allele size. Uniform repeat type stretches are highlighted with different colours.

(b) Complete assembly of both MUC1-VNTR alleles from families F10–16, which tested negative. For each family, the structures of both MUC1-VNTR alleles were determined. The VNTR is shown as a representative assembly of 60mer units covering hg19 chr1 positions 155,160,963 to 155,162,030 (inclusive), oriented relative to the MUC1 coding strand (hg19 negative strand). Uniform pseudo-repeat units 1–5 and 6–9 flanking the variable repeat region are underlined. Uniform repeat type stretches are highlighted with different colours. In the asymptomatic individual III-7 of F12, the fixed unit 6′ lacks the last 18 bases (marked with an *) and unit 7 is completely missing.
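The coordinate and repeat-unit conventions used in this legend can be illustrated with a minimal Python sketch. It is not the assembly method used to produce the figure. The function names are hypothetical, and the example units are synthetic placeholders (filler bases followed by a C stretch), not the real MUC1 repeat sequence. The sketch assumes each unit has already been extracted as a string oriented on the coding strand.

```python
import re

UNIT_LEN = 60      # length of a reference repeat unit (from the legend)
C_RUN_START = 53   # 1-based start of the C stretch within a unit
C_RUN_LEN = 7      # reference length of the C stretch (positions 53-59)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def span_length(start: int, end: int) -> int:
    """Number of bases in an inclusive coordinate interval."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Orient a plus-strand sequence relative to the negative strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def c_run_length(unit: str) -> int:
    """Length of the run of C bases beginning at 1-based position 53."""
    match = re.match(r"C+", unit[C_RUN_START - 1:])
    return len(match.group(0)) if match else 0


def classify_unit(unit: str) -> str:
    """Label a unit as 'reference', 'C-insertion' or 'other'."""
    run = c_run_length(unit)
    if len(unit) == UNIT_LEN and run == C_RUN_LEN:
        return "reference"
    if len(unit) == UNIT_LEN + 1 and run == C_RUN_LEN + 1:
        return "C-insertion"
    return "other"


# Synthetic placeholder units (NOT the real MUC1 repeat sequence):
# 52 filler bases, then the C stretch, then one trailing base.
REFERENCE_UNIT = "AG" * 26 + "C" * 7 + "A"
MUTANT_UNIT = "AG" * 26 + "C" * 8 + "A"

if __name__ == "__main__":
    print(span_length(155_160_963, 155_162_030))
    print(classify_unit(REFERENCE_UNIT), classify_unit(MUTANT_UNIT))
```

Under these assumptions, a unit carrying the extra C is one base longer than the reference unit and has an eight-C run starting at relative position 53, which is the criterion the sketch uses to flag it.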