Extended Data Fig. 4: VNTR genotyping using vamos. | Nature

From: Structural variation in 1,019 diverse humans based on long-read sequencing

a-c) Density plots comparing the range difference of repeat unit (RU) counts for different percentile ranges for VNTRs genotyped with vamos from our resource (“ONT”) and from multi-platform whole-genome assemblies20 (“HGSVC”). The plotted range is restricted to data points with x < 100 and y < 100 for visualisation purposes. Guide lines at y = x ± c, with c = 5, 20 and 50, mark the ranges shown in the legend below (a-c). Higher c values indicate more extreme cases in which one dataset reports a higher RU range than the other (Note S5 and Table S31). d-e) Distribution of the base-pair lengths and RU counts of the VNTR alleles genotyped by vamos on the ONT data and on the HGSVC assemblies. We depict the distributions for two VNTR loci, in the genes ABCA7 (chr19:1,012,105-1,014,401) in (d) and PLIN4 (chr19:4,494,323-4,497,243) in (e), which have been associated with late-onset human disease102,103. For the ABCA7 VNTR locus, alleles longer than 5,720 bp (dashed vertical line in (d)) are associated with late-onset disease, whereas for the PLIN4 VNTR locus, alleles with a repeat count of 40 (dashed vertical line in (e)) are disease-associated. We identify a VNTR allele with an RU count of 43 at the PLIN4 locus in sample NA20127 (outlier marked with an arrow); this RU count was confirmed by manual inspection (Fig. S62).
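As a rough illustration of how the guide lines y = x ± c in panels a-c can be read, the sketch below computes an RU-count range per dataset for one locus and reports the largest offset c that the difference between the two ranges exceeds. It is a minimal sketch under stated assumptions, not the procedure used for the figure: the function names (`percentile`, `ru_range`, `guide_band`), the linear-interpolation percentile and the example RU counts are all hypothetical, and the absolute difference is used because the caption does not state which dataset is plotted on which axis.

```python
from typing import Sequence

# Offsets c of the guide lines y = x +/- c, as given in the caption.
GUIDE_OFFSETS = (5, 20, 50)


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile q (0-100) of a non-empty sequence, by linear interpolation.

    Assumption: the interpolation scheme is illustrative only.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def ru_range(ru_counts: Sequence[float], low_q: float = 0.0, high_q: float = 100.0) -> float:
    """Width of the RU-count distribution between two percentiles."""
    return percentile(ru_counts, high_q) - percentile(ru_counts, low_q)


def guide_band(x_range: float, y_range: float, offsets: Sequence[int] = GUIDE_OFFSETS) -> int:
    """Largest offset c with |y - x| > c, or 0 if the point lies within the smallest band."""
    diff = abs(y_range - x_range)
    exceeded = [c for c in offsets if diff > c]
    return max(exceeded) if exceeded else 0


# Hypothetical RU counts for a single VNTR locus in each dataset.
ont_counts = [10, 12, 11, 40]
hgsvc_counts = [10, 11, 12, 12]

band = guide_band(ru_range(ont_counts), ru_range(hgsvc_counts))
print(band)
```

Passing other percentile pairs (for example `low_q=5, high_q=95`) to `ru_range` would correspond to the different percentile ranges compared across the panels; a larger returned band means a larger disagreement between the two datasets at that locus.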