Fig. 2: Population properties of newly identified SVs.
From: The biomedical landscape of genomic structural variation in the Qatari population

a Distribution of SVs by type on a log scale, showing the number of deletions, duplications, insertions, translocations and inversions recovered by our SV consensus calling pipeline. b Size vs number of inferred SVs; most SVs were smaller than 100 kb. c Most SVs were rare (allele frequency <1% in 86.8% of all SVs). d, e Principal component analysis (PCA) of common SV genotypes and SNP genotypes, shown as PC1 vs PC2. Points are colored by genetic ancestry and shaped by cohort (QGP, TGP). Panel e has been adapted from Razali et al. 2021 (ref. 6). f Pairwise FST values between populations, with colors ranging from blue (low differentiation, 0) to red (high differentiation, 0.06). g Waterfall plot depicting the SV counts per subpopulation. h, i Summary box plots showing per-sample numbers of homozygous (h) and heterozygous (i) deletions across QGP ancestries (n = 6141), with significant enrichment (determined using the two-sided Wilcoxon test) indicated relative to Peninsular Arabs (light blue) for homozygous deletions and relative to African Arabs (orange) for heterozygous deletions. Boxes show the median (centre line) and 25th–75th percentiles (bounds); whiskers extend to 1.5 × IQR; points beyond the whiskers are outliers. No error bars are plotted beyond the box-plot elements. Pairwise p-values (h, vs PAR): PAR vs QGP_ADM, p < 2.2 × 10⁻¹⁶; PAR vs QGP_AFR, p < 2.2 × 10⁻¹⁶; PAR vs QGP_GAR, p < 2.2 × 10⁻¹⁶; PAR vs QGP_SAS, p = 7.71 × 10⁻¹²; PAR vs QGP_WEP, p < 2.2 × 10⁻¹⁶. Pairwise p-values (i, vs AFR): AFR vs QGP_ADM, p < 2.2 × 10⁻¹⁶; AFR vs QGP_PAR, p < 2.2 × 10⁻¹⁶; AFR vs QGP_GAR, p < 2.2 × 10⁻¹⁶; AFR vs QGP_SAS, p < 2.2 × 10⁻¹⁶; AFR vs QGP_WEP, p < 2.2 × 10⁻¹⁶.
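
The box-plot elements described for panels h and i can be illustrated with a minimal Python sketch. It assumes the common Tukey convention, in which each whisker ends at the most extreme data point lying within 1.5 × IQR of the box, and it computes quartiles by linear interpolation; the caption does not state which quartile method or plotting software was used, so both are assumptions. The function names and the input counts are hypothetical and are not data from the study.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = (len(sorted_values) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def box_plot_summary(values):
    """Return the median, quartiles, whisker ends and outliers for one box."""
    data = sorted(values)
    q1 = quantile(data, 0.25)
    med = quantile(data, 0.50)
    q3 = quantile(data, 0.75)
    iqr = q3 - q1
    # Fences at 1.5 x IQR beyond the box (assumed Tukey convention).
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        # Whiskers stop at the most extreme observations inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Anything beyond the fences is drawn as an individual outlier point.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical per-sample deletion counts, for illustration only.
example_counts = [10, 12, 13, 14, 15, 16, 17, 18, 40]
summary = box_plot_summary(example_counts)
print(summary)
```

With these illustrative counts the box spans 13 to 17 around a median of 15, the whiskers end at 10 and 18, and the value 40 falls beyond the upper fence and would be plotted as an outlier.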