Extended Data Fig. 4: Heterozygosity in structural variation.
From: From genotype to phenotype with 1,086 near telomere-to-telomere yeast genomes

All panels in this figure are based on the 396 heterozygous isolates for which phased assemblies were constructed. a. Number of SVs per isolate, colored by zygosity. b. Distribution of the fraction of heterozygous SVs per isolate. c. Spearman correlation between the fraction of heterozygous SVs and the SNP heterozygosity per isolate, the latter computed as the number of heterozygous loci divided by the total number of callable positions. d. Heterozygosity level (defined as the frequency at which an SV is found heterozygous) of SVs located in subtelomeric versus non-subtelomeric regions. The middle bar of the box plots corresponds to the median; the upper and lower bounds correspond to the third and first quartiles, respectively. The whiskers extend to 1.5 times the interquartile range (IQR) beyond the upper and lower bounds. The P value was calculated using a two-sided Mann-Whitney-Wilcoxon test (**** indicates P < 2.2 × 10⁻¹⁶). e. Heterozygosity level per type of SV. Box plot elements are as defined in d. Letters distinguish groups between which a two-sided Mann-Whitney-Wilcoxon test with FDR correction is significant (P < 0.05). f. Average heterozygosity level as a function of SV length (1-kb bins). The upper and lower bounds indicate the mean plus and minus the standard deviation, respectively. The Spearman correlation between SV length and heterozygosity level was computed. g. SV length per type of SV. Box plot elements are as defined in d. Letters distinguish groups between which a two-sided Mann-Whitney-Wilcoxon test with FDR correction is significant (P < 0.05).
The bimodal distribution of PAV length reflects the presence of SVs associated with Ty elements (~6 kb) and solo long terminal repeats (LTRs, ~300 bp).
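
The three quantities defined in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotype records, isolate and SV names, and function names are all hypothetical, and it assumes that the heterozygosity level of an SV is the fraction of isolates carrying that SV in which it is heterozygous.

```python
from collections import defaultdict

# Hypothetical toy calls: (isolate, SV identifier, zygosity of the SV in that isolate).
calls = [
    ("iso1", "sv1", "het"),
    ("iso1", "sv2", "hom"),
    ("iso2", "sv1", "het"),
    ("iso2", "sv3", "het"),
    ("iso3", "sv1", "hom"),
    ("iso3", "sv2", "hom"),
]


def _het_fraction(calls, key_index):
    """Fraction of 'het' calls, grouped by the chosen field of each record."""
    total = defaultdict(int)
    het = defaultdict(int)
    for record in calls:
        key = record[key_index]
        total[key] += 1
        if record[2] == "het":
            het[key] += 1
    return {key: het[key] / total[key] for key in total}


def het_fraction_per_isolate(calls):
    """Panels b and c: fraction of an isolate's SVs that are heterozygous."""
    return _het_fraction(calls, key_index=0)


def het_level_per_sv(calls):
    """Panels d to f (assumed definition): among isolates carrying an SV,
    the fraction in which that SV is heterozygous."""
    return _het_fraction(calls, key_index=1)


def snp_heterozygosity(n_het_loci, n_callable_positions):
    """Panel c: heterozygous loci divided by callable positions."""
    return n_het_loci / n_callable_positions


per_isolate = het_fraction_per_isolate(calls)
per_sv = het_level_per_sv(calls)
```

In this toy example, `per_isolate["iso1"]` is 0.5 because one of the two SVs in that isolate is heterozygous, and `per_sv["sv2"]` is 0.0 because that SV is homozygous in every isolate carrying it.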