Extended Data Fig. 9: Patient Genome Analysis. | Nature


From: Structural variation in 1,019 diverse humans based on long-read sequencing


a) Comparison of the SV callsets generated by PAV and Sniffles for ‘rare disease patient A’ with the phased VCF panel of HPRC_mg_44 + 966 and the HGSVC assembly-based SV callset (ref. 20), showing 160 SVs exclusive to the patient genome. b) Allele frequency distributions (log scale) of SV alleles from our study that match those of rare disease patient A, shown for SVs found in both SAGA and HGSVC (top) and for SVs found only in SAGA (bottom). c) Number of SVs reported (1) by the pbsv caller (Note S9), (2) by DELLY with default settings and (3) by DELLY with graph-based filtering. Median SV counts across 31 rare disease patient genomes are indicated alongside the data points. The comparatively high SV count in one patient sample (P1-D11; light orange) is likely attributable to population ancestry. d) UpSet plot showing the number of pathogenic SVs found by DELLY and the number of pathogenic SVs retained in graph-based filtering mode (‘delly-pg’). e, f) Integrative Genomics Viewer (IGV) views of the two validated pathogenic SVs that were filtered out in pangenome mode. e) A ~140 bp insertion in an STR in FMR1 is called by DELLY (second row) but not retained in DELLY-pangenome mode (third row). The length of this multiallelic STR varies in the population, with insert sizes beyond ~450 bp causing fragile X syndrome (ref. 104). f) A ~47 kbp deletion encompassing two regulatory conserved non-coding elements (CNEs) of SHOX is called by DELLY (second row) but not retained in DELLY-pangenome mode (third row). Variants in the SHOX CNEs exhibit recurrence and incomplete penetrance, consistent with the occasional presence of this SV in the general population (Note S9).
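The callset comparison in panel a can be illustrated with a minimal Python sketch of how SVs exclusive to one genome might be identified. The caption does not state the matching criteria actually used, so everything here is an assumption for illustration: the `SV` record, the `same_sv` and `exclusive_to_patient` helpers, the 500 bp breakpoint tolerance, the 0.5 size-ratio threshold and the toy coordinates are all hypothetical.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    pos: int      # start position on the reference
    svlen: int    # absolute SV length in bp
    svtype: str   # e.g. "DEL" or "INS"


def same_sv(a: SV, b: SV, max_dist: int = 500, min_size_ratio: float = 0.5) -> bool:
    """Heuristic match (assumed, not the study's criteria): same chromosome
    and type, nearby start positions, and similar length."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    small, large = sorted((a.svlen, b.svlen))
    return large > 0 and small / large >= min_size_ratio


def exclusive_to_patient(patient: list[SV], references: list[SV]) -> list[SV]:
    """Return patient SVs with no match in any reference callset."""
    return [sv for sv in patient if not any(same_sv(sv, ref) for ref in references)]


# Toy data with made-up coordinates, for illustration only.
patient_calls = [
    SV("chrX", 1000, 140, "INS"),
    SV("chr1", 5000, 300, "DEL"),
    SV("chr2", 8000, 1200, "DEL"),
]
panel_calls = [SV("chr1", 5050, 280, "DEL")]     # matches the chr1 deletion
assembly_calls = [SV("chr2", 8100, 400, "DEL")]  # too different in size to match

exclusive = exclusive_to_patient(patient_calls, panel_calls + assembly_calls)
print(len(exclusive), "SVs exclusive to the patient")
```

Pooling the reference callsets before matching mirrors the idea of counting SVs absent from every comparison set; a real analysis would typically rely on a dedicated SV comparison tool and sequence-aware matching for insertions.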
