Fig. 7: Comparison between the 1000g-SV and SurVIndel2 datasets.

a, b Sensitivity and precision for the 34 samples in HGSVC2 for deletions (a) and tandem duplications (b). The dataset produced by SurVIndel2 is both more sensitive and more precise than the 1000g-SV dataset. c Overlap between deletions in the 1000g-SV and SurVIndel2 datasets. Validation rates of calls in samples with long reads are shown in parentheses. Not only does SurVIndel2 cover nearly all of the deletions in 1000g-SV, but it also predicts 130,657 deletions that are not present in the latter. The validation rate for SurVIndel2-unique events is far higher than that of 1000g-SV-unique events. d Similarly, SurVIndel2 predicts a large portion of the tandem duplications in 1000g-SV, and those it does not predict have a very low validation rate. SurVIndel2 also predicts 197,008 duplications that are not present in 1000g-SV, with a high validation rate.