Fig. 5: Genetic variation, ROH and candidate disease-causing variants.

a, SV count against CHM13 and GRCh38 for each child assembly haplotype. b, Count of SVs (deletions and insertions) in the family trios called against CHM13 and found to be absent from the HPRC dataset, highlighting their spread across intergenic, intronic/UTR and exonic regions (top), repetitive regions (middle) and segmental duplications (bottom). c, Box plot showing median counts of variants per Mb relative to African segments in the same participants aggregated per family (n = 15), for various ancestries. d, Cumulative sizes of long and medium ROH of the ME assemblies and the Yoruba 1KG trio. e, Location and count of genes within the long ROH segments for chromosomes 6 and 12 of the Jordanian father. f, Cumulative number of genes (pLI > 0.9) over contigs per child assembly. g, Candidate disease-causing variants in the probands. Shown are the variants, impacted genes, ascertained phenotypes in the child participants and associated details. The comments column indicates whether the variant was identified with read-based calling. Exonic deletions are denoted by an asterisk on the bars. SD, segmental duplication; HPO, Human Phenotype Ontology; Au, autism; CRD, cystic renal dysplasia; DCS, duplicated collecting system; GD, gait disturbance; GI, glaucoma; GDD, global developmental delay; ID, intellectual disability; MRC, multiple renal cysts; S, seizure; T, tall stature; P, pathogenic; LP, likely pathogenic.