Completely resolved structural variants by optical genome mapping with adaptive sampling from CNV discovery

Fu, Li; Kim, Chong Ae; Tokita, Masatoshi; Miyata, Yohane; Okamoto, Nobuhiko; Makita, Yoshio; Osaka, Hitoshi; Fujimoto, Ayataka; Daida, Atsuro; Nirei, Jun; Udagawa, Noriko; Hayakawa, Seiichi; Deguchi, Kimiko; Fukuda, Mitsumasa; Matsumoto, Hiroshi; Akasaka, Manami; Okada, Junichiro; Misumi, Yohei; Kido, Jun; Suzuki, Toshifumi; Aoi, Hiromi; Seyama, Rie; Hamanoue, Haruka; Mitsuhashi, Satomi; Ohori, Sachiko; Saida, Ken; Inoue, Yuta; Hamanaka, Kohei; Tsuchida, Naomi; Uchiyama, Yuri; Koshimizu, Eriko; Miyatake, Satoko; Mizuguchi, Takeshi; Itakura, Atsuo; Harada, Naoki; Fujita, Atsushi; Matsumoto, Naomichi

doi:10.1038/s41525-026-00561-4

Download PDF

Article
Open access
Published: 26 March 2026

Completely resolved structural variants by optical genome mapping with adaptive sampling from CNV discovery

Li Fu^1,2,
Chong Ae Kim³,
Masatoshi Tokita⁴,
Yohane Miyata⁴,
Nobuhiko Okamoto⁵,
Yoshio Makita⁶,
Hitoshi Osaka⁷,
Ayataka Fujimoto⁸,
Atsuro Daida⁹,
Jun Nirei¹⁰,
Noriko Udagawa¹¹,
Seiichi Hayakawa¹²,
Kimiko Deguchi¹³,
Mitsumasa Fukuda¹⁴,
Hiroshi Matsumoto¹⁵,
Manami Akasaka¹⁶,
Junichiro Okada¹⁷,
Yohei Misumi¹⁸,
Jun Kido¹⁹,
Toshifumi Suzuki^2,20,
Hiromi Aoi²,
Rie Seyama²,
Haruka Hamanoue²¹,
Satomi Mitsuhashi¹,
Sachiko Ohori¹,
Ken Saida¹,
Yuta Inoue¹,
Kohei Hamanaka¹,
Naomi Tsuchida^1,22,
Yuri Uchiyama^1,22,
Eriko Koshimizu¹,
Satoko Miyatake^1,21,
Takeshi Mizuguchi¹,
Atsuo Itakura²,
Naoki Harada²³,
Atsushi Fujita¹ &
…
Naomichi Matsumoto^1,21,22

npj Genomic Medicine volume 11, Article number: 26 (2026) Cite this article

2967 Accesses
27 Altmetric
Metrics details

Subjects

Abstract

Structural variants (SVs), including duplications, deletions, inversions, translocations, and insertions, play major roles in human phenotypic diversity but remain difficult to detect because of variable size and structural complexity. Optical genome mapping (OGM) uses ultra-high molecular weight DNA (>150 kb) fluorescently labeled at a specific six-nucleotide sequence, enabling comprehensive SV detection by analyzing labeling patterns along long DNA molecules. This study aimed to fully characterize SVs using OGM. OGM was applied to 30 cases with exome sequencing-based copy number (CN) variants (16 CN losses, 7 CN gains, and 7 combined CN losses and gains). Additionally, targeted Oxford Nanopore long-read sequencing with adaptive sampling was used to determine breakpoints of SVs. This approach revealed undetected SVs in 14 cases (46.7%), and disclosed gene disruptions or CN alterations explaining clinical features in seven cases (23.3%). Even complex SVs involving numerous chromosomal segments and breakpoints were resolved efficiently, highlighting the power of combining OGM and long-read sequencing. Integrating OGM with long-read sequencing improves diagnostic resolution beyond sequencing alone and provides a robust framework for interpreting complex SVs. These findings highlight the potential clinical utility of combining OGM and long-read sequencing as a comprehensive diagnostic strategy for improved precision medicine in rare genetic diseases.

Full characterization of unresolved structural variation through long-read sequencing and optical genome mapping

Article Open access 25 November 2024

Optical genome mapping unveils hidden structural variants in neurodevelopmental disorders

Article Open access 16 May 2024

High-resolution structural variant profiling of myelodysplastic syndromes by optical genome mapping uncovers cryptic aberrations of prognostic and therapeutic significance

Article Open access 01 August 2022

Introduction

Structural variations (SVs)—defined as genomic alterations larger than 50 base pairs (bp) and including duplications, deletions, inversions, translocations, and insertions—constitute the majority of variable nucleotides in the human genome and contribute significantly to phenotypic diversity¹. Although exome sequencing (ES) enables the detection of copy number (CN) variants (CNVs)², it frequently fails to capture SV breakpoints, primarily because of the limited coverage of capture regions and technical limitations such as polymerase chain reaction (PCR) and hybridization biases in GC-rich regions, as well as poor alignment in low-complexity regions.

Even with genome sequencing, SV detection remains challenging because of the wide range of SV types and sizes and their frequent localization within repetitive genomic regions. These complexities hinder both sensitivity and specificity in SV calling, especially for structurally complex and repetitive regions. As a result, many pathogenic SVs likely remain undetected, and the reliable identification of such variants from short-read sequencing data continues to be a major technical and computational challenge^3,4.

Several methodologies have been used for SV detection, including conventional karyotyping, fluorescence in situ hybridization, chromosomal microarray analysis, and long-read sequencing. More recently, optical genome mapping (OGM) has emerged as a promising technique for the comprehensive analysis of SVs. Unlike sequencing-based methods, OGM uses high-resolution fluorescence imaging of ultra-high molecular weight DNA to detect structural changes without nucleotide-level sequencing. Long-read sequencing produces reads typically ranging from 10 to 100 kilobases (kb), whereas OGM analyzes DNA molecules exceeding 150 kb, providing broader genome coverage (>80×), albeit at lower breakpoint resolution^3,5,6.

OGM identifies SVs through the de novo assembly of fluorescently labeled DNA molecules using a specific six-base sequence motif (CTTAAG), which occurs on average every 5 kb across the genome^7,8,9. This method enables the detection of a wide range of structural alterations, including insertions and deletions (>500 bp), duplications (>50 kb), inversions and translocations (>50–70 kb), chromosomal aneuploidy, absence of heterozygosity, triploidy, and copy-neutral events^10,11. In this regard, OGM may outperform traditional sequencing-based platforms for identifying large and complex SVs.

Recent genomic studies have demonstrated that ~0.5–4% of individuals with rare diseases harbor dual or triple molecular diagnoses, in which a single diagnosis explains only part of the observed phenotype¹². These findings emphasize the importance of comprehensive SV discovery beyond CNVs, because additional SVs may be responsible for the unexplained portion of a phenotype. It is therefore critical to identify full SV configurations, rather than mere CN changes, for accurate genotype–phenotype correlations and clinical diagnoses. In the present study, we aimed to use OGM to comprehensively characterize constitutional SVs that were initially identified as CNVs using ES. Furthermore, we used targeted Oxford Nanopore long-read sequencing with adaptive sampling (AS) to resolve the breakpoint structures of these newly discovered SVs at nucleotide resolution.

Results

We applied OGM to 30 cases (Table 1 and Fig. S1; case numbers are listed together with the Saphyr serial numbers assigned by our laboratory) and identified only the expected ES-identified CNVs in 16 cases (Cases 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 16, 19, 25, and 28). By contrast, previously undetected SVs were revealed in the remaining 14 cases (Fig. S1 and Tables 2 and 3). The newly discovered SVs in the latter 14 cases included: three translocations (Cases 15, 17, and 26), three direct duplications (Cases 7, 18, and 24), one duplication with inversion (Case 6), two inverted duplications (Cases 23 and 27), three intrachromosomal insertion events with inverted duplications (Cases 20, 21, and 22), one chromoplexy-like event (Case 29), and one chromothripsis (Case 30). The variant allele frequency (VAF), confidence, and supporting molecules of all SV calls identified by OGM are shown in Table S2. Confidence in OGM was defined as a normalized confidence score between 0 and 1; higher values indicate higher confidence. It has been reported that intrachromosomal fusions often exhibit lower confidence scores because of a higher probability of similar labeling patterns occurring by chance within the same chromosome¹³. Although intrachromosomal fusions also had low OGM confidence scores (of ~0.1–0.3) in the present study, SV calls were judged to be highly reliable from supporting molecules.

Table 1 Summary of the phenotypes and CNVs of patients in this study

Full size table

Table 2 Overview of structural variants identified by OGM in this study

Full size table

Table 3 Overview of chromothripsis in Case 30

Full size table

To further characterize the SVs in all 30 cases, we used Oxford Nanopore targeted long-read sequencing with AS (Table S3). This approach allowed us to precisely determine the SV breakpoints at the nucleotide level and generate dot plots to visualize the identified SVs.

OGM and long-read sequencing provide complementary and comprehensive detection of diverse SVs, including those invisible to ES. The representative cases demonstrated that integrating optical and sequence-level data enhances both the diagnostic yields and the mechanistic understanding of SV formation. Here, we present findings obtained through OGM analysis and long-read sequencing with AS. A detailed analysis of each case is provided in the Supplementary Information.

Deletions (Cases 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 16, 19, and 25)

We performed OGM in 16 cases with ES-based CN losses, and detected only simple (expected) deletions in 15 cases; however, a chromoplexy-like event was identified in one case (Case 29) (Table 2). Breakpoints at nucleotide resolution were successfully determined in 11 of the 15 cases with simple deletions: interstitial deletions in Cases 3, 8, 9, 10, 11, 12, 14, 16, 19, and 25, and a deletion involving 4q telomeric sequences (TTAGGG) in Case 2 (Figs. S2 and S3).

Among these 11 cases, although ES-based analysis did not identify haploinsufficient genes, we identified SUMO1 (pHaplo score = 0.900¹⁴, ClinGen haploinsufficiency [HI] score = not recorded¹⁵) in Case 12 and COL11A1 (pHaplo score = 0.894¹⁴, HI score = 1¹⁵) in Case 25 (Supplementary Data 1^15,16). SUMO1 deletion causes orofacial cleft (OMIM# 613705) and SUMO1 haploinsufficiency has been associated with cleft lip and palate, consistent with the phenotype observed in Case 12^17,18. In Case 25, ES had already identified KCNA2 (OMIM# 616366) as a cause of seizures and global developmental delay^19,20. Our analysis additionally revealed COL11A1, whose haploinsufficiency causes Marshall (OMIM# 154780) and Stickler (OMIM# 604841) syndromes, both of which are associated with abnormal facial features, failure to thrive, and short stature, as observed in Case 25^21,22. Upon reviewing the ES data, we identified the same regions as simple CN losses in ES data; however, the corresponding genes had not been called by the eXome Hidden Markov Model (XHMM). These cases underscore the importance of OGM and long-read sequencing and of reviewing the ES data, thus demonstrating its limitations.

CTBP1 was disrupted by the deletion breakpoint in Case 2 (Table 2). CTBP1 aberration is associated with hypotonia, ataxia, developmental delay, and tooth enamel defect syndrome (OMIM# 617915) in an autosomal dominant fashion, and may influence the neurological features observed in this case²³. Long-read analysis enabled us to determine the precise location at which the gene was disrupted.

In Cases 1, 4, 5, and 13, although deletions were identified using OGM (Table 2 and Fig. S4), the breakpoint determination of deletions using long-read sequencing was unsuccessful. Segmental duplications were exactly mapped to deletion breakpoints, and long-read sequencing cannot cover these segmental duplications (Fig. S4). Cases 1, 4, 5 and 13 are described in the Supplementary information.

Duplication (Case 28)

We identified an ES-based CN gain in Case 28. OGM detected the expected duplication at the terminal region of chromosome 2q (Fig. S5). However, we were unable to fully characterize the duplication using both OGM and long-read sequencing because of the presence of many repetitive sequences at the terminal region of the chromosome. Case 28 is described in detail in the Supplementary information.

Unbalanced translocations (Cases 15, 17, and 26)

In Cases 15, 17, and 26, unbalanced translocations—characterized by combined terminal CN gains and CN losses—were detected using ES. OGM also identified unbalanced translocations in these three cases.

Case 15 presented clinically with West syndrome, hypoplasia of the corpus callosum, and aortic regurgitation. In this case, an unbalanced translocation was detected using OGM and AS, with a 3.5-Mb deletion at 1p36.33–p36.32 and a 3.8-Mb duplication at 17p13.3–p13.2 (Fig. 1a, b, Table 2, and Table S4). The deletion encompassed GABRD (pHaplo score = 0.807¹⁴, HI score = 3¹⁵), which is associated with susceptibility to generalized epilepsy with febrile seizures plus, type 5 (GEFSP5) (OMIM# 613060) and has been reported to cause epileptic symptoms in 1p36 deletion syndrome²⁴ (Table 1 and Fig. 1b).

**Fig. 1: Results of OGM and long-read sequencing in Cases 15 and 18.**

Although no known triplosensitive genes were initially identified within the 17p13.3–p13.2 duplication, precise mapping using OGM identified that this duplication also encompassed PAFAH1B1, with pTriplo score = 0.998¹⁴ and ClinGen triplosensitivity [TS] score = 0¹⁵ (Supplementary Data 1). Duplications of PAFAH1B1 have been linked to West syndrome²⁵, seizures, and a range of structural brain anomalies, including abnormalities of the corpus callosum as a part of 17p13.3 duplication syndrome²⁶. Thus, the duplicated PAFAH1B1 may account for the hypoplasia of the corpus callosum that was observed in this patient. To confirm the origin of this unbalanced translocation, breakpoint-spanning PCR was performed using long-read sequencing data (primers listed in Table S5); this confirmed its paternal origin (Fig. 1d, e). Sanger sequencing of the translocation breakpoints further identified a 3-bp insertion on der(1) and a 2-bp insertion on der(17) (Fig. 1f). Although TP73 was disrupted by a translocation (Table 2), none of the disrupted genes were associated with the clinical features observed in Case 15.

The same analyses using OGM were also performed for Cases 17 (Fig. S6) and 26 (Fig. S7); however, no additional gene aberrations relevant to the phenotype were detected using OGM. Cases 17 and 26 are described in the Supplementary information.

Direct duplications (Cases 7, 18, and 24)

OGM identified direct duplications in three cases (Fig. 1g and Fig. S8a, d), which had previously been detected by ES. Long-read sequencing data confirmed the breakpoints in these cases and enabled the reconstruction of direct duplications through dot plots (Fig. 1h and Fig. S8b, e). This approach revealed no additional aberrant genes or gene disruption relevant to the phenotype. Cases 7, 18, and 24 are described in detail in the Supplementary information.

Duplication with inversion (Case 6)

In Case 6, we newly identified a 6.2-Mb inversion at 21q22.2–q22.3 and a 218-kb duplication at 21q22.2 using OGM with AS. The CN gain had previously been detected using ES (Tables 2 and S4). OGM analysis revealed that the inversion was located immediately downstream of the duplication (Fig. 2a), thus highlighting a limitation of ES, which typically fails to detect breakpoints outside coding regions. Furthermore, OGM analysis revealed that a CN loss previously identified by ES was the deletion from 21q22.3 to 21qter, located downstream of this inversion, with the 3′ end of the inversion corresponding to the end of the long arm of chromosome 21 (Fig. 2a).

**Fig. 2: Results of OGM and long-read sequencing in Cases 6 and 23.**

Long-read sequencing data further illustrated these SVs through dot-plot reconstruction (Fig. 2b). The CN loss, previously identified by ES, encompassed COL6A1 (pHaplo score = 0.942¹⁴, HI score = not registered¹⁵), whose haploinsufficiency causes Bethlem myopathy 1A (OMIM# 158810) or Ullrich congenital muscular dystrophy 1A (OMIM# 254090), and COL6A2 (pHaplo score = 0.593¹⁴, HI score = not registered¹⁵), whose deletion results in Bethlem myopathy 1B (OMIM# 620725) or Ullrich congenital muscular dystrophy 1B (OMIM# 620727). In addition, the deletion of DIP2A (pHaplo score = 0.916¹⁴, HI score = 1¹⁵) is associated with neurodevelopmental disorder²⁷ and autism²⁸ (Table 1 and S3). However, no additional aberrant haploinsufficiency or triplosensitive genes associated with hereditary spastic paraplegia or autism were identified within these regions using OGM (Supplementary Data 1). Although TMPRSS2 and COL18A1 were disrupted by the SVs (Table 2), neither has been associated with hereditary spastic paraplegia or autism.

Inverted duplications (Cases 23 and 27)

OGM identified inverted duplications in Cases 23 and 27, involving complex rearrangements that combined deletions and duplications (Fig. 2d and Fig. S9a). Long-read sequencing successfully confirmed the breakpoints in both cases (Fig. 2e and Fig. S9b).

In Case 23, who was clinically suspected to have Cornelia de Lange syndrome, OGM revealed a 14.3-Mb deletion and an 851-kb duplication at 9p22.3, resulting in inverted duplication (Table 2 and Table S4). Notably, NFIB (pHaplo score = 0.999¹⁴, HI score = 3¹⁵) was disrupted at intron 1 by the SV (Fig. 2d, Table 2, and Supplementary Data 1). Haploinsufficiency of NFIB causes macrocephaly with impaired intellectual development (OMIM# 618286) as well as intellectual disability and dysmorphic features^29,30. These clinical features are consistent with those observed in Case 23, including intellectual disability and dysmorphic features.

Case 27 involved an inverted duplication rearrangement detected by OGM with AS (Table 2 and Supplementary Data 1), similar to a previously reported case³¹. Within the 18.5-Mb deletion at 9p22.1–24.3, both NFIB and SMARCA2 were identified (Fig. S9b and Supplementary Data 1). As previously mentioned, NFIB haploinsufficiency can contribute to intellectual impairment, dysmorphic facial features, and a high palate (Table 1). SMARCA2 (pHaplo score = 0.999¹⁴, HI score = 0¹⁵) is associated with blepharophimosis-impaired intellectual development syndrome (OMIM# 619293) and Nicolaides–Baraitser syndrome (OMIM# 601358) in an autosomal dominant manner, and its intragenic deletions reportedly cause intellectual disability, seizures, and dysmorphic features^32,33. Although SMARCA2 deletion has not been clearly established as a distinct deletion syndrome, it has been previously reported with limited clinical information³⁴. By contrast, the adjacent 19.7-Mb duplication did not contain any known genes associated with human diseases (Supplementary Data 1). Notably, although NFIB-AS1 and ADAMTSL1 were disrupted by inverted duplication in each of Cases 23 and 27 (Table 2), neither gene was associated with the clinical features observed in these cases.

Except in rare cases where one breakpoint coincides with an inverted duplication and sequencing reads spanning two different exons are present, ES is generally incapable of identifying inverted duplications and their breakpoints, because orientation information is not available. By contrast, the present approach yielded meaningful findings that were linked to clinical symptoms in these two cases.

Intrachromosomal insertion events with inverted duplications (Cases 20–22)

Using OGM with AS, we identified novel intrachromosomal insertion events and inverted duplications in three cases (Cases 20–22) in whom CN gains or combined CN losses and CN gains had initially been detected by ES. OGM revealed a range of complex structural rearrangements in these cases.

Case 20 involved an inverted duplication, a deletion, and an intrachromosomal fusion (Fig. S10, Table 2, and Table S4). In Case 21, an inverted duplication and two intrachromosomal fusions were identified (Fig. S11, Table 2, and Table S4). Case 22 exhibited an inverted duplication and an intrachromosomal fusion (Fig. 3, Table 2, and Table S4).

**Fig. 3: Results of OGM and long-read sequencing in Case 22.**

Regarding these chromosomal rearrangements, Case 20 included an 8.4-Mb duplication at Xq21.33–q22.3, whereas Case 21 harbored a 439-kb duplication at Xq22.2, as detected by OGM. In both cases, the newly resolved SVs identified by OGM and long-read sequencing did not reveal any additional pathogenic variants that might explain the patients’ clinical features (Supplementary Data 1). Cases 20 and 21 are described in detail in the Supplementary information.

By contrast, Case 22 exhibited a more complex pattern involving a 10.8-Mb duplication at 18p11.32–p11.22, a 2.8-Mb duplication at 18q21.31–21.33, and an 18.9-Mb deletion at 18q21.33–q23 (Fig. 3a). Within the deleted 18q21.33–q23 region, ES identified TSHZ1 and NFATC1, which have been associated with auricular deformities, hearing impairment^35,36, and congenital heart disease^37,38 (Table 1 and Supplementary Data 1). Nevertheless, in the newly characterized SVs using OGM and long-read sequencing, no additional candidate pathogenic genes were identified that might account for the patient’s clinical presentation, which included features potentially resembling Cornelia de Lange syndrome (Supplementary Data 1). Additionally, PIEZO2, which is associated with Marden–Walker syndrome (OMIM# 248700) through an autosomal dominant mechanism, was disrupted by SVs in Case 22 (Table 2). PIEZO2 has also been reported as associated with Cornelia de Lange syndrome³⁹, and may be involved in the phenotype observed in this case.

Notably, the intrachromosomal fusions in these cases were positioned adjacent to their respective CN gains—a feature that was missed by ES because of the copy-neutral nature of these SVs. Furthermore, three cases exhibited complex multi-breakpoint events, meaning that it was difficult to understand the structural reconstruction using long-read sequencing alone; OGM, which enables the visualization and overview of complex SVs, was therefore extremely helpful.

Chromoplexy-like event in Case 29

In Case 29, who presented with Ehlers–Danlos syndrome, joint laxity, arthritis, and mild intellectual disability, ES previously identified a 5.2-Mb CN loss at 12q21.1–q21.2. OGM further revealed an intrachromosomal insertion event and inversion within chromosome 12, an inversion within chromosome 14, and a translocation between chromosomes 12 and 14 (Fig. 4a, b). Long-read sequencing precisely determined all of the breakpoints (Fig. 4c, d), confirming a chromoplexy-like event between chromosomes 12 and 14 (Fig. 4l and Table S4). Chromoplexy refers to a large-scale and complex formation of chain rearrangements that affects multiple chromosomes. This phenomenon is characterized by the interdependent occurrence of multiple inter- and intrachromosomal translocations and deletions⁴⁰. Chromoplexy leads to the generation of derivative chromosomes that present little or no CN alterations, and Case 29, with CN loss, presented a chromoplexy-like event. Conversely, chromothripsis is defined as a relatively clustered chromosomal shattering followed by a random restitching of chromosomal fragments, which results in the formation of complex genomic rearrangements⁴⁰. Chromothripsis was observed in Case 30.

**Fig. 4: Results of OGM and long-read sequencing in Case 29.**

The chromoplexy-like event involved chromosomes 12 and 14, with reciprocal inverted insertions, partial deletions, and inversion of multiple segments. Despite the complex rearrangement, no clearly haploinsufficient gene associated with the patient’s phenotype was identified within the deleted region (Supplementary Data 1). PTPRR and TRHDE were disrupted by a reciprocal translocation (Table 2), but neither was associated with the clinical features observed in Case 29. The derivative chromosome 12 harbored the 5.2-Mb deletion at 12q21.1–q21.2, which was joined to chromosome 14 without a CN change. Therefore, although the CN loss was detectable by ES-based CNV analysis, the reciprocal translocation itself was not identified. Because the deleted region did not encompass any previously known disease-associated genes that would explain the clinical features, this case clearly demonstrates the limitations of ES for detecting complex SVs and highlights the added value of OGM and long-read sequencing in cases with no apparent clinical features.

Chromothripsis (Case 30)

In Case 30, who presented with vitreoretinal degeneration, short stature, growth delay, abnormal facial shape, calcified skin lesions, and ventricular septal defect, ES detected eight CN gains in chromosome 22, implying chromothripsis (Fig. 5a). The chromothripsis is thought to arise when one homolog of chromosome 22 undergoes aberrant segregation due to anaphase-lagging and becomes encapsulated within a micronucleus. Extensive DNA damage incurred within the micronucleus is followed by erroneous reassembly upon reincorporation into the primary nucleus, resulting in abnormal chromosomal rearrangements⁴¹. Based on ES data, the CNV sizes of the eight CN gains at 22q12.1–q13.33 ranged from 107–437 kb. OGM analysis revealed six intrachromosomal fusions, one deletion, one inversion, and one inverted duplication with ogm (22q)cth at 22q11.21–q13.33 (Fig. 5b, Fig. S11, and Table 3). Although we attempted to reconstruct the full spectrum of SVs based on the OGM results, the SVs identified by OGM did not completely match those detected by dnarrange using long-read sequencing data. We therefore manually evaluated the breakpoint sequences using long-read sequencing data, considering the orientations of the SVs and CNVs called using OGM (Fig. 5c). The CNV sizes of the eight duplications in a dot-plot of long-read sequencing data were 390 kb at 22q11.21, 419 kb at 22q12.1–q12.2, 460 kb at 22q12.2, 238 kb at 22q12.3, 884 kb at 22q13.1, 335 kb at 22q13.1, 466 kb at 22q13.1, and 625 kb at 22q13.31; all eight of these were revealed to be three CN gains. These regions included 6, 9, 7, 4, 31, 15, 14, and 7 genes, respectively (Supplementary Data 1); however, none of the genes were associated with known phenotypes. In addition, six genes were disrupted by the chromothripsis (Table 3), but none of these were associated with the clinical features observed in Case 30. We also investigated the pathogenicity of triplosensitive regions within all eight duplications in case 30 by using ClinGen. Duplications of the 22q11.2 region (TS score = 1¹⁵) have been reported as causing developmental delay, dysmorphisms, and multiple congenital abnormalities⁴², suggesting they may cause the growth delay and abnormal facial shape observed in case 30. However, no cause for other phenotypes in this case could be identified. Ultimately, the combined analysis of OGM and targeted long-read sequencing provided a comprehensive and accurate identification of the chromothripsis (Table S4).

**Fig. 5: Results of the CNV analysis using ES data, OGM, and long-read sequencing in Case 30.**

Discussion

We identified novel SVs using OGM with AS in 14 of 30 cases (Cases 6, 7, 15, 17, 18, 20–24, 26, 27, 29, and 30) in whom CNVs had previously been detected using ES. These SVs were undetectable by ES alone because of unstable genomic coverage, limited fragment size, and analytical limitations. ES typically misses SVs with breakpoints outside coding regions and cannot resolve the configuration of CN gains or unbalanced translocations inferred from telomeric CNVs. By contrast, OGM uses ultra-high molecular weight DNA and genome-wide mapping to detect large and complex SVs—including inversions, inverted duplications, and intrachromosomal fusions—outside ES target regions (Cases 6 and 20–22) and within repetitive regions (Case 26). Importantly, OGM with AS revealed disruptions (Cases 2 and 22) and CN changes (Cases 12, 15, 23, 25, and 27) in genes that are likely associated with clinical features (Table 2 and Supplementary Data 1) (7/30 cases, 23.3%), clearly demonstrating the utility of OGM with AS following CNV detection.

Most novel SVs were observed in cases involving CN gains (Cases 6, 7, 15, 17, 18, 20–24, 26, 27, and 30), with only one additional SV found in a case of CN loss (Case 29). Notably, in cases in which ES detected both CN gain and CN loss, OGM clarified the underlying events as unbalanced translocations (Cases 15, 17, and 26) or inverted duplications (Cases 23 and 27), which were undetected by ES. These findings suggest that such CNV patterns should raise a suspicion of hidden SVs, warranting further investigation with OGM.

Compared with ES, OGM achieved a superior detection of balanced (copy neutral) and complex (inversions, intrachromosomal fusions, chromoplexy, or chromothripsis) SVs that lacked CN changes, whereas ES remained sensitive to CNVs involving coding regions. Long-read sequencing complemented OGM by defining nucleotide-level breakpoints, although its limited read length sometimes prevented full structural reconstruction. Thus, OGM provides a genome-wide structural overview, whereas long-read sequencing ensures breakpoint precision; together, they offer comprehensive SV characterization.

OGM also enables CNV detection through molecule coverage-based analysis, to identify CNVs larger than 500 bp⁴³. In the present study, OGM detected CNVs ranging from 238 kb to 18.9 Mb. Although pathogenic genes were identified within these CNVs in six cases (Cases 12, 15, 23, 25, and 27), they had already been included within regions detected by ES, suggesting that ES can capture certain pathogenic CNVs if they are covered by baits.

Several factors may explain why some haploinsufficient genes were not captured by ES. Large CNVs spanning multiple genes can dilute read-depth changes, and capture efficiency varies across exons because of GC-rich or probe hybridization biases. Although the affected genes are generally covered by baits, exonic read-depth variation or pipeline filtering thresholds can lead to missed calls. In our laboratory, CNV detection was performed using an XHMM-based pipeline that was optimized for batch- and sex-specific analysis to minimize sequencing bias; however, this process was not perfect^2,44.

ES was performed between 2015 and 2022 using standardized pipelines from our laboratory. Although the retrospective visual inspection of BAM alignments was conducted for selected loci, no clear breakpoint-supporting reads were observed. The undetected SVs are therefore likely attributable to the limitations of read depth–based algorithms rather than sequencing artifacts. Previous studies have revealed that incorporating one-end–anchored split-read mappings into ES analysis can reveal SVs that are otherwise missed by standard read-depth approaches⁴⁵. Furthermore, CNV and SV calling software increasingly combines multiple data signatures—including read-pair, split-read, read-depth, and assembly-based methods—for improved detection accuracy and breakpoint precision⁴⁶. Together, these findings suggest that further improvements in ES pipelines, such as the integration of read-pair or split-read analysis, may enhance SV detection sensitivity.

OGM and long-read sequencing detect SVs through fundamentally different mechanisms. Although long-read sequencing provides single-nucleotide resolution and accurate breakpoint determination, it is constrained by physical read lengths (10–100 kb), often requiring multiple reads to reconstruct complex SVs⁶. By contrast, OGM analyzes molecules spanning hundreds of kilobases, enabling the better characterization of biallelic CNVs and phasing of CN changes⁴⁷.

For complex rearrangements with multiple breakpoints—such as intrachromosomal insertion events and inverted duplications (Cases 20–22), chromoplexy-like events (Case 29), or chromothripsis (Case 30)—OGM was valuable for structural delineation, with long-read sequencing used for precise breakpoint resolution. In Case 30, discrepancies in breakpoint orientation and CN between OGM and long-read sequencing illustrated the difficulty of resolving such events with sequencing alone. The genome-wide view of OGM and its ability to detect both CNVs and SVs are particularly advantageous for analyzing chromoanagenesis, including chromothripsis⁴⁸.

In the present study, OGM was particularly advantageous for delineating complex multi-breakpoint events (Cases 20–22, 29, and 30) in which complete reconstruction was difficult to perform using long-read sequencing alone. However, for translocation, direct duplications, or inverted duplications, long-read sequencing alone may suffice. OGM and long-read sequencing can also be complementary: the former provides superior structural mapping, whereas the latter is better at refining breakpoints.

Compared with ES, which primarily targets coding regions and relies on read depth–based CNV detection, combined OGM and long-read sequencing offers genome-wide coverage and direct breakpoint visualization, thereby enabling the accurate identification of balanced or complex SVs that are invisible to ES. In contrast to short-read whole genome sequencing (srWGS), which can capture small CNVs and SNVs genome-wide but often struggles to resolve long-distance rearranged regions, the integration of OGM and long-read sequencing overcomes these limitations by providing both optical- and nucleotide-level mapping continuity.

All 14 SVs detected using OGM were validated with high accuracy. However, the resolution limit (~5 kb) of OGM means that smaller variants may be missed^6,49. Breakpoint positions between OGM and long-read sequencing differed by 0.5–385.5 kb (Tables 2 and 3), often because of unmapped regions in OGM data. Larger unmapped regions correlated with greater discrepancies. Conversely, smaller gaps allowed near-precise breakpoint localization.

In Case 26, in which all reads included repetitive sequences, OGM accurately detected SVs, whereas long-read sequencing yielded mismapped reads, failing to detect the true breakpoints. This highlights the strength of OGM in repetitive and complex regions, leveraging long-range DNA molecules without sequence-based alignment^11,50.

Despite its advantages, OGM cannot fully resolve SVs in centromeres and telomeres¹¹. Whole-arm rearrangements such as Robertsonian translocations or isodicentric chromosomes remain challenging because of repetitive elements and gaps in the reference genome^3,43,48. Additionally, telomere fusions may obscure insertion sites⁴³. However, complete chromosome assemblies (e.g., of the X chromosome) are becoming feasible via the integration of T2T reference genomes and OGM^51,52.

Recent advances in genomic technologies have exhibited a clear trajectory, from ES to srWGS and onward to long-read sequencing and OGM. For example, in a large cohort of 722 undiagnosed patients with inherited retinal disease, a comparison of ES and srWGS demonstrated that srWGS using SV scanning identified molecular diagnoses in 4.29% of cases⁵³, demonstrating an added—but modest—gain over ES. More strikingly, a recent long-read whole genome sequencing (lrWGS) study of 34 exome-negative autosomal recessive families achieved a 38% diagnostic yield⁵⁴, thus underscoring the major improvement in SV detection afforded by long reads. Our study built on this idea by integrating OGM and long-read sequencing, and demonstrated that this combined approach was able to resolve SVs—including balanced and complex rearrangements—that both ES and srWGS often miss. Specifically, in our cohort, the combined method delineated multi-breakpoint events that long-read sequencing alone may not fully reconstruct. Our findings indicate that integrating OGM and AS after ES-based CNV discovery increases diagnostic resolution and clarifies complex rearrangements. This ES–OGM–long-read sequential approach refines SV interpretation and establishes a framework for integrating genomic and sequencing data in rare disease diagnostics.

The present study has some limitations. First, because all patients were referred from tertiary or specialized clinical centers, a potential bias toward severe or atypical cases cannot be excluded. Moreover, the small sample size (30 cases were selected according to the selection criteria from 281 cases with CNVs identified by XHMM analysis in ES, out of a total of 8622 cases with rare genetic disorders who underwent ES analysis in our laboratory between 2015 and 2022; please refer to the Methods for the selection criteria) and technical differences/improvements in ES collected over multiple years may affect the generalizability of our results. Further large-scale prospective studies are needed to evaluate the true diagnostic yield and reproducibility of the OGM and long-read sequencing pipeline. Second, although the combined OGM and long-read sequencing approach provides maximal structural resolution, its high cost currently limits its routine clinical application. Although 13 of 14 cases (excluding Case 30) may have been resolved by long-read sequencing alone, complete reconstruction using only long-read sequencing is likely to be considerably difficult, especially considering complex multi-breakpoint events such as those of Cases 20–22 and 29. A tiered strategy—for example, starting with ES, followed by OGM or long-read sequencing for unresolved or structurally complex cases—may allow rapid solutions to be achieved using a relatively cost-effective and diagnostically efficient workflow. Such a layered approach balances cost, resolution, and interpretability, thereby offering a practical framework for clinical implementation.

Collectively, our results highlight the potential of OGM and long-read sequencing for bridging the gap between CNV detection and full SV interpretation. In clinical practice, this approach may enhance diagnostic accuracy, inform personalized management, and guide genetic counseling for patients with unexplained rare genetic/genomic disorders by sequencing alone, and for those with dual or triple molecular diagnoses. Future research should systematically assess the pathogenic importance of newly identified SVs and explore the integration of OGM and long-read sequencing data into clinical pipelines.

In conclusion, CN gains or CN gain–CN loss patterns detected using ES should prompt the suspicion of hidden SVs, particularly when CNVs do not fully explain the phenotype. Highly complex SVs, which are often unresolvable by sequencing alone, can be effectively characterized by combining OGM and long-read sequencing. OGM is thus a powerful tool for identifying complex chromosomal rearrangements within its resolution range. Combining OGM with targeted long-read sequencing provides a robust and scalable strategy for comprehensive SV characterization, with direct implications for improving genetic diagnosis and personalized medicine in rare disorders.

Methods

Patients

The study was performed on 30 patients with different disease profiles (Table 1). All patients were clinically evaluated and referred to our laboratory from July 2015 to August 2022. ES had been performed on the patients using the method described in the following section, and all 30 patients had CNVs identified using ES. The ethnicities of patients were Japanese (25 patients) and Brazilian (five patients). The study protocol was approved by the institutional review boards, and written informed consent was obtained from patients or their guardians (IRB of Yokohama City University Faculty of Medicine, Approval number: A170525011/RevB2602000031). The cohort consisted of patients with congenital multiple anomalies who were referred to our laboratory from the pediatrics, neurology, clinical genetics, and obstetrics departments of collaborating institutions/hospitals. To elucidate the causes of rare genetic disorders, we selected 30 cases from 281 cases with CNVs identified by XHMM analysis in ES, out of a total of 8622 cases with rare genetic disorders that underwent ES analysis in our laboratory between 2015 and 2022. Inclusion criteria were as follows: (1) cases with non-recurrent CNVs identified by ES, in which the causative genes within CNVs did not fully explain the phenotypes, and (2) cases with lymphoblastoid cell lines available in our laboratory. Within this cohort, cases were selected equally from three groups: CN losses, CN gains, and CN gain–CN loss patterns. Exclusion criteria included: (1) cases with CNVs that were already reported to cause recurrent deletion or duplication syndromes, and (2) cases with unavailable lymphoblastoid cell lines. An overview of the study workflow, from ES-based CNV detection through OGM analysis to targeted long-read sequencing using AS, is shown in Fig. S13.

CNV analysis using whole ES

Genomic DNA was extracted from peripheral blood leukocytes using a QuickGene-610L kit (Fujifilm, Tokyo, Japan), according to the manufacturer’s instructions. Genomic DNA samples were sequenced on either the NovaSeq 6000 (Illumina, San Diego, CA, USA) using 150-bp paired-end reads after the enrichment of exonic regions using the Twist Human Comprehensive Exome (Twist BioScience, San Francisco, CA, USA) or the HiSeq2000 or HiSeq2500 (Illumina) with 101-bp paired-end reads after capture using the SureSelect Human All Exon kit (Agilent Technologies, Santa Clara, CA, USA). Exome data processing was performed as previously described⁵⁵. Reads were aligned to GRCh38 using Novoalign (http://www.novocraft.com/), and PCR duplicates were eliminated using Picard (https://broadinstitute.github.io/picard/). The Genome Analysis Toolkit (GATK) (https://gatk.broadinstitute.org/hc/en-us) was used to realign indels and recalibrate base quality scores. With 20 reads covering >90% of the complete coding sequence, the mean depth of coverage of ES against the RefSeq coding sequence was 70–80×.

CNVs were detected from the ES data using XHMM, a statistical tool for CN analysis^2,44,56. In brief, XHMM detects CNVs by analyzing normalized raw exome read depth data with principal component analysis of the complete coding regions. In Case 30, principal component analysis-normalized and filtered z-scores for the whole read depth were obtained, and SignalMap Version 1.9.0.05 (Roche Nimblegen, Madison, WI, USA) was used for visualization purposes. All CNVs were confirmed using quantitative PCR analysis of patients and their unaffected parents, and inheritance was assumed to be de novo or uniparental (Table 1).

OGM

Ultra-high molecular weight DNA was extracted from 1.5 million lymphoblastoid cell lines using the SP-G2 Blood and Cell Culture DNA Isolation Kit (Bionano Genomics, San Diego, CA, USA) following the manufacturer’s instructions⁵⁷. DNA was labeled using the DLS-G2 Labeling Kit (Bionano Genomics) for the attachment of fluorophores to a specific six-nucleotide sequence (CTTAAG). The labeled DNA was loaded on a Saphyr chip G3.2 and run on a Saphyr instrument (Bionano Genomics) for an output of 400 Gb, targeting 80× effective coverage, with >70% of molecules >150 kb aligning (map rate) at N50 (N50 is defined as the length of the shortest molecule for which equal and longer molecules make up 50% of the total data) of >230 kb⁸ (Table S1). De novo assembly, alignment to the GRCh38 reference, and variant annotation of OGM data were performed using Bionano Solve 3.8.1. Visualization by Circos plot and genome browser, filtering, and interpretation of the identified SVs were conducted using Bionano Access 1.8.1. Standard filter settings were adjusted to select only the variants present in the regions of interest identified by previous standard diagnostic tests and in <1% of control samples. In Case 28, OGM data were analyzed using the T2T reference genome in addition to GRCh38 to observe any differences.

Targeted long-read sequencing using AS with GridION

Here, 3 μg of the ultra-high molecular weight DNA used in OGM was fragmented to a target size of 50 kb using a Megaruptor 3 DNAFluid Kit (Diagenode, Seraing, Belgium). In cases in which the remaining amount of ultra-high molecular weight DNA was low, 5 μg of genomic DNA extracted from peripheral blood leukocytes was fragmented to a target size of 40 kb using a Megaruptor 3 Shearing Kit (Diagenode). The sheared DNA was used to construct sequencing libraries using an Oxford Nanopore Ligation Sequencing Kit (SQK-LSK114) (Oxford Nanopore Technologies, Oxford, UK), largely following the manufacturer’s instructions^55,58 except that enzyme incubation times were doubled, with the final AMPure purification incubation for 10 min at 37 °C. Approximately 25 fmol of the library was loaded onto a flow cell (FLO-MIN114, R10.4.1) on a GridION (Oxford Nanopore Technologies). Target regions comprising 0.24%–0.98% of the whole genome were enriched using the AS option⁵⁹ of GridION Mk1 with a BED file that assigned regions of CNVs detected by ES and each of the surrounding total 7.3–29.5 Mb regions (Table S3). Sequencing was performed for ~3 days, with one additional library loading after nuclease flushing of a flow cell. In cases in which the amount of data was insufficient, new flow cells were used for additional analysis as appropriate.

The data were base-called with super accuracy mode and processed into a BAM file using Dorado v.0.7.2 (Oxford Nanopore Technologies) with GRCh38 as a reference. The depth of coverage was calculated using mosdepth v0.3.1 (https://github.com/brentp/mosdepth). The mean depth of the target region was 11.72×–47.38× (Table S3). SVs and CNVs were called using pbsv v.2.9.0 (PacBio SV calling and analysis tools) (https://github.com/PacificBiosciences/pbsv) together with breakpoint identification. dnarrange was used to detect SVs and extract patient-specific SVs by referencing 29 control datasets^57,60. Control datasets were not used in the analysis if dnarrange excluded breakpoints of interest. Subsequently, using lamassemble (https://gitlab.com/mcfrith/lamassemble), each group of overlapping SV reads was merged into a consensus sequence and realigned to the reference genome. Next, dnarrange created a dot-plot representation of SV breakpoints. In dot plots, lines show the alignment to the reference sequence and are drawn in red and blue to denote forward and reverse orientations to the reference genome, respectively. To understand entire SVs, we used the algorithm dnarrange-link to infer the order and orientation of multiple rearrangements⁶¹.

Data availability

The datasets for this article are not publicly available because of concerns regarding patients’ anonymity. Requests to access the datasets from qualified researchers should be directed to the corresponding author. There are restrictions on a qualified researcher accessing the data (non-commercial use only and requiring a Data Usage Agreement).

Code availability

All computational tools used in this study are available as open-source software, and their download links are presented in the Methods. Software used are: Dorado v.0.7.2 (https://github.com/nanoporetech/dorado), mosdepth v0.3.1 (https://github.com/brentp/mosdepth), lamassemble 1.4.2 (https://gitlab.com/mcfrith/lamassemble), and dnarrange (https://github.com/mcfrith/dnarrange) for targeted long-read sequencing using AS with GridION.

References

Sudmant, P. H. et al. An integrated map of structural variation in 2504 human genomes. Nature 526, 75–81 (2015).
Article PubMed CAS Google Scholar
Uchiyama, Y. et al. Efficient detection of copy-number variations using exome data: batch- and sex-based analyses. Hum. Mutat. 42, 50–65 (2021).
Article PubMed CAS Google Scholar
Mantere, T. et al. Optical genome mapping enables constitutional chromosomal aberration detection. Am. J. Hum. Genet. 108, 1409–1422 (2021).
Article PubMed PubMed Central CAS Google Scholar
Xu, P. et al. OGM and WES identifies translocation breakpoints in PKD1 gene in an polycystic kidney patient and healthy baby delivered using PGT. BMC Med.Genomics 16, 285 (2023).
Article PubMed PubMed Central CAS Google Scholar
Wang, Y., Zhao, Y., Bollas, A., Wang, Y. & Au, K. F. Nanopore sequencing technology, bioinformatics and applications. Nat. Biotechnol. 39, 1348–1365 (2021).
Article PubMed PubMed Central CAS Google Scholar
De Clercq, G. et al. Full characterization of unresolved structural variation through long-read sequencing and optical genome mapping. Sci. Rep. 14, 29142 (2024).
Article PubMed PubMed Central Google Scholar
Sahajpal, N. S., Barseghyan, H., Kolhe, R., Hastie, A. & Chaubey, A. Optical genome mapping as a next-generation cytogenomic tool for detection of structural and copy number variations for prenatal genomic analyses. Genes https://doi.org/10.3390/genes12030398 (2021).
Broeckel, U. et al. Detection of constitutional structural variants by optical genome mapping: a multisite study of postnatal samples. J. Mol. Diagn. 26, 213–226 (2024).
Article PubMed CAS Google Scholar
Zhang, S. et al. Detection of cryptic balanced chromosomal rearrangements using high-resolution optical genome mapping. J. Med. Genet. 60, 274–284 (2023).
Article PubMed CAS Google Scholar
Chan, S. et al. Structural variation detection and analysis using bionano optical mapping. Methods Mol. Biol. 1833, 193–203 (2018).
Article PubMed CAS Google Scholar
Neveling, K. et al. Next-generation cytogenetics: Comprehensive assessment of 52 hematological malignancy genomes by optical genome mapping. Am. J. Hum. Genet. 108, 1423–1435 (2021).
Article PubMed PubMed Central CAS Google Scholar
Gimeno, A. F., Tinker, R. J., Furuta, Y. & Phillips, J. A. 3rd. Prevalence of individuals with multiple diagnosed genetic diseases in the undiagnosed diseases network. Am. J. Med. Genet. A 197, e63888 (2025).
Article PubMed CAS Google Scholar
Iqbal, M. A. et al. Multisite assessment of optical genome mapping for analysis of structural variants in constitutional postnatal cases. J. Mol. Diagn. 25, 175–188 (2023).
Article PubMed PubMed Central CAS Google Scholar
Collins, R. L. et al. A cross-disorder dosage sensitivity map of the human genome. Cell 185, 3041–3055.e3025 (2022).
Article PubMed PubMed Central CAS Google Scholar
Riggs, E. R. et al. Towards an evidence-based process for the clinical interpretation of copy number variation. Clin. Genet. 81, 403–412 (2012).
Article PubMed CAS Google Scholar
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Article PubMed PubMed Central CAS Google Scholar
Shi, M. et al. Identification of microdeletions in candidate genes for cleft lip and/or palate. Birth Defects Res. A Clin. Mol. Teratol. 85, 42–51 (2009).
Article PubMed PubMed Central CAS Google Scholar
Alkuraya, F. S. et al. SUMO1 haploinsufficiency leads to cleft lip and palate. Science 313, 1751 (2006).
Article PubMed Google Scholar
McKnight, D. et al. Multigene panel testing in a large cohort of adults with epilepsy: diagnostic yield and clinically actionable genetic findings. Neurol. Genet. 8, e650 (2022).
Article PubMed Google Scholar
Truty, R. et al. Possible precision medicine implications from genetic testing using combined detection of sequence and intragenic copy number variants in a large cohort with childhood epilepsy. Epilepsia Open 4, 397–408 (2019).
Article PubMed PubMed Central Google Scholar
Richards, A. J. et al. Stickler syndrome and the vitreous phenotype: mutations in COL2A1 and COL11A1. Hum. Mutat. 31, E1461–E1471 (2010).
Article PubMed CAS Google Scholar
Annunen, S. et al. Splicing mutations of 54-bp exons in the COL11A1 gene cause Marshall syndrome, but other mutations cause overlapping Marshall/Stickler phenotypes. Am. J. Hum. Genet. 65, 974–983 (1999).
Article PubMed PubMed Central CAS Google Scholar
Danecek, P. et al. Detection and characterization of copy-number variants from exome sequencing in the DDD study. Genet. Med. Open 2, 101818 (2024).
Article PubMed PubMed Central Google Scholar
Cai, M. et al. Ultrasound phenotype, genetic analysis, and pregnancy outcomes of fetuses with 1p36 deletion syndrome. Mol. Genet. Genom. Med. 13, e70104 (2025).
Article CAS Google Scholar
Shimojima, K. et al. Genomic copy number variations at 17p13.3 and epileptogenesis. Epilepsy Res. 89, 303–309 (2010).
Article PubMed CAS Google Scholar
Hyon, C. et al. A new 17p13.3 microduplication including the PAFAH1B1 and YWHAE genes resulting from an unbalanced X;17 translocation. Eur. J. Med. Genet. 54, 287–291 (2011).
Article PubMed Google Scholar
Stessman, H. A. et al. Targeted sequencing identifies 91 neurodevelopmental-disorder risk genes with autism and developmental-disability biases. Nat. Genet. 49, 515–526 (2017).
Article PubMed PubMed Central CAS Google Scholar
Wilfert, A. B. et al. Recent ultra-rare inherited variants implicate new autism candidate risk genes. Nat. Genet. 53, 1125–1134 (2021).
Article PubMed PubMed Central CAS Google Scholar
Schanze, I. et al. NFIB haploinsufficiency is associated with intellectual disability and macrocephaly. Am. J. Hum. Genet. 103, 752–768 (2018).
Article PubMed PubMed Central CAS Google Scholar
Marinella, G. et al. Further characterization of NFIB-associated phenotypes: report of two new individuals. Am. J. Med. Genet. A 191, 540–545 (2023).
Article PubMed CAS Google Scholar
Recalcati, M. P. et al. Complex rearrangement involving 9p deletion and duplication in a syndromic patient: genotype/phenotype correlation and review of the literature. Gene 502, 40–45 (2012).
Article PubMed CAS Google Scholar
Wolff, D. et al. In-frame deletion and missense mutations of the c-terminal helicase domain of SMARCA2 in three patients with Nicolaides-Baraitser syndrome. Mol. Syndromol. 2, 237–244 (2012).
Article PubMed PubMed Central CAS Google Scholar
Foley, R. et al. Nicolaides-Baraitser syndrome in a patient with hypertrophic cardiomyopathy and SMARCA2 gene deletion. Cardiol. Young-. 32, 821–823 (2022).
Article PubMed Google Scholar
Perovic, D. et al. Chromosomal microarray in postnatal diagnosis of congenital anomalies and neurodevelopmental disorders in Serbian patients. J. Clin. Lab. Anal. 36, e24441 (2022).
Article PubMed PubMed Central CAS Google Scholar
Tassano, E. et al. Interstitial de novo 18q22.3q23 deletion: clinical, neuroradiological and molecular characterization of a new case and review of the literature. Mol. Cytogenet. 9, 78 (2016).
Article PubMed PubMed Central Google Scholar
Xiao, F. et al. Genetic spectrums and clinical profiles of critically ill neonates with congenital auricular deformity in the China Neonatal Genomes Project. Hum. Genet. 142, 1737–1745 (2023).
Article PubMed CAS Google Scholar
Silversides, C. K. et al. Rare copy number variations in adults with tetralogy of Fallot implicate novel risk gene pathways. PLoS Genet. 8, e1002843 (2012).
Article PubMed PubMed Central CAS Google Scholar
Glessner, J. T. et al. Increased frequency of de novo copy number variants in congenital heart disease by integrative analysis of single nucleotide polymorphism array and exome sequence data. Circ. Res. 115, 884–896 (2014).
Article PubMed PubMed Central CAS Google Scholar
Wu, T. & Chen, J. Prenatal diagnosis of Cornelia de Lange syndrome from 12 to 17 weeks’ gestation. Prenat. Diagn. 42, 1511–1513 (2022).
Article PubMed Google Scholar
Pellestor, F., Gaillard, J. B., Schneider, A., Puechberty, J. & Gatinois, V. Chromoanagenesis, the mechanisms of a genomic chaos. Semin. Cell Dev. Biol. 123, 90–99 (2022).
Article PubMed CAS Google Scholar
Zhang, C. Z. et al. Chromothripsis from DNA damage in micronuclei. Nature 522, 179–184 (2015).
Article PubMed PubMed Central CAS Google Scholar
Ou, Z. et al. Microduplications of 22q11.2 are frequently inherited and are associated with variable phenotypes. Genet. Med. 10, 267–277 (2008).
Article PubMed Google Scholar
Barseghyan, H. et al. Comparative benchmarking of optical genome mapping and chromosomal microarray reveals high technological concordance in cnv identification and additional structural variant refinement. Genes https://doi.org/10.3390/genes14101868 (2023).
Miyatake, S. et al. Detecting copy-number variations in whole-exome sequencing data using the eXome Hidden Markov model: an ‘exome-first’ approach. J. Hum. Genet. 60, 175–182 (2015).
Article PubMed CAS Google Scholar
Karakoc, E. et al. Detection of structural variants and indels within exome data. Nat. Methods 9, 176–178 (2011).
Article PubMed PubMed Central Google Scholar
Whitford, W., Lehnert, K., Snell, R. G. & Jacobsen, J. C. Evaluation of the performance of copy number variant prediction tools for the detection of deletions from whole genome sequencing data. J. Biomed. Inform. 94, 103174 (2019).
Article PubMed Google Scholar
Ohori, S. et al. Biallelic structural variations within FGF12 detected by long-read sequencing in epilepsy. Life Sci. Alliance https://doi.org/10.26508/lsa.202302025 (2023).
Coccaro, N. et al. Feasibility of optical genome mapping in cytogenetic diagnostics of hematological neoplasms: a new way to look at DNA. Diagnostics https://doi.org/10.3390/diagnostics13111841 (2023).
Pei, Y. et al. A comparison of structural variant calling from short-read and nanopore-based whole-genome sequencing using optical genome mapping as a benchmark. Genes https://doi.org/10.3390/genes15070925 (2024).
Chaisson, M. J. P. et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 10, 1784 (2019).
Article PubMed PubMed Central Google Scholar
Banu, S. et al. Enhanced resolution of optical genome mapping utilizing telomere-to-telomere reference in genetic disorders. Eur. J. Hum. Genet. 33, 956–959 (2025).
Article PubMed Google Scholar
Mostovoy, Y. et al. Resolution of ring chromosomes, Robertsonian translocations, and complex structural variants from long-read sequencing and telomere-to-telomere assembly. Am. J. Hum. Genet. 111, 2693–2706 (2024).
Article PubMed PubMed Central CAS Google Scholar
Carss, K. J. et al. Comprehensive rare variant analysis via whole-genome sequencing to determine the molecular pathology of inherited retinal disease. Am. J. Hum. Genet. 100, 75–90 (2017).
Article PubMed CAS Google Scholar
AlAbdi, L. et al. Beyond the exome: utility of long-read whole genome sequencing in exome-negative autosomal recessive diseases. Genome Med. 15, 114 (2023).
Article PubMed PubMed Central CAS Google Scholar
Fu, L. et al. Biallelic missense CEP55 variants cause prenatal MARCH syndrome. J. Hum. Genet. 70, 63–66 (2025).
Article PubMed CAS Google Scholar
Fromer, M. et al. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am. J. Hum. Genet. 91, 597–607 (2012).
Article PubMed PubMed Central CAS Google Scholar
Ohori, S. et al. Complex chromosomal 6q rearrangements revealed by combined long-molecule genomics technologies. Genomics 116, 110894 (2024).
Article PubMed CAS Google Scholar
Miyatake, S. et al. Rapid and comprehensive diagnostic method for repeat expansion diseases using nanopore sequencing. NPJ Genom. Med. 7, 62 (2022).
Article PubMed PubMed Central CAS Google Scholar
Loose, M., Malla, S. & Stout, M. Real-time selective sequencing using nanopore technology. Nat. Methods 13, 751–754 (2016).
Article PubMed PubMed Central CAS Google Scholar
Mitsuhashi, S., Ohori, S., Katoh, K., Frith, M. C. & Matsumoto, N. A pipeline for complete characterization of complex germline rearrangements from long DNA reads. Genome Med. 12, 67 (2020).
Article PubMed PubMed Central CAS Google Scholar
Lei, M. et al. Long-read DNA sequencing fully characterized chromothripsis in a patient with Langer-Giedion syndrome and Cornelia de Lange syndrome-4. J. Hum. Genet. 65, 667–674 (2020).
Article PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We express our gratitude to the families for their involvement in this research. We further acknowledge the great technical assistance provided by Mr. Takafumi Miyama, Ms. Sayaka Sugimoto, Ms. Mai Sato, Ms. Nobuko Watanabe, and Ms. Kaori Takabe at the Department of Human Genetics, Yokohama City University Graduate School of Medicine. We also thank Bronwen Gardner, PhD, from Edanz (https://jp.edanz.com/ac) for editing a draft of this manuscript. The Japan Society for the Promotion of Science (JSPS) KAKENHI Grant-in-Aid for Scientific Research [grant numbers JP22K15901 (A.F.), JP23H02829 (S.M.), JP23H02877 (T.M.), JP23K07229 (Y.U.), JP23K15353 (N.T.), and JP24K02230 (N.M.)]; the Takeda Science Foundation (T.M. and N.M.); the Japan Agency for Medical Research and Development (AMED) [grant numbers JP25ek0109674, JP25ek0109760, JP25ek0109617, JP25ek0109648 and JP25ek0109677 (N.M.)]; and Kawano Masanori Memorial Public Interest Incorporated Foundation for Promotion of Pediatrics (S.M.) provided support for this work.

Author information

Authors and Affiliations

Department of Human Genetics, Yokohama City University Graduate School of Medicine, Yokohama, Japan
Li Fu, Satomi Mitsuhashi, Sachiko Ohori, Ken Saida, Yuta Inoue, Kohei Hamanaka, Naomi Tsuchida, Yuri Uchiyama, Eriko Koshimizu, Satoko Miyatake, Takeshi Mizuguchi, Atsushi Fujita & Naomichi Matsumoto
Department of Obstetrics and Gynecology, Juntendo University Faculty of Medicine, Tokyo, Japan
Li Fu, Toshifumi Suzuki, Hiromi Aoi, Rie Seyama & Atsuo Itakura
Genetics Unit, Instituto da Crianca, Hospital das Clinicas, Faculdade de Medicina, Universidade de Sao Paulo, Sao Paulo, Brazil
Chong Ae Kim
Department of Pediatrics, Kyorin University School of Medicine, Tokyo, Japan
Masatoshi Tokita & Yohane Miyata
Department of Medical Genetics, Osaka Women’s and Children’s Hospital, Osaka, Japan
Nobuhiko Okamoto
Department of Genetic Counseling, Asahikawa Medical University Hospital, Asahikawa, Japan
Yoshio Makita
Department of Pediatrics, Jichi Medical University, Shimotsuke, Japan
Hitoshi Osaka
Comprehensive Epilepsy Center, Seirei Hamamatsu General Hospital, Hamamatsu, Japan
Ayataka Fujimoto
Division of Neurology, Saitama Children’s Medical Center, Saitama, Japan
Atsuro Daida
Department of Pediatrics, Niigata University Graduate School of Medical and Dental Sciences, Niigata, Japan
Jun Nirei
Department of Pediatrics, St. Marianna University School of Medicine, Kawasaki, Japan
Noriko Udagawa
Department of Pediatrics, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Japan
Seiichi Hayakawa
Deguchi Pediatric Clinic, Omura, Japan
Kimiko Deguchi
Department of Neuropediatrics, Tokyo Metropolitan Neurological Hospital, Tokyo, Japan
Mitsumasa Fukuda
Department of Pediatrics, Saitama Medical University Hospital, Saitama, Japan
Hiroshi Matsumoto
Department of Pediatrics, Iwate Medical University School of Medicine, Yahaba, Japan
Manami Akasaka
Division of Neonatology, St. Mary’s Hospital, Fukuoka, Japan
Junichiro Okada
Department of Neurology, Graduate School of Medical Sciences, Kumamoto University, Kumamoto, Japan
Yohei Misumi
Department of Pediatrics, Kumamoto University Hospital, Kumamoto, Japan
Jun Kido
Department of Obstetrics and Gynecology, Keiai Hospital, Saitama, Japan
Toshifumi Suzuki
Department of Clinical Genetics, Yokohama City University Hospital, Yokohama, Japan
Haruka Hamanoue, Satoko Miyatake & Naomichi Matsumoto
Department of Rare Disease Genomics, Yokohama City University Hospital, Yokohama, Japan
Naomi Tsuchida, Yuri Uchiyama & Naomichi Matsumoto
Department of Fundamental Cell Technology, Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
Naoki Harada

Authors

Li Fu
View author publications
Search author on:PubMed Google Scholar
Chong Ae Kim
View author publications
Search author on:PubMed Google Scholar
Masatoshi Tokita
View author publications
Search author on:PubMed Google Scholar
Yohane Miyata
View author publications
Search author on:PubMed Google Scholar
Nobuhiko Okamoto
View author publications
Search author on:PubMed Google Scholar
Yoshio Makita
View author publications
Search author on:PubMed Google Scholar
Hitoshi Osaka
View author publications
Search author on:PubMed Google Scholar
Ayataka Fujimoto
View author publications
Search author on:PubMed Google Scholar
Atsuro Daida
View author publications
Search author on:PubMed Google Scholar
Jun Nirei
View author publications
Search author on:PubMed Google Scholar
Noriko Udagawa
View author publications
Search author on:PubMed Google Scholar
Seiichi Hayakawa
View author publications
Search author on:PubMed Google Scholar
Kimiko Deguchi
View author publications
Search author on:PubMed Google Scholar
Mitsumasa Fukuda
View author publications
Search author on:PubMed Google Scholar
Hiroshi Matsumoto
View author publications
Search author on:PubMed Google Scholar
Manami Akasaka
View author publications
Search author on:PubMed Google Scholar
Junichiro Okada
View author publications
Search author on:PubMed Google Scholar
Yohei Misumi
View author publications
Search author on:PubMed Google Scholar
Jun Kido
View author publications
Search author on:PubMed Google Scholar
Toshifumi Suzuki
View author publications
Search author on:PubMed Google Scholar
Hiromi Aoi
View author publications
Search author on:PubMed Google Scholar
Rie Seyama
View author publications
Search author on:PubMed Google Scholar
Haruka Hamanoue
View author publications
Search author on:PubMed Google Scholar
Satomi Mitsuhashi
View author publications
Search author on:PubMed Google Scholar
Sachiko Ohori
View author publications
Search author on:PubMed Google Scholar
Ken Saida
View author publications
Search author on:PubMed Google Scholar
Yuta Inoue
View author publications
Search author on:PubMed Google Scholar
Kohei Hamanaka
View author publications
Search author on:PubMed Google Scholar
Naomi Tsuchida
View author publications
Search author on:PubMed Google Scholar
Yuri Uchiyama
View author publications
Search author on:PubMed Google Scholar
Eriko Koshimizu
View author publications
Search author on:PubMed Google Scholar
Satoko Miyatake
View author publications
Search author on:PubMed Google Scholar
Takeshi Mizuguchi
View author publications
Search author on:PubMed Google Scholar
Atsuo Itakura
View author publications
Search author on:PubMed Google Scholar
Naoki Harada
View author publications
Search author on:PubMed Google Scholar
Atsushi Fujita
View author publications
Search author on:PubMed Google Scholar
Naomichi Matsumoto
View author publications
Search author on:PubMed Google Scholar

Contributions

L.F. conceptualized and designed the study, reviewed the literature, analyzed the data, and drafted the manuscript; C.A.K., M.T., Y.Miy., N.O., Y.Ma., H.O., A.F., A.D., J.N., N.U., S.H., K.D., M.F., H.M., M.A., J.O., Y.Mis, J.K., T.S., H.A., R.S., H.H., S.Mit., S.O., K.S., Y.I., K.H., N.T., Y.U., E.K., S.Miy., T.M., A.I., and N.H. analyzed the data and revised the manuscript; A.F. and N.M. supervised all aspects of the study and revised the manuscript.

Corresponding author

Correspondence to Naomichi Matsumoto.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Supplementary Data 1 (download XLSX )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Fu, L., Kim, C.A., Tokita, M. et al. Completely resolved structural variants by optical genome mapping with adaptive sampling from CNV discovery. npj Genom. Med. 11, 26 (2026). https://doi.org/10.1038/s41525-026-00561-4

Download citation

Received: 14 August 2025
Accepted: 06 March 2026
Published: 26 March 2026
Version of record: 19 May 2026
DOI: https://doi.org/10.1038/s41525-026-00561-4

Subjects

Abstract

Similar content being viewed by others

Full characterization of unresolved structural variation through long-read sequencing and optical genome mapping

Optical genome mapping unveils hidden structural variants in neurodevelopmental disorders

High-resolution structural variant profiling of myelodysplastic syndromes by optical genome mapping uncovers cryptic aberrations of prognostic and therapeutic significance

Introduction

Results

Deletions (Cases 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 16, 19, and 25)

Duplication (Case 28)

Unbalanced translocations (Cases 15, 17, and 26)

Direct duplications (Cases 7, 18, and 24)

Duplication with inversion (Case 6)

Inverted duplications (Cases 23 and 27)

Intrachromosomal insertion events with inverted duplications (Cases 20–22)

Chromoplexy-like event in Case 29

Chromothripsis (Case 30)

Discussion

Methods

Patients

CNV analysis using whole ES

OGM

Targeted long-read sequencing using AS with GridION

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

Supplementary Information (download PDF )

Supplementary Data 1 (download XLSX )

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links