Identifying pathogenic variants in rare pediatric neurological diseases using exome sequencing

Komatsu, Kazuyuki; Kato, Mitsuhiro; Kubota, Kazuo; Fukumura, Shinobu; Yamada, Keitaro; Hori, Ikumi; Shimizu, Kenji; Miyamoto, Sachiko; Yamoto, Kaori; Hiraide, Takuya; Watanabe, Kazuki; Aoki, Shintaro; Furukawa, Shogo; Hayashi, Taiju; Isogai, Masaharu; Harasaki, Takuma; Nakashima, Mitsuko; Saitsu, Hirotomo

doi:10.1038/s41598-024-75020-0

Article
Open access
Published: 21 October 2024

Scientific Reports volume 14, Article number: 24746 (2024)

Abstract

Variant annotations are crucial for efficient identification of pathogenic variants. In this study, we retrospectively analyzed the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) in identifying 271 pathogenic single nucleotide and small insertion/deletion variants (SNVs/small indels). Although variant filtering based on allele frequency is essential for narrowing down candidate variants, we found that 13 de novo pathogenic variants in autosomal dominant or X-linked dominant genes are registered in gnomADv4.0 or 54KJPN, with an allele frequency of less than 0.001%, suggesting that very rare variants in large cohort data can be pathogenic de novo variants. Notably, 38.4% of the candidate SNVs/small indels are registered in the ClinVar database as pathogenic or likely pathogenic, which highlights the significance of this database. SpliceAI can detect candidate variants affecting RNA splicing, leading to the identification of four variants located 11 to 50 bp away from the exon–intron boundary. Prioritization of candidate genes by proband phenotype using the PhenoMatcher module revealed that approximately 95% of the candidate genes had a maximum PhenoMatch score ≥ 0.6, suggesting the utility of phenotype-based variant prioritization. Our results suggest that a combination of multiple annotation tools and appropriate evaluation can improve the diagnosis of rare diseases.

Introduction

Comprehensive genetic analysis using next-generation sequencing has dramatically improved the diagnostic yield of genetic diseases. Approximately 50% of rare neurodevelopmental diseases have been diagnosed using various next-generation sequencing technologies, including exome or genome sequencing and transcriptome sequencing¹. The most commonly used method is exome sequencing, which targets the exons of all genes. Human coding exons contain approximately 17,000 single nucleotide variants (SNVs) and small insertions/deletions (indels)². Exome sequencing can also detect variants in adjacent introns³. The SpliceAI score⁴ is used as computational evidence in the decision tree for intronic variants under the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) framework⁵. However, there is no consensus on how many bases in an intron should be analyzed. Exome sequencing can also be used to detect copy number variations (CNVs) and, thereby, contributes to genetic diagnosis¹.

The first step in narrowing down candidate variants in rare genetic diseases is to exclude common variants. A globally used database for excluding common variants is gnomAD⁶, the largest public open-access reference dataset for human genome allele frequencies (https://gnomad.broadinstitute.org/). Version 4.0 comprises 730,947 exomes and 76,215 genomes and contains SNVs and indels less than 50 bp in length from all ethnicities. In Japan, the 54KJPN database, which comprises genome sequencing data from 54,302 Japanese individuals, has been curated (https://jmorp.megabank.tohoku.ac.jp/)⁷. Data from individuals affected by severe pediatric diseases and their first-degree relatives were excluded from gnomAD. However, pathogenic heterozygous variants for dominant severe pediatric diseases might still be present because of factors such as incomplete penetrance, imprinting, or mosaicism¹. In practice, rare variants with a minor allele frequency equal to or less than 1% are commonly analyzed¹.

ClinVar is a freely accessible data archive provided by NCBI that offers information on the pathogenic significance and phenotypes of human genome variants⁸. It includes details on the submitter of the variant, classifications of the pathogenic significance of the variants, and other clinical data. Variants submitted to ClinVar are classified as pathogenic (P), likely pathogenic (LP), uncertain significance (VUS), conflicting classifications of pathogenicity, and under other categories. As of July 30, 2024, 369,269 P or LP variants among 2,983,625 total variants are registered (https://clinvarminer.genetics.utah.edu/variants-by-significance). The information in ClinVar is useful for identifying pathogenic variants in the exome; however, the extent to which ClinVar can contribute to diagnostic yield remains to be determined.

In this study, we retrospectively analyzed the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) in identifying pathogenic variants using exome sequencing data from probands with rare neurological diseases. Our findings should contribute to improving the diagnostic yield in exome sequencing analyses.

Materials and methods

Probands and initial exome analysis

Experimental protocols were approved by the Institutional Review Board Committee at Hamamatsu University School of Medicine (15–282, 17–163, and 20–207) and Showa University School of Medicine (G219-N and G220-N). Clinical information and peripheral blood samples were obtained after written informed consent was provided by all individuals and/or their legal guardians in agreement with the requirements of Japanese regulations. Using exome sequencing, we analyzed 463 probands with pediatric neurological diseases who were registered in our cohort between April 2016 and March 2024. Their siblings and parents were not included in the 463 probands. Trio-exome analysis, including exome sequencing of the parents, was performed for 44 of the probands. The remaining 419 probands were analyzed using proband-only exome analysis. The exome capture kits and sequencing platforms varied over this period: SureSelect Human All Exon V6 Kit (Agilent Technologies, Santa Clara, CA) and NextSeq500 (Illumina, San Diego, CA) paired-end sequencing (165 probands); xGen Exome Research Panel kit (IDT, Coralville, IA) capture and NextSeq500 sequencing (174 probands) or DNBseq sequencing (33 probands); and Twist Exome 2.0 capture and NovaSeq6000 sequencing (91 probands). Some of these probands have been reported previously^{9,10,11,12,13,14,15,16,17}. Data processing was performed as described previously¹⁸. To explore the existence of CNVs, we used two CNV detection tools, the exome hidden Markov model (XHMM)¹⁹ and jNord²⁰. The phenotypes of the probands were extracted from information provided by the attending physicians. Based on this information, we classified each proband into a group according to the most pronounced phenotype (Table 1).

Table 1 Clinical features and disease inheritance.

Retrospective reanalysis of 242 probands possessing pathogenic SNVs/small indels

To evaluate the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) for identifying pathogenic variants, we retrospectively analyzed 242 exome datasets, excluding CNV analysis, as shown in Supplementary Figure S1. Sequenced reads were aligned to the reference genome (GRCh38) and deduplicated using the fq2bam software from Clara Parabricks v4.2.0 (NVIDIA, Santa Clara, CA). After generation of the base quality score recalibration report using the bqsr software, raw variants were called using the haplotypecaller software (both from Parabricks v4.2.0, compatible with the Genome Analysis Toolkit version 4.3.0). The gVCF files generated for each proband were combined and quality-filtered using GLnexus (https://github.com/dnanexus-rnd/GLnexus). After removing the common variants in this cohort (Allele Frequency > 0.3) using BCFtools²¹, variants in exons and in introns within 50 bp of the exon–intron boundary were annotated with ANNOVAR²², using the following databases: gnomADv4.0 exome (730,947 exomes) and 54KJPN for allele frequency, and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/, version 2024-02-06). From ClinVar, we added annotations for the allele ID (ALLELEID), preferred disease name (CLNDN), tag-value pairs of disease database name and identifier (CLNDISDB), review status for the variation ID (CLNREVSTAT), and clinical significance for the single variant (CLNSIG). These variants were also annotated with SpliceAI⁴. Additionally, we ranked the candidate genes with scores based on the Human Phenotype Ontology terms using the PhenoMatcher module (https://github.com/liu-lab/exome_reanalysis)²³. The most informative common ancestor matrix used for this analysis was created in March 2024 using three datasets (hp.obo, phenotype.hpoa, and genes_to_disease.txt; version 2024-02-08) downloaded from the Human Phenotype Ontology website (https://hpo.jax.org/). Finally, we also annotated the phenotype information extracted from genemap2.txt, which can be downloaded from the Online Mendelian Inheritance in Man (OMIM) website (https://www.omim.org/). This information makes it easy to check the names of the diseases caused by each gene and their inheritance patterns.
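The restriction to exons plus 50 bp of flanking intron can be illustrated with a minimal Python sketch. It is not the ANNOVAR-based procedure used in this study; the function names, the 1-based inclusive exon coordinates, and the example intervals are assumptions made only for illustration.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # hypothetical (start, end), 1-based inclusive


def distance_to_nearest_exon(pos: int, exons: List[Exon]) -> int:
    """Return 0 if pos lies in an exon, else the distance in bp to the nearest exon edge."""
    if not exons:
        raise ValueError("at least one exon interval is required")
    best = None
    for start, end in exons:
        if start <= pos <= end:
            return 0
        # Intronic position: count bases from the closest exonic base.
        dist = start - pos if pos < start else pos - end
        if best is None or dist < best:
            best = dist
    return best


def in_analysis_window(pos: int, exons: List[Exon], flank: int = 50) -> bool:
    """Keep exonic variants and intronic variants within `flank` bp of an exon."""
    return distance_to_nearest_exon(pos, exons) <= flank


# Illustrative (made-up) transcript with two exons.
exons = [(100, 200), (500, 600)]
print(distance_to_nearest_exon(76, exons))   # 24 bp upstream of the first exon
print(in_analysis_window(251, exons))        # 51 bp into the intron: outside the window
```

Changing `flank` shows how widening or narrowing the intronic window changes which positions are retained for annotation.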

Evaluation of variant pathogenicity

Pathogenic variants were defined as variants classified as “Pathogenic” or “Likely pathogenic” according to the ACMG/AMP 2015 guideline²⁴, as well as previously reported pathogenic variants. We confirmed that the phenotypes of the probands were consistent with those described in previous reports by utilizing phenotype information from OMIM and CLNDN. All pathogenic SNVs were confirmed using Sanger sequencing, performed on an ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA), and all pathogenic CNVs were confirmed using quantitative polymerase chain reaction, performed on a StepOnePlus system (Applied Biosystems). For confirmation of de novo variants, we performed trio-exome or Sanger sequencing using proband and parental samples, and confirmed the biological parentage by analyzing 10 microsatellite markers. As an exception, we included a candidate pathogenic intronic L1CAM variant found in a proband with a consistent phenotype and inheritance, although its RNA analysis has not yet been performed.

Results

The median depth of coverage for the 463 exomes was 78.49 (range: 34.26–309.46). Among them, pathogenic variants were detected in 270 probands (58.3%, Fig. 1): 238 probands possessed SNVs or small indels, 28 probands possessed CNVs, and 4 probands possessed both SNVs and CNVs. The information for all the identified pathogenic variants is presented in Supplementary Tables S1 and S2. The most common phenotype was brain malformation (n = 144, 54.8%, Table 1). In 10 probands, dual phenotypes caused by multiple pathogenic variants were identified. The majority of disease inheritance was autosomal dominant (n = 167, 59.6%). A total of 271 SNVs and small indels were detected as pathogenic variants (Fig. 2a). TUBA1A variants were the most frequent among these. Additionally, 33 CNVs were found (10.9%, Fig. 2b). CNVs were observed in 9% of the probands with brain malformation, in 11% of the probands with seizure, in 13% of the probands with abnormal myelination, in 15% of the probands with neurodevelopmental delay, and in 43% of the probands with ataxia. However, no CNVs were detected in cases with involuntary movement, neuromuscular disease, or spastic paraplegia. Most pathogenic CNV regions contained genes or regions with a haploinsufficiency or triplosensitivity score of 3 (Supplementary Table S2). In the probands with brain malformation, which was the most common phenotype in our cohort, TUBA1A was the most frequently observed gene, identified in 12 probands, including one possessing both TUBA1A and SCN8A pathogenic variants. For seizure, SCN1A, which was found in 7 cases, was the most common. TUBB4A, SPTAN1, POLR3A, COL4A1, and CLCN2 variants were each observed in two probands with abnormal myelination (Fig. 2c).

To assess the utility of gnomADv4.0 or 54KJPN in identifying de novo variants in probands, we evaluated the allele frequency of these variants in the databases. A total of 162 de novo variants in autosomal dominant or X-linked dominant genes were confirmed in 164 probands, with one proband having two de novo variants and one proband having three. Five recurrent de novo variants were also observed. Among 162 de novo variants, 13 variants (8.0%) were found in the databases in 14 probands, with an identical variant in two unrelated probands (Table 2). Specifically, two variants were registered in 54KJPN, 11 in gnomADv4.0 exome, and one in both 54KJPN and gnomADv4.0 exome, all with an allele frequency less than 0.001%. These data indicate that pathogenic de novo variants could be observed, albeit very rarely, in the large public cohort databases.
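The practical consequence for filtering can be sketched in Python as follows. This is an illustrative filter, not the pipeline used in this study: the function names and the convention that `None` means "not observed in the database" are assumptions. The 1% cutoff and the 0.001% (1e-5) level come from the text.

```python
from typing import Optional, Sequence

COMMON_AF_CUTOFF = 0.01   # 1% minor allele frequency, as commonly used
ULTRA_RARE_AF = 1e-5      # 0.001%, the level below which all 13 variants fell


def passes_frequency_filter(afs: Sequence[Optional[float]],
                            cutoff: float = COMMON_AF_CUTOFF) -> bool:
    """Keep a variant unless it is common in any database (None = not observed)."""
    return all(af is None or af <= cutoff for af in afs)


def present_but_ultra_rare(afs: Sequence[Optional[float]],
                           threshold: float = ULTRA_RARE_AF) -> bool:
    """True if the variant is observed in at least one database, always below threshold."""
    observed = [af for af in afs if af is not None and af > 0]
    return bool(observed) and all(af < threshold for af in observed)


# Hypothetical (gnomAD, 54KJPN) frequencies for three variants.
absent = (None, None)
ultra_rare = (6.8e-7, None)
common = (0.02, 0.015)

for afs in (absent, ultra_rare, common):
    print(afs, passes_frequency_filter(afs), present_but_ultra_rare(afs))

# Share of de novo variants found in the databases, from the counts in the text.
print(round(100 * 13 / 162, 1))  # 8.0
```

A filter that required complete absence from population databases would discard the second variant, which is the situation the 13 registered de novo variants illustrate.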

Table 2 De novo variants registered in 54KJPN and gnomADv4.

Next, we evaluated the utility of annotation based on ClinVar pathogenicity classifications. Among the SNVs and small indels identified in this study, 38.4% were registered in ClinVar with P or LP classification (Fig. 3a), which underscores the immense utility of this database. Variants unregistered in ClinVar accounted for 48.7% of the variants.
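Grouping variants by their ClinVar annotation, as done for this comparison, can be sketched in Python. The category labels, the function name, and the exact CLNSIG strings handled here are assumptions; ClinVar releases differ in wording (for example, "Conflicting_interpretations_of_pathogenicity" in older files), and "." is assumed to mark a variant without a ClinVar record.

```python
from collections import Counter
from typing import Optional


def clnsig_bin(clnsig: Optional[str]) -> str:
    """Map a CLNSIG annotation string to a coarse category."""
    if clnsig is None or clnsig.strip() in ("", "."):
        return "unregistered"
    value = clnsig.strip().lower()
    if value.startswith("conflicting"):
        return "conflicting"
    # Drop secondary assertions such as '|drug_response' or ',_low_penetrance'.
    primary = value.split("|")[0].split(",")[0]
    terms = set(primary.split("/"))
    if terms & {"pathogenic", "likely_pathogenic"}:
        return "P/LP"
    if "uncertain_significance" in terms:
        return "VUS"
    return "other"


# Hypothetical annotations for a handful of variants.
annotations = [
    "Pathogenic",
    "Pathogenic/Likely_pathogenic",
    "Uncertain_significance",
    "Conflicting_classifications_of_pathogenicity",
    ".",
    "Likely_benign",
]
counts = Counter(clnsig_bin(a) for a in annotations)
print(counts)
```

Tallying the bins over all candidate variants gives proportions of the kind reported above (P or LP versus unregistered).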

Among 24 intronic variants, SpliceAI predicted aberrant splicing with a delta score of 0.2 or above in 22 variants (91.7%). Among these 22 variants, only nine were registered as P or LP in ClinVar (Fig. 3b). Notably, we found four variants that were located more than 10 bp away from the exon–intron boundary and were predicted by SpliceAI to cause aberrant splicing (Table 3, and Supplementary Figure S2). Among these variants, the splicing changes in WDR37 and CEP290 have been confirmed in previous studies^9,25. Three of the four variants have been registered as P or LP in ClinVar, including a WDR37 variant, which was registered by us⁹.
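The delta-score criterion can be sketched in Python. The sketch assumes the pipe-delimited annotation that SpliceAI writes to the VCF INFO field (`ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL`, with comma-separated records when several alleles or genes are annotated); the function names and example strings are hypothetical, and the 0.2 threshold is the one used in the text.

```python
DELTA_SCORE_THRESHOLD = 0.2  # threshold used in this study


def max_delta_score(annotation: str) -> float:
    """Return the largest of DS_AG, DS_AL, DS_DG, DS_DL over all records."""
    best = 0.0
    for record in annotation.split(","):
        fields = record.split("|")
        if len(fields) < 6:
            continue  # malformed or empty record
        for raw in fields[2:6]:
            try:
                best = max(best, float(raw))
            except ValueError:
                continue  # '.' when no score is available
    return best


def predicted_aberrant_splicing(annotation: str,
                                threshold: float = DELTA_SCORE_THRESHOLD) -> bool:
    """Flag a variant whose maximum delta score reaches the threshold."""
    return max_delta_score(annotation) >= threshold


# Hypothetical annotation strings.
print(max_delta_score("T|GENE1|0.00|0.00|0.91|0.08|-28|-46|-2|-31"))
print(predicted_aberrant_splicing("G|GENE2|0.01|0.19|0.00|0.00|1|2|3|4"))
```

Taking the maximum over the four delta scores treats acceptor or donor gain and loss alike, which matches the "either splice site gain or loss" reading used later in the Discussion.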

Table 3 Intronic variants predicted to affect splicing, located 11–50 bp from the exon.

We also evaluated the utility of a phenotype annotation tool, the PhenoMatcher module (https://github.com/liu-lab/exome_reanalysis). Approximately 95% of the candidate genes had maximum PhenoMatch scores of 0.6 or above, and 85.1% of the candidate genes had scores of 1.0 or above (Fig. 3c). Because the maximum PhenoMatch score of 0.3 was used as a threshold in a previous study²³, these data suggest a good correlation between genes and phenotypes, and demonstrate the utility of prioritizing candidate genes.

In this analysis, we combined the gVCF files of probands using GLNexus. In this process, a FOXG1 variant was filtered out (Supplementary Figure S3), which was called in the gVCF. Multisample calling is recommended in GATK best-practice; however, it should be borne in mind that true but low-quality calls might be excluded in the quality filtering step.

Discussion

In this study, we found pathogenic SNVs, small indels, and CNVs in 270 of 463 probands with rare pediatric neurological diseases. Among the identified pathogenic variants, CNVs were observed in approximately 10% of the probands (Fig. 1). Intragenic CNVs were reported to account for 9.8% of the pathogenic or likely pathogenic variants identified through a panel analysis of Mendelian disease genes in a previous study²⁶. In neurological disease cohorts, CNVs detected based on exome sequencing data accounted for 3.8%, 2%, and 1.2% of the variants in neuropathies, movement disorders, and muscle diseases, respectively²⁷. In our cohort, the CNV detection rate for ataxia was 43%, which is higher than the 1% CNV rate reported among the 36 known genes associated with cerebellar ataxia²⁸. This discrepancy may be attributed to differences in cohort characteristics, disease classification criteria, and the small sample size in the present study; however, it is noteworthy that CNVs contribute to the improved diagnostic rate of ataxia. These results confirm that exome sequencing, including CNV analysis, is useful in the genetic diagnosis of pediatric neurological diseases^1,29.

We retrospectively evaluated the impact of four annotations for identifying pathogenic variants in probands with pediatric neurological diseases. To date, approximately 3 million variants have been registered in the ClinVar database. However, 132 out of the 271 pathogenic variants in our cohort were not registered in this database. On the other hand, we also found that ClinVar annotation is of immense value, as 38.4% of the candidate variants had been registered in the ClinVar database as pathogenic or likely pathogenic. These variants could be easily identified by checking the ClinVar annotation, which reduces the burden of manual analysis. Because the ClinVar database is rapidly growing, utilizing the latest information may increase diagnostic yield. For example, the HSD17B4 c.350A>T variant (ID: 18081) was not registered in ClinVar at the time of publication of the previous report¹⁷, but has been registered as “pathogenic” in the latest ClinVar. Because the ClinVar VCF file is updated monthly, the ClinVar annotations should be regularly updated during (re-)analysis.

Notably, four intronic variants have been identified as P/LP or as a strong candidate. These variants were located between 11 and 50 bp away from the exon–intron boundary. SpliceAI is highly sensitive in predicting cryptic new donor or acceptor sites and the loss of canonical splice sites³⁰. Delta scores for either splice site gain or loss were 0.95 or above in three variants, and 0.44 in one variant (Table 3). Three of the four variants are registered as P or LP in ClinVar, highlighting the usefulness of combining ClinVar and SpliceAI annotations for intronic variants. Notably, an L1CAM variant (NM_001278116.2:c.1124-24T>G) was not registered in ClinVar; thus, the SpliceAI annotation alone could contribute to the possible genetic diagnosis of this proband, although RNA analysis should be performed. Depending on the capture efficiency, expanding the analysis region of introns beyond 50 bp from the exon–intron boundary may increase the detection of pathogenic variants in undiagnosed cases. However, our analysis showed that the number of pathogenic intronic variants decreased from 20 within 10 bp to four in the 11–50 bp range, suggesting that the further a variant is from the canonical splice site, the less likely it is to impact splicing. Additionally, as the analysis range of introns expands, the accuracy of called variants decreases³, and analysis time and cost may increase. Considering these factors, our findings suggest that extending the analysis range to 50 bp is practically useful for detecting pathogenic intronic variants in the routine pipeline of exome sequencing in combination with ClinVar and SpliceAI annotations.

We found that 13 de novo variants in 14 probands, with very low allele frequencies, were registered in large public cohort databases. Nine variants were registered as pathogenic or likely pathogenic in ClinVar, but two were classified as having conflicting classifications of pathogenicity and two were unregistered. Pathogenic heterozygous variants for dominant severe pediatric diseases might still be observed due to factors such as incomplete penetrance, imprinting, or mosaicism¹. TUBB3 variants cause fibrosis of the extraocular muscles and cortical dysplasia, which have complete penetrance with a broad spectrum of phenotypes, including mild developmental delay³¹. Therefore, we believe that the broad disease phenotypes of TUBB3-related disorders may explain the identification of one individual harboring the TUBB3 (c.1070C>T) variant in 54KJPN. In contrast, somatic mosaicism may be involved in the case of FOXG1 variants. The c.250del FOXG1 variant was registered as pathogenic with three stars in ClinVar, but was found in nine individuals in gnomADv4.1. However, the allele balance of eight variant carriers was in the 0.2–0.25 range, and that of one variant carrier was in the 0.25–0.3 range. Although our case also shows an allele balance of 0.33, these findings suggest that c.250del could occur as a somatic variant. Therefore, we should be mindful of the fact that very rare variants in large cohort data can be pathogenic de novo variants.
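The allele balance referred to above is the fraction of reads supporting the alternate allele. A minimal Python sketch of this calculation follows; the function names, the read counts, and the 0.3 flagging threshold are assumptions for illustration only, since the text does not define a cutoff for suspected mosaicism.

```python
def allele_balance(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting the alternate allele at a site."""
    total = ref_reads + alt_reads
    if total <= 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def below_heterozygous_expectation(ab: float, upper: float = 0.3) -> bool:
    """Flag an allele balance well below the ~0.5 expected for a germline heterozygote.

    The `upper` threshold is a hypothetical choice, not a value from this study.
    """
    return 0.0 < ab < upper


# Hypothetical read counts (ref, alt).
for ref, alt in [(75, 25), (67, 33), (52, 48)]:
    ab = allele_balance(ref, alt)
    print(ref, alt, round(ab, 2), below_heterozygous_expectation(ab))
```

A fixed threshold ignores sequencing depth; at low coverage, a binomial test against an expected fraction of 0.5 would be a more cautious way to judge whether a skewed allele balance is meaningful.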

The number of genes responsible for Mendelian disorders is continuously increasing. Therefore, updating annotations concerning the gene–disease–phenotype associations will be essential to identify pathogenic variants in recently reported genes in exome (re)analysis³². In this study, we utilized the PhenoMatcher module for prioritizing candidate genes. This program allows for the dynamic incorporation of new knowledge regarding the gene–disease–phenotype associations by updating the most informative common ancestor matrix, which can be created with three datasets (hp.obo, phenotype.hpoa, and genes_to_phenotype.txt; available from the Human Phenotype Ontology website). Therefore, by updating the matrix using the three updated datasets, the risk of overlooking recently reported genes can be minimized. Although the effectiveness of PhenoMatcher in identifying the causative genes in pediatric neurological diseases has not been reported, our cohort, with 95% of the probands having a score of 0.6 or higher, could provide valuable information for determining the cutoff in pediatric neurological diseases. In practice, combining these annotations with predictions of the effects of genetic variants, such as BayesDel³³, CADD³⁴, PolyPhen-2³⁵, or REVEL³⁶, may facilitate the assessment of pathogenicity, especially for variants not annotated in ClinVar³⁷.

The limitations of this study include the small sample size, which does not encompass the entire spectrum of pediatric neurological diseases, and the potential for selection bias considering the cohort consists only of probands collected in our laboratory. Additionally, only a limited number of annotation tools were utilized.

In summary, our evaluation of the utility of various annotation tools in identifying pathogenic variants suggests that a combination of multiple annotations, such as ClinVar and the SpliceAI score, can improve the diagnostic yield of rare diseases. Careful examination is required to avoid overlooking intronic variants and de novo variants that are present at very low frequencies in general population databases.

Data availability

All data obtained in this study are available from the corresponding author (H.S.) upon reasonable request.

References

1. Lee, H. & Nelson, S. F. The frontiers of sequencing in undiagnosed neurodevelopmental diseases. Curr. Opin. Genet. Dev. 65, 76–83 (2020).
2. Ng, S. B. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461(7261), 272–276 (2009).
3. Guo, Y. et al. Exome sequencing generates high quality data in non-target regions. BMC Genom. 13, 194 (2012).
4. Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176(3), 535–548e524 (2019).
5. Walker, L. C. et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI splicing subgroup. Am. J. Hum. Genet. 110(7), 1046–1067 (2023).
6. Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625(7993), 92–100 (2024).
7. Tadaka, S. et al. jMorp: Japanese multi-omics reference panel update report 2023. Nucleic Acids Res. 52(D1), D622–D632 (2024).
8. Landrum, M. J. et al. ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42(Database issue), D980–985. https://doi.org/10.1093/nar/gkt1113 (2014).
9. Samejima, M., Nakashima, M., Shibasaki, J., Saitsu, H. & Kato, M. Splicing variant of WDR37 in a case of neurooculocardiogenitourinary syndrome. Brain Dev. 46(3), 154–159 (2024).
10. Furukawa, S. et al. Two novel heterozygous variants in ATP1A3 cause movement disorders. Hum. Genome Var. 9(1), 7 (2022).
11. Komatsu, K., Fukumura, S., Minagawa, K., Nakashima, M. & Saitsu, H. A new case of concurrent existence of PRRT2-associated paroxysmal movement disorders with c.649dup variant and 16p11.2 microdeletion syndrome. Brain Dev. 44(7), 474–479 (2022).
12. Miyamoto, S. et al. Comprehensive genetic analysis confers high diagnostic yield in 16 Japanese patients with corpus callosum anomalies. J. Hum. Genet. 66(11), 1061–1068 (2021).
13. Miyamoto, S. et al. A boy with biallelic frameshift variants in TTC5 and brain malformation resembling tubulinopathies. J. Hum. Genet. 66(12), 1189–1192 (2021).
14. Miyamoto, S., Nakashima, M., Fukumura, S., Kumada, S. & Saitsu, H. An intronic GNAO1 variant leading to in-frame insertion cause movement disorder controlled by deep brain stimulation. Neurogenetics 23(2), 129–135 (2022).
15. Miyamoto, S. et al. A case of de novo splice site variant in SLC35A2 showing developmental delays, spastic paraplegia, and delayed myelination. Mol. Genet. Genomic Med. 7(8), e814 (2019).
16. Negishi, Y. et al. SCN8A-related developmental and epileptic encephalopathy with ictal asystole requiring cardiac pacemaker implantation. Brain Dev. 43(7), 804–808 (2021).
17. Yamamoto, A. et al. Novel HSD17B4 variants cause progressive leukodystrophy in childhood: Case report and literature review. Child. Neurol. Open. 8(x211048613), 2329048. https://doi.org/10.1177/2329048x211048613 (2021).
18. Watanabe, K. et al. Identification of two novel de novo TUBB variants in cases with brain malformations: Case reports and literature review. J. Hum. Genet. 66(12), 1193–1197 (2021).
19. Fromer, M. et al. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am. J. Hum. Genet. 91(4), 597–607 (2012).
20. Uchiyama, Y. et al. Efficient detection of copy-number variations using exome data: Batch- and sex-based analyses. Hum. Mutat. 42(1), 50–65 (2021).
21. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10(2). https://doi.org/10.1093/gigascience/giab008 (2021).
22. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16), e164 (2010).
23. Liu, P. et al. Reanalysis of clinical exome sequencing data. N. Engl. J. Med. 380(25), 2478–2480 (2019).
24. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17(5), 405–424 (2015).
25. Tsurusaki, Y. et al. The diagnostic utility of exome sequencing in Joubert syndrome and related disorders. J. Hum. Genet. 58(2), 113–115 (2013).
26. Truty, R. et al. Prevalence and properties of intragenic copy-number variation in mendelian disease genes. Genet. Med. 21(1), 114–123 (2019).
27. Pennings, M. et al. Copy number variants from 4800 exomes contribute to ~ 7% of genetic diagnoses in movement disorders, muscle disorders and neuropathies. Eur. J. Hum. Genet. 31(6), 654–662 (2023).
28. Ghorbani, F. et al. Copy number variant analysis of spinocerebellar ataxia genes in a cohort of Dutch patients with cerebellar ataxia. Neurol. Genet. 9(1), e200050. https://doi.org/10.1212/NXG.0000000000200050 (2023).
29. Srivastava, S. et al. Meta-analysis and multidisciplinary consensus statement: Exome sequencing is a first-tier clinical diagnostic test for individuals with neurodevelopmental disorders. Genet. Med. 21(11), 2413–2421 (2019).
30. Barbosa, P., Savisaar, R., Carmo-Fonseca, M. & Fonseca, A. Computational prediction of human deep intronic variation. Gigascience 12. https://doi.org/10.1093/gigascience/giad085 (2022).
31. Poirier, K. et al. Mutations in the neuronal β-tubulin subunit TUBB3 result in malformation of cortical development and neuronal migration defects. Hum. Mol. Genet. 19(22), 4462–4473 (2010).
32. Tan, N. B. et al. Evaluating systematic reanalysis of clinical genomic data in rare disease from single center experience and literature review. Mol. Genet. Genomic Med. 8(11), e1508. https://doi.org/10.1002/mgg3.1508 (2020).
33. Feng, B. J. PERCH: A unified framework for disease gene prioritization. Hum. Mutat. 38(3), 243–251 (2017).
34. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47(D1), D886–d894 (2019).
35. Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7(4), 248–249 (2010).
36. Ioannidis, N. M. et al. REVEL: An ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99(4), 877–885 (2016).
37. König, E., Rainer, J. & Domingues, F. S. Computational assessment of feature combinations for pathogenic variant prediction. Mol. Genet. Genom. Med. 4(4), 431–446 (2016).

Acknowledgements

We would like to thank the patients for participating in this study and the attending physicians for referring clinical data. This work was supported in part by the Japan Society for the Promotion of Science, KAKENHI (Grant number JP20H03641 and JP23H02875) (H.S.), the Japan Agency for Medical Research and Development (AMED) (JP23ek0109549, JP23ek0109674, and JP23ek0109637) (H.S.), the Takeda Science Foundation, and HUSM Grant-in-Aid from Hamamatsu University School of Medicine (M.N. and H.S.).

Author information

Authors and Affiliations

Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, 431-3192, Japan
Kazuyuki Komatsu, Sachiko Miyamoto, Kaori Yamoto, Shintaro Aoki, Shogo Furukawa, Taiju Hayashi, Masaharu Isogai, Takuma Harasaki, Mitsuko Nakashima & Hirotomo Saitsu
Department of Pediatrics, Showa University School of Medicine, Tokyo, 142-8555, Japan
Mitsuhiro Kato
Department of Pediatrics, Gifu University Graduate School of Medicine, Gifu, 501-1194, Japan
Kazuo Kubota
Division of Clinical Genetics, Gifu University Hospital, Gifu, 501-1194, Japan
Kazuo Kubota
Department of Pediatrics, Sapporo Medical University School of Medicine, Sapporo, 060-8556, Japan
Shinobu Fukumura
Department of Pediatric Neurology, Central Hospital, Aichi Developmental Disability Center, Kasugai, 486-0392, Japan
Keitaro Yamada
Department of Pediatrics and Neonatology, Nagoya City University Graduate School of Medical Sciences, Nagoya, 467-8601, Japan
Ikumi Hori
Department of Pediatrics, Aichi Prefectural Welfare Federation of Agricultural Cooperatives Kainan Hospital, Yatomi, 498-8502, Japan
Ikumi Hori
Division of Medical Genetics, Shizuoka Children’s Hospital, Shizuoka, 420-8660, Japan
Kenji Shimizu
Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, 431-3192, Japan
Takuya Hiraide
Department of Neurology, Hamamatsu University School of Medicine, Hamamatsu, 431-3192, Japan
Kazuki Watanabe

Authors

Kazuyuki Komatsu
Mitsuhiro Kato
Kazuo Kubota
Shinobu Fukumura
Keitaro Yamada
Ikumi Hori
Kenji Shimizu
Sachiko Miyamoto
Kaori Yamoto
Takuya Hiraide
Kazuki Watanabe
Shintaro Aoki
Shogo Furukawa
Taiju Hayashi
Masaharu Isogai
Takuma Harasaki
Mitsuko Nakashima
Hirotomo Saitsu

Contributions

H.S.: conceptualization. M.N.: genetic data curation. M.K., K.Ku., S.F., K.Y., I.H., and S.K.: recruitment of patients and their families and phenotype collection. T.H. and K.Ko.: classification of proband phenotypes as a pediatrician and a pediatric neurologist. K.Ko. and H.S.: genetic data curation and writing of the original draft. All authors: writing, review, and editing.

Corresponding author

Correspondence to Hirotomo Saitsu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Komatsu, K., Kato, M., Kubota, K. et al. Identifying pathogenic variants in rare pediatric neurological diseases using exome sequencing. Sci Rep 14, 24746 (2024). https://doi.org/10.1038/s41598-024-75020-0

Received: 12 June 2024
Accepted: 01 October 2024
Published: 21 October 2024
Version of record: 21 October 2024
DOI: https://doi.org/10.1038/s41598-024-75020-0