Abstract
Variant annotations are crucial for efficient identification of pathogenic variants. In this study, we retrospectively analyzed the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) in identifying 271 pathogenic single nucleotide and small insertion/deletion variants (SNVs/small indels). Although variant filtering based on allele frequency is essential for narrowing down candidate variants, we found that 13 de novo pathogenic variants in autosomal dominant or X-linked dominant genes are registered in gnomADv4.0 or 54KJPN, with an allele frequency of less than 0.001%, suggesting that very rare variants in large cohort data can be pathogenic de novo variants. Notably, 38.4% of the candidate SNVs/small indels are registered in the ClinVar database as pathogenic or likely pathogenic, which highlights the significance of this database. SpliceAI can detect candidate variants affecting RNA splicing, leading to the identification of four variants located 11 to 50 bp away from the exon–intron boundary. Prioritization of candidate genes by proband phenotype using the PhenoMatcher module revealed that approximately 95% of the candidate genes had a maximum PhenoMatch score ≥ 0.6, suggesting the utility of phenotype-based variant prioritization. Our results suggest that a combination of multiple annotation tools and appropriate evaluation can improve the diagnosis of rare diseases.
Introduction
Comprehensive genetic analysis using next-generation sequencing has dramatically improved the diagnostic yield of genetic diseases. Approximately 50% of rare neurodevelopmental diseases have been diagnosed using various next-generation sequencing technologies, including exome or genome sequencing and transcriptome sequencing1. The most commonly used method is exome sequencing, which targets the exons of all genes. Human coding exons contain approximately 17,000 single nucleotide variants (SNVs) and small insertions/deletions (indels)2. Exome sequencing can also detect variants in adjacent introns3. The SpliceAI score4 is used as computational evidence in a decision tree for intronic variants under the American College of Medical Genetics/Association for Molecular Pathology (ACMG/AMP) framework5. However, there is no consensus on how many bases in an intron should be analyzed. Exome sequencing can also be used to detect copy number variations (CNVs) and thereby contributes to genetic diagnosis1.
The first step in narrowing down candidate variants in rare genetic diseases is to exclude common variants. A globally used database for excluding common variants is gnomAD6, the largest public open-access reference dataset for human genome allele frequencies (https://gnomad.broadinstitute.org/). Version 4.0 comprises 730,947 exomes and 76,215 genomes, and contains SNVs and indels less than 50 bp in length from all ethnicities. In Japan, the 54KJPN database, which comprises genome sequencing data from 54,302 Japanese individuals, has been curated (https://jmorp.megabank.tohoku.ac.jp/)7. Data from individuals affected by severe pediatric diseases and their first-degree relatives were excluded from gnomAD. However, pathogenic heterozygous variants for dominant severe pediatric diseases might still be present because of factors such as incomplete penetrance, imprinting, or mosaicism1. In practice, rare variants with a minor allele frequency equal to or less than 1% are commonly analyzed1.
ClinVar is a freely accessible data archive provided by NCBI that offers information on the pathogenic significance and phenotypes of human genome variants8. It includes details on the submitter of the variant, classifications of the pathogenic significance of the variants, and other clinical data. Variants submitted to ClinVar are classified as pathogenic (P), likely pathogenic (LP), uncertain significance (VUS), conflicting classifications of pathogenicity, and under other categories. As of July 30, 2024, 369,269 P or LP variants among 2,983,625 total variants are registered (https://clinvarminer.genetics.utah.edu/variants-by-significance). The information in ClinVar is useful for identifying pathogenic variants in the exome; however, the extent to which ClinVar can contribute to diagnostic yield remains to be determined.
In this study, we retrospectively analyzed the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) in identifying pathogenic variants using exome sequencing data from probands with rare neurological diseases. Our findings should contribute to improving the diagnostic yield in exome sequencing analyses.
Materials and methods
Probands and initial exome analysis
Experimental protocols were approved by the Institutional Review Board Committee at Hamamatsu University School of Medicine (15–282, 17–163, and 20–207) and Showa University School of Medicine (G219-N and G220-N). Clinical information and peripheral blood samples were obtained after written informed consent was provided by all individuals and/or their legal guardians in agreement with the requirements of Japanese regulations. Using exome sequencing, we analyzed 463 probands with pediatric neurological diseases who were registered in our cohort between April 2016 and March 2024. Their siblings and parents were not counted among the 463 probands. Trio-exome analysis, which includes exome sequencing of both parents, was performed for 44 of the probands. The remaining 419 probands were analyzed using proband-only exome analysis. The exome capture kits and sequencing platforms varied over this period: SureSelect Human All Exon V6 Kit (Agilent Technologies, Santa Clara, CA) capture and NextSeq500 (Illumina, San Diego, CA) paired-end sequencing (165 probands); xGen Exome Research Panel kit (IDT, Coralville, IA) capture and NextSeq500 sequencing (174 probands) or DNBseq sequencing (33 probands); and Twist Exome 2.0 capture and NovaSeq6000 sequencing (91 probands). Some of these probands have been reported previously9,10,11,12,13,14,15,16,17. Data processing was performed as described previously18. To explore the existence of CNVs, we used two CNV detection tools, the exome hidden Markov model (XHMM)19 and jNord methods20. The phenotypes of the probands were extracted from information provided by the attending physicians. Based on this information, we classified the probands into groups according to their most pronounced phenotype (Table 1).
Retrospective reanalysis of 242 probands possessing pathogenic SNVs/small indels
To evaluate the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) for identifying pathogenic variants, we retrospectively analyzed 242 exome datasets, excluding CNV analysis, as shown in Supplementary Figure S1. Sequenced reads were aligned to the reference genome (GRCh38) and deduplicated using the fq2bam software from Clara Parabricks v4.2.0 (NVIDIA, Santa Clara, CA). After generation of the base quality score recalibration report using the bqsr software, raw variants were called using the haplotypecaller (both from Parabricks v4.2.0, compatible with the Genome Analysis Toolkit version 4.3.0). The gVCF files generated for each proband were combined and quality-filtered using GLNexus (https://github.com/dnanexus-rnd/GLnexus). After removing the common variants in this cohort (allele frequency > 0.3) using BCFtools21, variants in exons and in introns within 50 bp of the exon–intron boundary were annotated with ANNOVAR22, using the following databases: gnomADv4.0 exome (730,947 exomes) and 54KJPN for allele frequency, and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/, version 2024-02-06). From ClinVar, we added annotations for the allele ID (ALLELEID), preferred disease name (CLNDN), tag-value pairs of disease database name and identifier (CLNDISDB), review status for the variation ID (CLNREVSTAT), and clinical significance for this single variant (CLNSIG). These variants were also annotated with SpliceAI4. Additionally, we ranked the candidate genes with scores based on the Human Phenotype Ontology terms using the PhenoMatcher module (https://github.com/liu-lab/exome_reanalysis)23. The most informative common ancestor matrix used for this analysis, created in March 2024 using three datasets (hp.obo, phenotype.hpoa, and genes_to_disease.txt; version 2024-02-08), was downloaded from the Human Phenotype Ontology website (https://hpo.jax.org/).
Finally, we also annotated the phenotype information extracted from genemap2.txt, which can be downloaded from the Online Mendelian Inheritance in Man website (https://www.omim.org/). This information makes it easy to check the names of the diseases caused by each gene and their inheritance patterns.
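The two pre-annotation filters described above (cohort allele frequency and distance from the exon–intron boundary) can be sketched as follows. This is a minimal illustration and not the pipeline used in this study, which relied on BCFtools and ANNOVAR; the `Variant` record, its field names, and the function `keep_for_annotation` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical minimal record; field names are illustrative only."""
    gene: str
    cohort_af: float        # allele frequency within the analysed cohort
    intron_distance: int    # bp from the nearest exon-intron boundary; 0 if exonic


def keep_for_annotation(v: Variant,
                        cohort_af_max: float = 0.3,
                        intron_window: int = 50) -> bool:
    """Apply the two filters described in the text.

    1. Drop variants that are common in the cohort (AF > 0.3).
    2. Keep exonic variants and intronic variants within 50 bp of a boundary.
    """
    if v.cohort_af > cohort_af_max:
        return False
    return 0 <= v.intron_distance <= intron_window


# Made-up example records (not data from this study).
examples = [
    Variant("GENE_A", cohort_af=0.002, intron_distance=0),    # rare, exonic
    Variant("GENE_B", cohort_af=0.002, intron_distance=24),   # rare, near-exon intronic
    Variant("GENE_C", cohort_af=0.002, intron_distance=120),  # rare, deep intronic
    Variant("GENE_D", cohort_af=0.45, intron_distance=0),     # common in the cohort
]
kept = [v.gene for v in examples if keep_for_annotation(v)]
print(kept)
```

Keeping both thresholds as parameters makes it straightforward to test a wider intronic window on undiagnosed cases without touching the rest of the logic.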
Evaluation of variant pathogenicity
Pathogenic variants were defined as variants classified as “Pathogenic” or “Likely pathogenic” according to the ACMG/AMP 2015 guideline24, together with previously reported pathogenic variants. We confirmed that the phenotypes of the probands were consistent with those mentioned in previous reports by utilizing phenotype information from OMIM and CLNDN. All pathogenic SNVs and CNVs were confirmed using Sanger sequencing, performed on an ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA), and quantitative polymerase chain reaction, performed on a StepOnePlus system (Applied Biosystems), respectively. For confirmation of de novo variants, we performed trio-exome or Sanger sequencing using proband and parental samples, and confirmed the biological parentage by analyzing 10 microsatellite markers. As an exception, we included a candidate pathogenic intronic L1CAM variant found in a proband with consistent phenotype and inheritance, although RNA analysis of this variant has not yet been performed.
Results
The median depth of coverage for the 463 exomes was 78.49 (range: 34.26–309.46). Among them, pathogenic variants were detected in 270 probands (58.3%, Fig. 1). Of these, 238 probands possessed SNVs and small indels, 28 probands possessed CNVs, and 4 probands possessed both SNVs and CNVs. The information for all the identified pathogenic variants is presented in Supplementary Tables S1 and S2. The most common phenotype was brain malformation (n = 144, 54.8%, Table 1). In 10 probands, dual phenotypes caused by multiple pathogenic variants were identified. The most common mode of inheritance was autosomal dominant (n = 167, 59.6%). A total of 271 SNVs and small indels were detected as pathogenic variants (Fig. 2a). TUBA1A variants were the most frequent among these. Additionally, 33 CNVs were found (10.9%, Fig. 2b). CNVs were observed in 9% of the probands with brain malformation, in 11% of the probands with seizure, in 13% of the probands with abnormal myelination, in 15% of the probands with neurodevelopmental delay, and in 43% of the probands with ataxia. However, no CNVs were detected in cases with involuntary movement, neuromuscular disease, or spastic paraplegia. Most pathogenic CNV regions contained genes or regions with a haploinsufficiency or triplosensitivity score of 3 (Supplementary Table S2). In the probands with brain malformation, which was the most common phenotype in our cohort, TUBA1A was the most frequently observed gene, identified in 12 probands, including one possessing both TUBA1A and SCN8A pathogenic variants. For seizure, SCN1A, which was found in 7 cases, was the most common. TUBB4A, SPTAN1, POLR3A, COL4A1, and CLCN2 variants were each observed in two probands with abnormal myelination (Fig. 2c).
Fig. 1. Overview of exome sequencing results for 463 probands with pediatric neurological diseases. Exome sequencing was performed for 463 probands, and pathogenic variants including copy number variants (CNVs) were found in 270 probands. Reanalysis was performed for 242 probands possessing pathogenic single nucleotide variants (SNVs)/small indels.
Fig. 2. Variant types and genes in this study. (a) Distribution of the number of variants per disease-causing gene. The number in parentheses indicates the number of genes. The genes with ≥ 4 detected variants are listed. (b) Number of pathogenic single nucleotide variants (SNVs), indels, and copy number variants (CNVs). (c) Proportion of causative genes and CNVs for three major phenotypes. The numbers indicate the number of variants.
To assess the utility of gnomADv4.0 or 54KJPN in identifying de novo variants in probands, we evaluated the allele frequency of these variants in the databases. A total of 162 de novo variants in autosomal dominant or X-linked dominant genes were confirmed in 164 probands, with one proband having two de novo variants and one proband having three. Five recurrent de novo variants were also observed. Among 162 de novo variants, 13 variants (8.0%) were found in the databases in 14 probands, with an identical variant in two unrelated probands (Table 2). Specifically, two variants were registered in 54KJPN, 11 in gnomADv4.0 exome, and one in both 54KJPN and gnomADv4.0 exome, all with an allele frequency less than 0.001%. These data indicate that pathogenic de novo variants could be observed, albeit very rarely, in the large public cohort databases.
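To give a sense of scale for the 0.001% threshold, the arithmetic can be sketched as below. This is an illustrative calculation only: it assumes an autosomal site at which every individual in the database was genotyped, so that the allele number is twice the cohort size, and the function names are hypothetical.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Allele frequency as a fraction (AC / AN)."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


# 0.001% expressed as a fraction.
ULTRA_RARE_AF = 0.001 / 100


def is_below_threshold(allele_count: int, allele_number: int,
                       cutoff: float = ULTRA_RARE_AF) -> bool:
    """True when the observed frequency is below the 0.001% cutoff."""
    return allele_frequency(allele_count, allele_number) < cutoff


# Assumed allele numbers: two alleles per individual at a fully genotyped
# autosomal site (cohort sizes are taken from the text).
AN_54KJPN = 2 * 54_302
AN_GNOMAD_V4_EXOME = 2 * 730_947

# A single heterozygous carrier in 54KJPN is about 0.00092%.
singleton_pct = 100 * allele_frequency(1, AN_54KJPN)
print(f"{singleton_pct:.5f}%")
print(is_below_threshold(1, AN_54KJPN))           # one carrier
print(is_below_threshold(2, AN_54KJPN))           # two carriers
print(is_below_threshold(14, AN_GNOMAD_V4_EXOME)) # fourteen carriers
```

Under these assumptions, a frequency below 0.001% corresponds to a single carrier in 54KJPN, or to at most 14 carriers in the gnomADv4.0 exome set. Real allele numbers are usually lower than the maximum because of per-site coverage, and hemizygous X-linked sites contribute fewer alleles.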
Next, we evaluated the utility of annotation based on ClinVar pathogenicity classifications. Among the SNVs and small indels identified in this study, 38.4% were registered in ClinVar with P or LP classification (Fig. 3a), which underscores the immense utility of this database. Variants unregistered in ClinVar accounted for 48.7% of the variants.
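Grouping variants by their ClinVar classification, as summarized above, could be done along the following lines. This is a hedged sketch rather than the procedure used here: the label strings are assumptions about how CLNSIG values are written in ClinVar VCF releases (both the older "interpretations" and newer "classifications" wording for conflicts are accepted), and `clinvar_group` is a hypothetical helper.

```python
from collections import Counter
from typing import Optional

# Assumed CLNSIG labels indicating conflicting submissions.
CONFLICT_LABELS = {
    "conflicting_classifications_of_pathogenicity",
    "conflicting_interpretations_of_pathogenicity",
}


def clinvar_group(clnsig: Optional[str]) -> str:
    """Map a raw CLNSIG annotation to one of five reporting groups."""
    if clnsig is None or clnsig.strip() in ("", "."):
        return "unregistered"
    # Combined labels such as "Pathogenic/Likely_pathogenic" are split so
    # that each term can be checked on its own.
    normalized = clnsig.replace("|", "/").replace(",", "/")
    terms = {t.strip().lower() for t in normalized.split("/") if t.strip()}
    if terms & CONFLICT_LABELS:
        return "conflicting"
    if terms & {"pathogenic", "likely_pathogenic"}:
        return "P/LP"
    if "uncertain_significance" in terms:
        return "VUS"
    return "other"


# Made-up annotations for illustration (not data from this study).
raw = ["Pathogenic", "Pathogenic/Likely_pathogenic", ".", None,
       "Uncertain_significance",
       "Conflicting_classifications_of_pathogenicity", "Benign"]
summary = Counter(clinvar_group(x) for x in raw)
print(summary)
```

Because ClinVar is updated regularly, a grouping like this is best recomputed against the current release at each reanalysis rather than stored alongside the variants.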
Fig. 3. Impact of each annotation on pathogenic variants. (a) Registration status of the pathogenic variants in ClinVar. Classifications are shown with variant numbers and percentages. P/LP, pathogenic/likely pathogenic; VUS, variant of uncertain significance. (b) Relationship between ClinVar and SpliceAI annotations in 271 pathogenic single nucleotide variants (SNVs). The numbers indicate the number of variants. (c) Distribution of the maximum score in the PhenoMatcher module for each gene. NA, phenotype data not available.
Among the 24 intronic variants, SpliceAI predicted aberrant splicing with a delta score equal to or above 0.2 in 22 variants (91.7%). Among these 22 variants, only nine were registered as P or LP in ClinVar (Fig. 3b). Notably, we found four variants that were located more than 10 bp away from the exon–intron boundary and were predicted by SpliceAI to cause aberrant splicing (Table 3 and Supplementary Figure S2). Among these variants, the splicing changes in WDR37 and CEP290 have been confirmed in previous studies9,25. Three of the four variants have been registered as P or LP in ClinVar, including a WDR37 variant, which was registered by us9.
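The delta-score cutoff used above can be applied to a SpliceAI annotation as sketched below. The sketch assumes the pipe-delimited INFO format written by the SpliceAI command-line tool (ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL, with multiple records separated by commas); the function names and the example string are hypothetical and do not correspond to any variant in this study.

```python
from typing import Optional


def max_delta_score(spliceai_field: str) -> Optional[float]:
    """Return the largest of the four delta scores across all records.

    The four scores cover acceptor gain, acceptor loss, donor gain,
    and donor loss. Returns None if no numeric score is present.
    """
    best: Optional[float] = None
    for record in spliceai_field.split(","):
        parts = record.split("|")
        if len(parts) < 6:
            continue  # malformed or empty record
        for raw in parts[2:6]:
            try:
                score = float(raw)
            except ValueError:
                continue  # e.g. "." for a missing score
            if best is None or score > best:
                best = score
    return best


def predicts_aberrant_splicing(spliceai_field: str,
                               threshold: float = 0.2) -> bool:
    """True when the maximum delta score reaches the threshold."""
    score = max_delta_score(spliceai_field)
    return score is not None and score >= threshold


# Illustrative annotation string with made-up values.
example = "G|GENE1|0.00|0.44|0.01|0.00|-24|5|10|-3"
print(max_delta_score(example))
print(predicts_aberrant_splicing(example))
```

Taking the maximum over all four scores treats splice site gain and loss equally, which matches the way the delta score is reported in the text.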
We also evaluated the utility of a phenotype annotation tool, the PhenoMatcher module (https://github.com/liu-lab/exome_reanalysis). Approximately 95% of the candidate genes had maximum PhenoMatch scores of 0.6 or above, and 85.1% of the candidate genes had scores of 1.0 or above (Fig. 3c). Because the maximum PhenoMatch score of 0.3 was used as a threshold in a previous study23, these data suggest a good correlation between genes and phenotypes, and demonstrate the utility of prioritizing candidate genes.
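The proportion of genes at or above a given score cutoff can be computed as in the following sketch. It is an illustration only: whether genes without phenotype data (NA in Fig. 3c) belong in the denominator is not stated in the text, so excluding them here is an assumption, and the scores shown are made up.

```python
from typing import Iterable, Optional


def fraction_at_or_above(scores: Iterable[Optional[float]],
                         threshold: float) -> float:
    """Fraction of genes whose maximum score is >= threshold.

    Entries that are None (no phenotype data) are excluded from the
    denominator; this is an assumption of the sketch.
    """
    available = [s for s in scores if s is not None]
    if not available:
        raise ValueError("no scores available")
    return sum(s >= threshold for s in available) / len(available)


# Made-up maximum scores per candidate gene (not data from this study).
max_scores = [1.2, 0.7, 0.4, None, 1.0]

for cutoff in (0.3, 0.6, 1.0):
    print(cutoff, fraction_at_or_above(max_scores, cutoff))
```

Evaluating several cutoffs side by side, as in the loop above, is one way to see how many true causative genes would be lost as the threshold is raised.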
In this analysis, we combined the gVCF files of probands using GLNexus. In this process, a FOXG1 variant was filtered out (Supplementary Figure S3), which was called in the gVCF. Multisample calling is recommended in GATK best-practice; however, it should be borne in mind that true but low-quality calls might be excluded in the quality filtering step.
Discussion
In this study, we found pathogenic SNVs, small indels, and CNVs in 270 of 463 probands with rare pediatric neurological diseases. Among the identified pathogenic variants, CNVs were observed in approximately 10% of the probands (Fig. 1). Intragenic CNVs were reported to account for 9.8% of the pathogenic or likely pathogenic variants identified through a panel analysis of Mendelian disease genes in a previous study26. In neurological disease cohorts, CNVs detected based on exome sequencing data accounted for 3.8%, 2%, and 1.2% of the variants in neuropathies, movement disorders, and muscle diseases, respectively27. In our cohort, the CNV detection rate for ataxia was 43%, which is higher than the 1% CNV rate reported among the 36 known genes associated with cerebellar ataxia28. This discrepancy may be attributed to differences in cohort characteristics, disease classification criteria, and the small sample size in the present study; however, it is noteworthy that CNVs contribute to the improved diagnostic rate of ataxia. These results confirm that exome sequencing, including CNV analysis, is useful in the genetic diagnosis of pediatric neurological diseases1,29.
We retrospectively evaluated the impact of four annotations for identifying pathogenic variants in probands with pediatric neurological diseases. To date, approximately 3 million variants have been registered in the ClinVar database. However, 132 of the 271 pathogenic variants in our cohort were not registered in this database. On the other hand, we also found that ClinVar annotation is of immense value, as 38.4% of the candidate variants had been registered in the ClinVar database as pathogenic or likely pathogenic. These variants could be easily identified by checking the ClinVar annotation, which reduces the burden of manual analysis. Because the ClinVar database is rapidly growing, utilizing the latest information may increase diagnostic yield. For example, the HSD17B4 c.350A>T variant (ID: 18081) was not registered in ClinVar at the time of publication of the previous report17, but has been registered as “pathogenic” in the latest ClinVar. Because the VCF-format release of ClinVar is updated monthly, the ClinVar annotations should be regularly updated during (re-)analysis.
Notably, four intronic variants were identified as P/LP or as a strong candidate. These variants were located 11 to 50 bp away from the exon–intron boundary. SpliceAI is highly sensitive in predicting cryptic new donor or acceptor sites and the loss of canonical splice sites30. Delta scores for either splice site gain or loss were 0.95 or above in three variants and 0.44 in one variant (Table 3). Three of the four variants are registered as P or LP in ClinVar, highlighting the usefulness of combining ClinVar and SpliceAI annotations for intronic variants. Notably, an L1CAM variant (NM_001278116.2:c.1124-24T>G) was not registered in ClinVar; thus, the SpliceAI annotation alone could contribute to the possible genetic diagnosis of this proband, although RNA analysis should be performed. Depending on the capture efficiency, expanding the analysis region of introns beyond 50 bp from the exon–intron boundary may increase the detection of pathogenic variants in undiagnosed cases. However, our analysis showed that the number of pathogenic intronic variants decreased from 20 within 10 bp to four in the 11–50 bp range, suggesting that the further a variant is from the canonical splice site, the less likely it is to impact splicing. Additionally, as the analysis range of introns expands, the accuracy of called variants decreases3, and analysis time and cost may increase. Considering these factors, our findings suggest that extending the analysis range to 50 bp is practically useful for detecting pathogenic intronic variants in the routine pipeline of exome sequencing in combination with ClinVar and SpliceAI annotations.
We found that 13 de novo variants in 14 probands, with very low allele frequencies, were registered in large public cohort databases. Nine variants were registered as pathogenic or likely pathogenic in ClinVar, but two were classified as having conflicting classifications of pathogenicity and two were unregistered. Pathogenic heterozygous variants for dominant severe pediatric diseases might still be observed due to factors such as incomplete penetrance, imprinting, or mosaicism1. TUBB3 variants cause fibrosis in extraocular muscles and cortical dysplasia, which have complete penetrance with a broad spectrum of phenotypes, including mild developmental delay31. Therefore, we believe that the broad disease phenotypes of TUBB3-related disorders may explain the identification of one individual harboring the TUBB3 (c.1070C>T) variant in 54KJPN. In contrast, somatic mosaicism may be involved in the case of FOXG1 variants. The c.250del FOXG1 variant was registered as pathogenic with three stars in ClinVar, but was found in nine individuals in gnomADv4.1. However, the allele balance of eight variant carriers was in the 0.2–0.25 range, and that of one variant carrier was in the 0.25–0.3 range. Although our case also shows an allele balance of 0.33, these findings suggest that c.250del could occur as a somatic variant. Therefore, we should be mindful of the fact that very rare variants in large cohort data can be pathogenic de novo variants.
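Allele balance, as discussed above, is the fraction of reads at a site that support the alternate allele. A minimal sketch of the calculation, and of placing a value in a 0.05-wide bin comparable to the ranges quoted from gnomAD, is given below. The read counts are made up, the function names are hypothetical, and the bin width is an assumption chosen to match the ranges quoted in the text.

```python
from typing import Tuple


def allele_balance(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting the alternate allele."""
    total = ref_reads + alt_reads
    if total <= 0:
        raise ValueError("no reads at this site")
    return alt_reads / total


def balance_bin(ab: float, width: float = 0.05) -> Tuple[float, float]:
    """Return the (lower, upper) edges of the bin containing ab."""
    if not 0.0 <= ab <= 1.0:
        raise ValueError("allele balance must be between 0 and 1")
    n_bins = round(1 / width)
    # Clamp so that ab == 1.0 falls in the last bin.
    index = min(int(ab / width), n_bins - 1)
    return round(index * width, 2), round((index + 1) * width, 2)


# Made-up read counts: 20 reference reads and 10 alternate reads.
ab = allele_balance(20, 10)
print(round(ab, 2), balance_bin(ab))
print(balance_bin(0.22))
```

A constitutional heterozygous variant is expected to show an allele balance near 0.5, so values well below that are compatible with, but do not prove, somatic mosaicism; capture bias and alignment artifacts around indels can also lower the observed balance.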
The number of genes responsible for Mendelian disorders is continuously increasing. Therefore, updating annotations concerning the gene–disease–phenotype associations will be essential to identify pathogenic variants in recently reported genes in exome (re)analysis32. In this study, we utilized the PhenoMatcher module for prioritizing candidate genes. This program allows for the dynamic incorporation of new knowledge regarding the gene–disease–phenotype associations by updating the most informative common ancestor matrix, which can be created with three datasets (hp.obo, phenotype.hpoa, and genes_to_phenotype.txt; available from the Human Phenotype Ontology website). Therefore, by updating the matrix using the three updated datasets, the risk of overlooking recently reported genes can be minimized. Although the effectiveness of PhenoMatcher in identifying the causative genes in pediatric neurological diseases has not been reported, our cohort, with 95% of the probands having a score of 0.6 or higher, could provide valuable information for determining the cutoff in pediatric neurological diseases. In practice, combining these annotations with predictions of the effects of genetic variants, such as BayesDel33, CADD34, PolyPhen-235, or REVEL36, may facilitate the assessment of pathogenicity, especially for variants not annotated in ClinVar37.
The limitations of this study include the small sample size, which does not encompass the entire spectrum of pediatric neurological diseases, and the potential for selection bias considering the cohort consists only of probands collected in our laboratory. Additionally, only a limited number of annotation tools were utilized.
In summary, our evaluation of the utility of various annotation tools in identifying pathogenic variants suggests that a combination of multiple annotations, such as ClinVar and the SpliceAI score, can improve the diagnostic yield of rare diseases. Careful examination is required to avoid overlooking intronic variants and de novo variants that are present at very low frequency in general population databases.
Data availability
All data obtained in this study are available from the corresponding author (H.S.) upon reasonable request.
References
Lee, H. & Nelson, S. F. The frontiers of sequencing in undiagnosed neurodevelopmental diseases. Curr. Opin. Genet. Dev. 65, 76–83 (2020).
Ng, S. B. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461(7261), 272–276 (2009).
Guo, Y. et al. Exome sequencing generates high quality data in non-target regions. BMC Genom. 13, 194 (2012).
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176(3), 535–548.e24 (2019).
Walker, L. C. et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI splicing subgroup. Am. J. Hum. Genet. 110(7), 1046–1067 (2023).
Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625(7993), 92–100 (2024).
Tadaka, S. et al. jMorp: Japanese multi-omics reference panel update report 2023. Nucleic Acids Res. 52(D1), D622–D632 (2024).
Landrum, M. J. et al. ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42(Database issue), D980–985. https://doi.org/10.1093/nar/gkt1113 (2014).
Samejima, M., Nakashima, M., Shibasaki, J., Saitsu, H. & Kato, M. Splicing variant of WDR37 in a case of neurooculocardiogenitourinary syndrome. Brain Dev. 46(3), 154–159 (2024).
Furukawa, S. et al. Two novel heterozygous variants in ATP1A3 cause movement disorders. Hum. Genome Var. 9(1), 7 (2022).
Komatsu, K., Fukumura, S., Minagawa, K., Nakashima, M. & Saitsu, H. A new case of concurrent existence of PRRT2-associated paroxysmal movement disorders with c.649dup variant and 16p11.2 microdeletion syndrome. Brain Dev. 44(7), 474–479 (2022).
Miyamoto, S. et al. Comprehensive genetic analysis confers high diagnostic yield in 16 Japanese patients with corpus callosum anomalies. J. Hum. Genet. 66(11), 1061–1068 (2021).
Miyamoto, S. et al. A boy with biallelic frameshift variants in TTC5 and brain malformation resembling tubulinopathies. J. Hum. Genet. 66(12), 1189–1192 (2021).
Miyamoto, S., Nakashima, M., Fukumura, S., Kumada, S. & Saitsu, H. An intronic GNAO1 variant leading to in-frame insertion cause movement disorder controlled by deep brain stimulation. Neurogenetics 23(2), 129–135 (2022).
Miyamoto, S. et al. A case of de novo splice site variant in SLC35A2 showing developmental delays, spastic paraplegia, and delayed myelination. Mol. Genet. Genomic Med. 7(8), e814 (2019).
Negishi, Y. et al. SCN8A-related developmental and epileptic encephalopathy with ictal asystole requiring cardiac pacemaker implantation. Brain Dev. 43(7), 804–808 (2021).
Yamamoto, A. et al. Novel HSD17B4 variants cause progressive leukodystrophy in childhood: Case report and literature review. Child Neurol. Open 8, 2329048X211048613. https://doi.org/10.1177/2329048x211048613 (2021).
Watanabe, K. et al. Identification of two novel de novo TUBB variants in cases with brain malformations: Case reports and literature review. J. Hum. Genet. 66(12), 1193–1197 (2021).
Fromer, M. et al. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am. J. Hum. Genet. 91(4), 597–607 (2012).
Uchiyama, Y. et al. Efficient detection of copy-number variations using exome data: Batch- and sex-based analyses. Hum. Mutat. 42(1), 50–65 (2021).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10(2). https://doi.org/10.1093/gigascience/giab008 (2021).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16), e164 (2010).
Liu, P. et al. Reanalysis of clinical exome sequencing data. N. Engl. J. Med. 380(25), 2478–2480 (2019).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17(5), 405–424 (2015).
Tsurusaki, Y. et al. The diagnostic utility of exome sequencing in Joubert syndrome and related disorders. J. Hum. Genet. 58(2), 113–115 (2013).
Truty, R. et al. Prevalence and properties of intragenic copy-number variation in mendelian disease genes. Genet. Med. 21(1), 114–123 (2019).
Pennings, M. et al. Copy number variants from 4800 exomes contribute to ~ 7% of genetic diagnoses in movement disorders, muscle disorders and neuropathies. Eur. J. Hum. Genet. 31(6), 654–662 (2023).
Ghorbani, F. et al. Copy number variant analysis of spinocerebellar ataxia genes in a cohort of Dutch patients with cerebellar ataxia. Neurol. Genet. 9(1), e200050. https://doi.org/10.1212/NXG.0000000000200050 (2023).
Srivastava, S. et al. Meta-analysis and multidisciplinary consensus statement: Exome sequencing is a first-tier clinical diagnostic test for individuals with neurodevelopmental disorders. Genet. Med. 21(11), 2413–2421 (2019).
Barbosa, P., Savisaar, R., Carmo-Fonseca, M. & Fonseca, A. Computational prediction of human deep intronic variation. Gigascience 12. https://doi.org/10.1093/gigascience/giad085 (2022).
Poirier, K. et al. Mutations in the neuronal β-tubulin subunit TUBB3 result in malformation of cortical development and neuronal migration defects. Hum. Mol. Genet. 19(22), 4462–4473 (2010).
Tan, N. B. et al. Evaluating systematic reanalysis of clinical genomic data in rare disease from single center experience and literature review. Mol. Genet. Genomic Med. 8(11), e1508. https://doi.org/10.1002/mgg3.1508 (2020).
Feng, B. J. PERCH: A unified framework for disease gene prioritization. Hum. Mutat. 38(3), 243–251 (2017).
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47(D1), D886–D894 (2019).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7(4), 248–249 (2010).
Ioannidis, N. M. et al. REVEL: An ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99(4), 877–885 (2016).
König, E., Rainer, J. & Domingues, F. S. Computational assessment of feature combinations for pathogenic variant prediction. Mol. Genet. Genom. Med. 4(4), 431–446 (2016).
Acknowledgements
We would like to thank the patients for participating in this study and the attending physicians for referring clinical data. This work was supported in part by the Japan Society for the Promotion of Science, KAKENHI (Grant number JP20H03641 and JP23H02875) (H.S.), the Japan Agency for Medical Research and Development (AMED) (JP23ek0109549, JP23ek0109674, and JP23ek0109637) (H.S.), the Takeda Science Foundation, and HUSM Grant-in-Aid from Hamamatsu University School of Medicine (M.N. and H.S.).
Author information
Contributions
H.S.: conceptualization. M.N.: genetic data curation. M.K., K.Ku., S.F., K.Y., I.H., and S.K.: recruitment of patients and their families and phenotype collection. T.H. and K.Ko.: classification of the proband phenotype as pediatrician and pediatric neurologist. K.Ko. and H.S.: genetic data curation and writing of the original draft. All authors: writing, review, and editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Komatsu, K., Kato, M., Kubota, K. et al. Identifying pathogenic variants in rare pediatric neurological diseases using exome sequencing. Sci Rep 14, 24746 (2024). https://doi.org/10.1038/s41598-024-75020-0
DOI: https://doi.org/10.1038/s41598-024-75020-0





