Introduction

Hearing loss is one of the most common pathologies of the human auditory system. Genetics remains the biggest contributor to prelingual hearing loss, accounting for 50% of all cases1. Hearing loss can be classified according to different criteria including the degree of the loss, age of onset and associated structures involved in pathogenesis2. The degree of hearing loss ranges from mild to profound according to the World Health Organization (WHO) classification. Extensive research has been carried out to characterize the molecular genetic makeup of sporadic individuals born in different populations worldwide. The contributing genes vary greatly in the individuals lacking pathogenic GJB2 variants; for example variants in MYO7A contribute to the majority of cases in the Palestinian population3. In Switzerland, STRC variants are major contributors to hearing loss in individuals with no relevant history of this phenotype4. A study of Belgian children revealed that variants in MYO15A are the major contributor to sporadic hearing loss5.

In Pakistan, studies exploring the genetics of hearing loss have focused mainly on consanguineous families with multiple individuals affected with severe to profound deafness. Most of the variants in these individuals are identified in a subset of genes including SLC26A4, GJB2, MYO7A, HGF, TMC1 and TMPRSS36, whereas variants in GJB2, MYO15A, OTOF, SLC26A4, TMC1, and TMPRSS3 account for the largest number of cases of moderate to severe hearing loss in this population7. On the other hand, comparatively few comprehensive studies have reported findings from genetic analysis of sporadic hearing loss in Pakistan8. Most related research has evaluated the contributions of GJB2 pathogenic variants and has revealed that GJB2-related hearing loss has a similar incidence rate in participants from different regions of the country. In a cohort of 86 individuals affected with moderate to severe hearing loss from the Punjab province, the percentage contribution of GJB2 variants was 4.65%9. The percentage contribution of GJB2 variants in a cohort of 70 severe to profoundly deaf participants from the Hazara division was 4.28%10 while the prevalence of GJB2-related deafness was 6% in 150 participants recruited from Sindh11.

There are only two comprehensive studies employing next-generation sequencing to characterize the genetic spectrum of hereditary hearing loss in individuals from Pakistan with no history of hearing loss. The first study was carried out on 40 profoundly deaf children from the Khyber Pakhtunkhwa province. Unlike other studies in Pakistan, GJB2 was found to be the major contributor in the cohort accounting for up to 38% of hearing loss followed by SLC26A4 with a 12.5% contribution12. The second study explored the genetic spectrum of moderate to severe hearing loss in 21 unrelated affected individuals from the Punjab province. In this group, OTOF variants were identified as the major contributors followed by variants in GJB2, BSND, SLC26A4 and OTOA which contributed equally8. Both studies revealed high genetic heterogeneity, yet the percentage contribution of each gene varied greatly between the two groups.

Here, we extend these findings by reporting the results of molecular characterization of an additional forty-four individuals born to consanguineous couples with no previous family history of hearing loss.

Results

Hearing loss is predominantly prelingual in participants

Audiometry data showed that 25 out of 44 individuals had a severe degree of hearing loss (77 dB HL–88.75 dB HL). Fourteen participants exhibited moderately severe hearing loss (57.5 dB HL–70 dB HL) while two individuals were affected with a moderate degree of hearing loss (45 dB HL–61.25 dB HL). Hearing loss in four individuals (HLMS02, HLMS06, SPK3 and HLMS40) was reported to have occurred early but with progressive worsening, and it ranged from moderate to profound at the time of recruitment (Fig. 1). All participants, aged 6 to 17 years, showed no obvious symptoms other than hearing loss at the time of recruitment.

Fig. 1

Audiograms representative of the degrees of hearing loss observed in the participants. Audiometry was performed in ambient noise conditions. Circles indicate thresholds from the right ears while crosses represent thresholds from the left ears. Individuals SPK9 and HLRBS1 had asymmetric hearing losses which were classified as moderately severe and severe, respectively. Participant HLMS40 had a profound hearing loss which was reported to have developed progressively.

Variants in a large number of known genes cause hearing loss in the cohort

Good quality exome data were obtained for all participants with a mean depth of coverage of 100X (> 10X = 99.2%). Copy number variant analyses did not uncover any potential pathogenic alleles. Exome data analyses identified variants in 17 deafness genes which could be correlated to the phenotype without any ambiguity (Table 1, Extended Supplementary Data). However, four participants had variants in multiple deafness genes (Table 1, Extended Supplementary Data). The majority of the individuals had biallelic homozygous variants while a few participants exhibited monoallelic variants in these genes. No deleterious homozygous, compound heterozygous or heterozygous variants in known deafness genes were identified for eleven participants (Table 1, Extended Supplementary Data). However, for two of these eleven participants, we shortlisted deleterious variants in two genes that currently have no described roles in causing deafness (Table 1, Extended Supplementary Data).

Table 1 Information regarding the participants and the genetic findings.

Variants in most of the genes were missense, followed by frameshift, nonsense and splice site variants, and an in-frame insertion. Most of the genes characterized in a previous study on moderate to severe hearing loss segregating in familial cases13 were identified in the present study as well. Additionally, a few genes were implicated for the first time in prelingual, severe hearing loss, unlike in previous descriptions. Among them is TOR1AIP1, variants of which have been documented to cause profound deafness14. Also, COL4A3 variants have been previously reported to cause mild hearing loss with onset in late childhood, which is usually progressive in nature15 (Table 1). All new clearly pathogenic variants have been submitted to LOVD (https://www.lovd.nl/) with accession numbers between 00448500 and 00448507. Variants in known deafness genes that were not unambiguously linked to the phenotypes, as well as those in novel genes, were not deposited.

The range of CADD scores obtained for both novel homozygous and heterozygous missense, nonsense and splice site variants was found to be between 22.7 and 37 (Table 1). The newly identified missense variants affected amino acids that were highly conserved throughout evolution (Fig. 2) with few exceptions as detailed below.

Fig. 2

Clustal Omega alignments of protein sequences from diverse vertebrates. (A–D) Alignments of amino acids affected by novel homozygous missense variants show conservation among representative vertebrate species. All missense variants affected completely conserved amino acids except for two. In PEX1, the Pro720 residue was substituted with serine in three out of 100 vertebrate species and asparagine in orangutan (only the sequence from Podarcis muralis is shown here). LHFPL5 Ala178 is replaced by serine in southern platyfish (assessed by https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates). (E–J) Alignments of residues affected by novel heterozygous missense variants. All variants except two affected completely conserved amino acids. The Met440 residue of COCH was replaced by tryptophan in a few fish species. Ala286 in ATP6V1B2 had threonine as the wild-type amino acid in wallaby (assessed by https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates).

Biallelic variants identified in known deafness genes

Biallelic variants known to cause hearing loss were identified in 13 genes. New variants were identified for 7 individuals while 17 participants had previously reported variants (Table 1). Among these, SLC26A4 variants contributed to the majority of individuals’ phenotypes since the hearing loss of 11/44 participants could be attributed to this gene. Seven of them had the same homozygous missense variant c.1337A>G; p.(Gln446Arg). Importantly, the individuals affected with SLC26A4 variants mostly presented severe hearing loss. An exception was observed in the individual affected with the p.(Ser57Ter) variant whose hearing loss was reported to have been moderate at onset and had progressed to profound deafness later in life.

For the participant HLRBS1, exome data analysis yielded a homozygous missense variant c.2036G>A; p.(Cys679Tyr) in HGF. This variant was predicted to be deleterious by multiple software tools with high CADD and REVEL scores (Table 1). Previously, only variants in the 3′ UTR of HGF and a synonymous variant were reported to result in non-syndromic autosomal recessive deafness16.

A homozygous frameshift variant c.673dupC; p.(Gln225fs) in TOR1AIP1 was identified in a participant. At the time of enrollment, the six-year-old child presented bilateral severe sensorineural hearing loss without accompanying syndromic features. The described phenotype for patients with TOR1AIP1 variants is muscular dystrophy, autosomal recessive, with rigid spine and distal joint contractures (OMIM# 617072). Some patients also have deafness. This female child also harbored two heterozygous variants in MYO15A. Both variants had deleterious REVEL and high CADD scores (Table 1). However, the two affected amino acids were not conserved during evolution (https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates). In addition, one of the variants had conflicting entries in ClinVar, among which one was benign (ID VCV000500192.15), while the other variant was deposited with unknown significance (ID VCV000438713.8). Thus, any role of these two MYO15A variants in contributing to hearing loss remains unclear.

Heterozygous variants observed in syndromic and non-syndromic hearing loss genes

For five participants, the analyses did not reveal any homozygous or compound heterozygous variants that could be considered pathogenic, while for two participants both heterozygous and homozygous variants were found (see below). Because the participants were born to unaffected parents and had no history of hearing loss, heterozygous variants were evaluated under the hypothesis of a de novo origin of a pathogenic allele or incomplete penetrance. The affected individuals with heterozygous variants in syndromic deafness genes manifested only hearing loss at the time of recruitment. Whether any accompanying symptoms appeared at an older age remains unknown since we no longer have access to the participants.

A heterozygous splice site variant c.7164+1G>A in CHD7 was correlated with the moderate degree of hearing loss for the participant HLMS33 (Table 1). The variant was absent from all public databases and was previously reported to cause mild CHARGE syndrome17. However, she was also homozygous for a variant of unknown significance in OTOA c.920C>T, p.(Ala307Val) (Extended Supplementary Data). This variant had very low deleterious scores and the amino acid was also not conserved in evolution (https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates). Heterozygous missense variants of unknown significance were identified in two genes, COCH and ATP6V1B2, for the participant HLMS09 (Table 1). The amino acid affected by the COCH variant, p.Met440, is not conserved in fish (https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates). It is thus unclear whether the detected COCH variant contributes to the hearing loss of our participant. In contrast, the ATP6V1B2 p.(Ala286Ser) variant had stronger predictions of being damaging (Table 1) and affected a conserved amino acid (Fig. 2). However, the participant exhibited only hearing loss at a young age, which suggests that either this ATP6V1B2 variant causes an atypical disorder or other associated phenotypes remain to be manifested. We also identified novel missense heterozygous variants of EYA1 and COL4A3 for two participants, HLMS15 and IPK1 respectively, which may explain their hearing loss (Table 1). Similarly, these young participants (6 years old) had not manifested syndromic features at the time of recruitment.

For another participant HPK7, two heterozygous missense variants affecting conserved residues in NCOA3 and MYO7A were shortlisted (Table 1). The MYO7A variant was exceedingly rare while the NCOA3 variant was absent from all public databases. It is possible that the MYO7A variant may be deleterious only when a second pathogenic allele is present, since recessive inheritance is much more common for this gene and because the variant of an adjacent residue to this change also causes recessively inherited deafness18. A heterozygous missense variant of unknown significance in SLC12A2 was identified for a participant who had a prelingual severe hearing loss. Variants of SLC12A2 are known to cause congenital severe to profound deafness as described in families from Japan19, Pakistan and Ghana20 and the phenotype is designated as deafness, autosomal dominant 78 (OMIM 619081). Again, it is possible that the hearing loss may be due to other reasons since SLC12A2 variants involved in dominantly inherited deafness lie in the exon 21 or its 3ʹ splice site affecting the carboxy terminus domain19. The variant in the participant of our study was located within exon 4 which encodes the AA_permease domain.

A participant HLMS03 with prelingual moderately severe degree of hearing loss harbored a heterozygous missense variant in MYH14 c.2161C>T; p.(Arg721Cys). The variant was scored deleterious by various software tools including REVEL (0.68) and SIFT (0). The variant was rare in public databases and had a considerably high CADD score of 29.8. This variant has been previously reported to cause a complex phenotype involving late-onset hearing loss (OMIM 614369). Variants of MYH14 are also known to cause deafness, autosomal dominant 4A (OMIM 600652) with a progressive phenotype that starts during the second decade of life. However, the same individual HLMS03 also had a homozygous frameshift variant, c.120_121delTT in CA5B. This variant was absent from the public databases. The gene CA5B has no known correlation with hearing loss. Furthermore, the probability of loss-of-function intolerance (pLI) score of CA5B is 0 which indicates tolerance to loss-of-function alleles. However, as noted below, such scores cannot be solely used to correlate or exclude candidate genes for recessive disorders21.

Biallelic variants in potential candidate genes

No homozygous or heterozygous variants predicted to be deleterious affecting known deafness genes were found for the remaining 13 participants. However, homozygous variants that were highly damaging were identified for two participants in potential candidate genes. The two variants were absent from the exome data of 300 unrelated individuals and were either absent or exceedingly rare in public databases with no homozygotes. To date no phenotypic matches have been obtained after the gene names were entered into GeneMatcher (https://genematcher.org/).

A splice site variant c.1387+2T>C in EIF5B was considered to potentially cause the hearing loss of a participant. The variant was highly deleterious with SpliceAI and dbscSNV Ada scores of 0.86 and 0.99 respectively, while the CADD score was 34. The variant was rare in the South Asian population with an allele frequency of 0.0002961. The pLI score for EIF5B is 1, indicating high intolerance of the gene to loss-of-function variants22. For individual HLMS04, who had a prelingual moderately severe hearing loss, a homozygous frameshift variant c.290dupA; p.(Gln98fs) was found in FAM78B. The pLI score for FAM78B is low (0.31), which suggests that loss-of-function variants in this gene may be tolerated. However, it is pertinent to note that many bona fide deafness genes, such as GJB2, also have pLI scores of, or approaching, 0, which indicates complete tolerance to loss-of-function variants. pLI scores are thus not sufficient for supporting or refuting candidatures of genes associated with autosomal recessive disorders21. Moreover, there are no individuals homozygous for any FAM78B loss-of-function variant (such as frameshifting, nonsense or canonical splice site variants) in the updated gnomAD v4.1.0 database.

The novel missense variants affect conserved residues and the changes are predicted to alter the protein structures

Clustal Omega alignments revealed that the respective wild-type amino acids are conserved in orthologs from all vertebrate species (Fig. 2 and https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates) with a few exceptions. The proline residue affected by the variant Pro720Leu in PEX1 was not conserved in 4 of 100 species in which the proline residue was substituted by serine in three and asparagine in orangutan (Fig. 2 and https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates). Similarly, LHFPL5 Ala178 was conserved in all species except southern platyfish in which the amino acid is serine. The Met440 amino acid in COCH was replaced by tryptophan in a few fish species while the Ala286 residue of ATP6V1B2 was replaced by threonine in wallaby (https://genome-euro.ucsc.edu, Multiz Alignment of 100 vertebrates).

Protein modeling data of these missense variants suggested that the replacement of the respective wild-type amino acid with a variant amino acid can potentially disrupt these protein structures (Supplementary Fig. 1). The substituted amino acids were predicted to form new bonds with other amino acids for the MYO7A p.(Gln1250Arg), PEX1 p.(Pro720Leu), COCH p.(Met440Thr) and ATP6V1B2 p.(Ala286Ser) variants. A change in the distance of polar bonds with neighboring amino acids is anticipated due to the EYA1 p.(Gly423Ala) and SLC12A2 p.(Ile341Val) variants. For the HGF p.(Cys679Tyr) variant, the substitution reduced the existing bond length and led to the formation of a new bond in the vicinity. Disruption of partial or complete polar bonds was predicted at the point of substitution for the NCOA3 p.(Asp1065Val) and USH1G p.(Ala8Pro) variants, respectively.

Discussion

A large number of participants in the present study were affected by variants of SLC26A4. The allele p.(Gln446Arg) was the primary contributor detected among those individuals who had SLC26A4 variants. Data from Pakistani multiplex families with moderate to severe hearing loss have revealed a similarly high proportion of hearing loss due to the p.(Gln446Arg) allele, with all variants of SLC26A4 contributing to 14.3% of deafness13. The frequency of SLC26A4-related deafness in the current cohort was 25% (Fig. 3A) and was calculated to be 20% (Fig. 3B) when combined with the data from our previous report on sporadic hearing loss8. The contribution of SLC26A4 variants in individuals with no history of hearing loss is ~ 11% greater in our cohort than that observed in families with multiple individuals affected by hearing loss7.
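The percentages quoted above follow from simple proportions, which the short Python sketch below works through. The function name is illustrative only. The count of 11 of 44 comes from the Results; the combined count of 13 is an assumption back-calculated from the reported 20% of 65 participants and is not a figure stated in the text.

```python
def percent_contribution(cases_with_gene, cohort_size):
    """Percentage of a cohort whose hearing loss is attributed to one gene."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100 * cases_with_gene / cohort_size


# Current cohort: 11 of 44 participants had SLC26A4-related hearing loss.
current_pct = percent_contribution(11, 44)

# Combined cohort of 65 (44 + 21). Assumption: 13 cases, back-calculated from
# the reported 20% share; this count is not reported directly in the text.
combined_pct = percent_contribution(13, 65)

print(f"current: {current_pct:.1f}%, combined: {combined_pct:.1f}%")
```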

Fig. 3

Percentage of known deafness genes’ contribution in the cohort. (A) Pie chart showing the percentage contribution of each gene to hearing loss in our study. Percentages were calculated from the data of 44 individuals. SLC26A4 caused hearing loss in 25% of the participants. 25% of the participants remained without a genetic diagnosis. The undiagnosed percentage includes the two individuals with variants in potential candidate genes. (B) Pie chart with the recalculated percentage contribution of each gene to hearing loss after combining the results from 21 participants with a similar phenotype from our previous study8. The final participant count was 65 and the major contributing gene remains SLC26A4.

Variants of many other genes known to cause hearing loss were identified, most of which were biallelic. Variants of all genes, excluding PEX1 and TOR1AIP1 are frequently observed in cohort studies involving families with individuals exhibiting non-syndromic autosomal recessive hearing loss13,23. A striking finding was that no participant was affected by pathogenic variants of TMC1 or TMPRSS3. Pathogenic alleles of these two genes are among the top contributors in large families with multiple individuals having moderate to severe hearing loss7 as well as those with severe to profound deafness in which they account for 6.4% and 4.3% respectively, of hearing loss23.

Among the novel biallelic variants, a missense variant in HGF contributed to hearing loss in one participant. The p.(Cys679Tyr) allele is the first coding variant of HGF to be implicated in hearing loss. Hepatocyte growth factor (HGF) has a signaling role in stria vascularis of the inner ear where it maintains the endolymph homeostasis24. The variant position Cys679 lies in the S1 serine protease domain in close proximity to pseudo-catalytic triad residues including Gln534, Asp578 and Tyr673. The region helps in binding to the MET receptor beta-propeller domain, leading to MET dimerization and signaling. Val495 in HGF forms a salt bridge with the amino acid Asp672 in the activation pocket. This interaction allosterically influences the MET binding site in HGF. Variants disrupting this electrostatic interaction result in reduced HGF-MET binding and compromised downstream signaling25,26. Therefore, the Cys679 variant affects a critical region of HGF.

A seven-year old participant in the study had a p.(Pro720Leu) variant in PEX1 and presented hearing loss only at the time of recruitment. The participant was also a carrier of GJB2 pathogenic nonsense variant (Supplementary table 1). Diseases caused by pathogenic variants in PEX1 include Heimler syndrome, Peroxisome biogenesis disorder 1A (Zellweger syndrome) and Peroxisome biogenesis disorder 1B. The milder form of Zellweger spectrum syndrome manifests as sensorineural hearing loss and retinitis pigmentosa that can be confused with Usher syndrome27. A similar case of mild Zellweger syndrome was detected in a Polish patient who presented early hearing loss before the age of five years while other physical symptoms including spastic paresis, bilateral cataracts and leukodystrophy developed later in life28. Another female patient was initially diagnosed with Usher syndrome at the age of 5 years. Subsequent exome sequencing revealed the presence of compound heterozygous variants in PEX1. Follow-up examination demonstrated that she had mild symptoms of Zellweger disease, such as permanent tooth enamel dysplasia, nail leukoplakia, and biochemical irregularities within the peroxisome29.

A TOR1AIP1 homozygous frameshift variant was considered a likely candidate for causing congenital sensorineural hearing loss in a 6-year-old affected female participant in our study. The variant was located in exon 5 of a gene with a total of ten exons and is therefore likely to result in nonsense-mediated decay of the mutant transcript. Despite the severity of the variant, the participant did not exhibit any symptoms other than hearing loss at the time of recruitment. Phenotypes due to pathogenic variants in TOR1AIP1 are diverse and include heart defects, muscular defects and multi-systemic disorders (OMIM 617072). Symptoms can include deafness in some patients30. Interestingly, the participant also possessed two heterozygous missense variants in MYO15A which were predicted to be deleterious by various software tools. However, these variants impacted amino acids that were not conserved in some vertebrate orthologs (being substituted by non-conservative amino acids in reptiles, amphibians and fish). It is possible that the hearing loss in the affected individual stems from the variant(s) in either TOR1AIP1 or MYO15A, or from a combination of both.

In six participants, predicted deleterious monoallelic variants were identified, among which five affected genes are known to cause syndromic deafness. However, only hearing loss was apparent at the time of enrollment in these participants. Misdiagnosis as nonsyndromic remains a possibility in these instances since, due to the young age of these individuals, late-onset phenotypes may not have developed yet. Heterozygous variants of ATP6V1B2, CHD7, COL4A3 and EYA1 have been reported to cause DDOD31, CHARGE syndrome32, Alport syndrome33 and Branchiootorenal spectrum disorder (BORSD)34, respectively. A heterozygous missense variant in COCH was also identified in the participant with the ATP6V1B2 variant and it is unclear which gene variant is responsible for the phenotype. The identified variant c.7164+1G>A in CHD7 is already known to cause mild atypical CHARGE syndrome in a patient with predominantly bilateral profound deafness17. Similarly, other symptoms due to COL4A3 and EYA1 variants may not manifest at an early age. For example, patients with dominantly inherited Alport syndrome, such as those with COL4A3 variants, exhibit a milder phenotype than patients with recessive disease and the renal phenotype is mostly adult onset. Patients with EYA1 variants are also variably affected and end-stage renal failure is observed in only some cases and usually occurs later in life35.

A male individual with moderate to severe hearing loss harbored a heterozygous missense variant in MYH14 and a homozygous frameshift variant in CA5B. This MYH14 variant was previously reported in one patient with bronchopulmonary dysplasia and one patient with a late-onset complex disease characterized by myopathy, neuropathy and deafness (reported as chr19:50760672, NM_024729.3:c.2038C>T, p.Arg680Cys)36. The late onset of the hearing loss phenotype observed in the previously described patient does not match the prelingual severe hearing loss observed in individual HLMS03 in the present study. Variants of CA5B, a carbonic anhydrase, have not been associated with any disorder. Transcript analysis in mouse inner ear has shown that Ca5b has a unique spatial and temporal expression during inner ear development37. Given that the same MYH14 variant has been described in patients with or without hearing loss, and the lack of CA5B correlation to any disease, it is unclear whether the MYH14 or CA5B variants contribute to the phenotype of the affected individual.

A few participants had variants in genes that are associated with dominantly inherited nonsyndromic hearing loss. Among them, predicted deleterious, heterozygous missense variants in MYO7A and NCOA3 were identified for one participant. Moreover, the MYO7A variant affects the penultimate nucleotide adjacent to the donor site and can alter exon splicing. Different variants of MYO7A have been correlated with deafness, autosomal dominant 11, (OMIM 601317) in several families with multiple affected individuals and many more are known to cause recessively inherited Usher syndrome type 1b (OMIM 276900). However, to date, only two variants of NCOA3 have been described in two multigenerational families to cause nonsyndromic progressive hearing loss38,39. A zebrafish ncoa3 knockout model showed abnormalities in the inner ear, suggesting progressive hearing impairment in adult fish39. Analyses of the two previously described pathogenic p.(Ser937Cys) and p.(Gly970Ala) NCOA3 variants indicate that the predictions to affect protein function are benign and the affected amino acids are not conserved among diverse vertebrate groups. Given this lack of robust evidence for the involvement of NCOA3 in human hearing loss, it is possible that the deleterious variant we detected in this gene was not pathogenic. Additionally, since most MYO7A variants cause deafness only when inherited with a second mutant allele, the monoallelic variant we detected may not have contributed to our participant’s hearing loss as well. Future descriptions of either variant in other affected individuals will clarify this issue.

For eleven individuals, etiology of the hearing loss could not be attributed to deleterious variants in known deafness genes. However, in individual HLRBS16, a variant affecting a canonical donor splice site in EIF5B was a strong candidate pathogenic allele. The variant affects the donor splice site of the seventh exon in a gene with 24 exons. Among other effects, such variants can result in exon skipping, intron retention, or activation of an alternative cryptic splice site within an exon or an intron40. EIF5B catalyzes the joining of the large ribosomal subunit (60S) with the small ribosomal subunit (40S) to form the 80S ribosome via its GTPase activity during protein synthesis in eukaryotic cells41. gEAR42 (https://umgear.org/index.html) and SHIELD43 (https://shield.hms.harvard.edu) analyses revealed the expression of Eif5b in mouse inner and outer hair cells as well as in other cells.

Individual HLMS04 had a homozygous frameshift variant in FAM78B which was predicted to truncate the protein at the carboxy terminus. Most variants in FAM78B described so far are of uncertain significance and there are insufficient data to support its involvement in a specific disease44. Fam78b is expressed in both embryonic and postnatal hair cells, as well as in cochlear hair cells and utricular hair cells in mice45. The expression pattern and the pathogenic variant in a patient with hearing loss make FAM78B a potential candidate for having a role in the auditory system.

The exome analyses of singleton individuals affected with bilateral hearing loss have revealed genetic heterogeneity similar to that observed in cohorts of large multiplex families13. The percentage of individuals without a diagnosis in our cohort was 25%, slightly lower than that observed in genetic studies on large multiplex families from Pakistan, in which the undiagnosed sample rate was 28%13. Although the lack of parental samples prevents validation by segregation of the novel gene variants in the participating families, the highly deleterious nature of the two variants, their absence from control samples and their absence or extremely low population frequencies support additional research to explore the potential candidature of novel genes in hearing.

The study therefore highlights the importance of the molecular characterization of sporadic cases which could lead to the identification of novel genes and gene variants. Further investigations involving functional studies and localization assays of potential candidate genes will provide additional insights and clarify the precise mechanisms and functions of their corresponding proteins in the inner ear.

Methodology

Ascertainment and recruitment of participants

Ethical approval for the study was obtained from the Institutional Review Board (IRB), School of Biological Sciences, University of the Punjab, Lahore, Pakistan (IRB # 00005281). Singletons born to couples with consanguineous marriages were ascertained by visiting special education schools in the Punjab province. All parents were unaffected. The affected participants were recruited for the study based on their hearing thresholds. Individuals with moderate to severe degrees of hearing loss were preferred while those with profound deafness were only recruited if they were reported to have had better hearing at younger ages and worsening was observed progressively. Pure tone thresholds were measured via audiometry under ambient noise conditions using a portable audiometer (DANPLEX DA65) at frequencies of 0.25, 0.5, 1, 2, 4 and 8 kHz. The air-conduction Pure Tone Average (PTA) was calculated by taking the average of pure tone thresholds at 0.5, 1, 2, and 4 kHz. The degree of hearing loss was assigned according to the thresholds of the better hearing ear. Hearing loss ranging from 26 to 40 dB was considered mild, 41–55 dB was classified as moderate, 56–70 dB was designated moderately severe, 71–91 dB was considered severe and above 91 dB was classified as profound46. Detailed questioning about medical history, age of onset, use of ototoxic drugs, trauma, undue noise exposure and any accompanying features was completed to rule out the possibility of environmentally induced deafness or accompanying syndromic features. Blood samples, ranging from 5 to 10 ml, were collected in sterile vacutainer EDTA tubes.
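The pure tone average and grading rules described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: the function names and the example thresholds are hypothetical, values falling between the integer bands listed in the text (for example 55.5 dB) are assigned to the lower band, and the label for averages below 26 dB is not given in the text.

```python
from statistics import mean

# Frequencies (kHz) that enter the pure tone average, as stated in the text.
PTA_FREQUENCIES_KHZ = (0.5, 1, 2, 4)


def pure_tone_average(thresholds_db):
    """Average the air-conduction thresholds at 0.5, 1, 2 and 4 kHz.

    thresholds_db maps frequency in kHz to threshold in dB.
    """
    return mean(thresholds_db[f] for f in PTA_FREQUENCIES_KHZ)


def degree_of_hearing_loss(pta_db):
    """Map a PTA onto the bands listed in the text."""
    if pta_db > 91:
        return "profound"
    if pta_db >= 71:
        return "severe"
    if pta_db >= 56:
        return "moderately severe"
    if pta_db >= 41:
        return "moderate"
    if pta_db >= 26:
        return "mild"
    return "below mild range"  # assumption: this band is not named in the text


def classify_participant(right_ear, left_ear):
    """Grade a participant by the better-hearing ear (the lower PTA)."""
    better_pta = min(pure_tone_average(right_ear), pure_tone_average(left_ear))
    return better_pta, degree_of_hearing_loss(better_pta)


# Hypothetical thresholds (dB) for illustration only.
right = {0.25: 50, 0.5: 55, 1: 60, 2: 65, 4: 70, 8: 75}
left = {0.25: 60, 0.5: 70, 1: 75, 2: 80, 4: 85, 8: 90}
print(classify_participant(right, left))
```

Thresholds at 0.25 and 8 kHz are recorded but do not enter the average, which matches the description above.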

Molecular characterization

DNA was extracted from whole blood using a standard protocol. Cells were lysed by treatment with an equal volume of solution (0.32 M sucrose, 10 mM Tris HCl, 5 mM MgCl2 and 1% Triton X-100; pH 7.6) diluted 1:2 with water and were then centrifuged. The supernatant was discarded after centrifugation and the pellets were washed once or twice with 1:3 diluted lysis solution. The whitish pellets were re-suspended in a buffer (20 mM Tris–HCl pH 8.0, 100 mM NaCl, 4 mM EDTA; pH 7.4), and 10% SDS and 20 mg/ml Proteinase K were added. The samples were incubated at 45°C in a water bath overnight. The next day, the digested proteins in the solution were precipitated using a saturated NaCl solution and the pellets were removed after centrifugation. The DNA was precipitated from the supernatant using an equal volume of ice-cold isopropanol and then washed with 70% ethanol47,48. The extracted DNA pellets were dried at room temperature and re-suspended in 100–300 μL of low TE buffer (10 mM Tris–HCl, 0.2 mM EDTA).

The samples of most of the participants had been previously screened for variants in GJB2 by Sanger sequencing of exon 29 while a few were analyzed later8. Only samples negative for GJB2 variants were subjected to whole-exome sequencing (3billion, Seoul, South Korea). Sequencing was performed on the NovaSeq 6000 platform (Illumina, San Diego, CA) following exome capture with the xGen Exome Research Panel v2 (Integrated DNA Technologies, Coralville, Iowa, USA).

For small nucleotide variation analyses, variants were annotated, filtered, and prioritized using two tools: EVIDENCE, a tool developed internally at 3billion that comprises a daily updated database, customized variant classification, and symptom similarity49, and Franklin (https://franklin.genoox.com/clinical-db/home), a freely available online tool that annotates exome data and allows filtering according to criteria specified by the user (see below). At 3billion, variants with an allele frequency > 5% in gnomAD v2.1.1 (https://gnomad.broadinstitute.org/) were removed except for those previously reported as pathogenic or likely pathogenic at least once in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/). Variants were then classified according to the guidelines of the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP). Symptom similarity between the participant’s symptoms and the known symptoms associated with the disease was measured for each variant by an algorithm developed at 3billion. The final list of rare variants was manually reviewed to select reportable variants. All reportable variants were examined using the Integrative Genomics Viewer (IGV, version 2.9.4). To detect large genomic deletions or insertions, copy number variant analyses were performed using CoNIFER 0.2.250 and 3bCNV, a tool developed at 3billion that uses depth-of-coverage information.
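The allele frequency filter with the ClinVar exception can be sketched as follows. This is an illustration of the stated rule only and is not the EVIDENCE implementation, which is not described here in code. The Variant class, its field names and the example records are hypothetical, and a variant absent from gnomAD is assumed to carry an allele frequency of 0.

```python
from dataclasses import dataclass

MAX_GNOMAD_AF = 0.05  # variants with allele frequency > 5% are removed
RESCUE_CLINVAR = {"pathogenic", "likely pathogenic"}


@dataclass
class Variant:
    """Hypothetical record holding only the fields the filter needs."""
    name: str
    gnomad_af: float      # assumption: 0.0 if the variant is absent from gnomAD
    clinvar: tuple = ()   # every ClinVar classification reported for the variant


def keep_variant(variant):
    """Keep rare variants, plus common ones reported at least once as
    pathogenic or likely pathogenic in ClinVar."""
    if variant.gnomad_af <= MAX_GNOMAD_AF:
        return True
    return any(c.lower() in RESCUE_CLINVAR for c in variant.clinvar)


def filter_common(variants):
    """Return the variants that survive the frequency filter."""
    return [v for v in variants if keep_variant(v)]


# Hypothetical variants, for illustration only.
variants = [
    Variant("var_A", 0.0002),
    Variant("var_B", 0.12, ("Benign",)),
    Variant("var_C", 0.07, ("Benign", "Likely pathogenic")),
]
kept = [v.name for v in filter_common(variants)]
print(kept)
```

In this sketch var_C is retained despite its frequency because one of its ClinVar records is likely pathogenic, which mirrors the "at least once" wording of the exception.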

For data examination using Franklin, all exonic variants and those affecting splice sites up to ± 10 regions were retained. Data were filtered so that only variants with frequencies of less than 0.01 in public databases, including gnomAD v2.1.1, were assessed. Variant frequencies were also checked in the exome data of 300 unrelated, ethnically matched in-house samples, and variants present in > 1.7% of individuals were removed. Pathogenicity predictions by multiple software programs such as REVEL51 (https://sites.google.com/site/revelgenomics/), SIFT52 (https://sift.bii.a-star.edu.sg/) and PolyPhen253 (http://genetics.bwh.harvard.edu/pph2/) were considered. Combined Annotation-Dependent Depletion (CADD)54 scores were assessed for missense, nonsense and splicing variants (https://cadd.gs.washington.edu/). For synonymous, missense and splicing variants, SpliceAI55 (https://spliceailookup.broadinstitute.org/) and dbscSNV Ada56 predictions were also considered. The Franklin database, the literature and OMIM (https://www.omim.org/) were consulted to identify previously described associations of the genes with various disorders.
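The two frequency thresholds applied during the Franklin-based analysis can be expressed as a short check. This is a sketch of the stated cut-offs and not of Franklin itself; the function and parameter names are hypothetical. It assumes that "present in" counts every in-house individual carrying the variant, regardless of zygosity.

```python
PUBLIC_AF_CUTOFF = 0.01       # variants must be rarer than this in public databases
INHOUSE_SAMPLES = 300         # unrelated, ethnically matched in-house exomes
INHOUSE_MAX_FRACTION = 0.017  # variants in > 1.7% of individuals are removed


def passes_frequency_filters(public_af, inhouse_carriers):
    """Apply both frequency thresholds to a single variant.

    public_af: highest allele frequency in the public databases consulted.
    inhouse_carriers: in-house individuals carrying the variant
        (assumption: any zygosity counts as 'present').
    """
    rare_in_public = public_af < PUBLIC_AF_CUTOFF
    inhouse_fraction = inhouse_carriers / INHOUSE_SAMPLES
    return rare_in_public and inhouse_fraction <= INHOUSE_MAX_FRACTION


# With 300 samples, 1.7% corresponds to 5.1 individuals, so a variant seen
# in five individuals is retained and one seen in six is removed.
print(passes_frequency_filters(0.002, 5))
print(passes_frequency_filters(0.002, 6))
```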

The conservation of amino acids affected by missense variants was analyzed by accessing the Multiz Alignment of 100 Vertebrates in the UCSC genome browser (https://genome.ucsc.edu/). Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) was used for the alignment of protein sequences from diverse species. The effects of missense variants on the substituted amino acids were modeled using the open-access PyMOL software (https://pymol.org/2/) (The PyMOL Molecular Graphics System, Version 3.0, Schrödinger, LLC). The publicly available AlphaFold structures57 of the proteins were retrieved from UniProt (https://www.uniprot.org/). The wild-type amino acid was replaced with the mutant amino acid by using the mutagenesis option in PyMOL. The polar contacts of the amino acids of interest were measured for both the wild-type and the mutant proteins. The expression of the novel gene candidates in inner ear cells was assessed from the data in the Shared Harvard Inner-Ear Laboratory Database (SHIELD)43 (https://shield.hms.harvard.edu/) and the gEAR42 portal (https://umgear.org/index.html).

Calculation of percentage contributions

The percentage contribution of deafness genes to hearing loss was calculated for our cohort of 44 participants separately as well as after including the data of 21 sporadic individuals with moderate to severe hearing loss from our previous study8.
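The arithmetic behind the percentage contributions can be sketched as below. Only the cohort sizes (44 participants in this study and 21 from the previous study) come from the text. The gene label and the per-gene counts are hypothetical placeholders and are not results of the study.

```python
CURRENT_COHORT = 44   # participants in the present study
PREVIOUS_COHORT = 21  # sporadic individuals from the previous study


def percentage_contribution(n_individuals, cohort_size):
    """Percentage of a cohort whose hearing loss is attributed to one gene."""
    return round(100 * n_individuals / cohort_size, 2)


# Hypothetical counts for a placeholder gene, for illustration only.
gene_x_current = 4
gene_x_previous = 2

separate = percentage_contribution(gene_x_current, CURRENT_COHORT)
combined = percentage_contribution(
    gene_x_current + gene_x_previous,
    CURRENT_COHORT + PREVIOUS_COHORT,
)
print(separate, combined)
```

For the combined estimate, the counts and the cohort sizes are summed before dividing, so the result is weighted by cohort size and is not the mean of the two separate percentages.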