Introduction

Cochlear implants (CIs) are one of the most successful medical devices helping well over a million recipients across the globe1,2. CIs are the intervention of choice for many severe-to-profoundly deaf children and adults who do not receive adequate benefit from hearing aids. Most individuals with CIs achieve high levels of speech understanding in quiet (often 100% scores for sentences) and hear better than using hearing aids alone3,4,5,6. Despite the great success, there is still a large variability in outcomes, which is affected by many factors such as age of implantation, years of hearing deprivation, or the listening experience with the device2,7,8,9,10. Many researchers suggest that the viability of the electrode-neural interface (ENI, the interface between the CI electrode array and the auditory nerve) is one of the most important factors underpinning CI performance and it needs to be better understood11,12,13. Modelling work has shown that the largest loss of information transmission happens at the ENI14. Potentially by accurately assessing an individual’s ENI status, a personalised intervention plan aimed at delivering better performance could be formed. This systematic review initially outlines the main ENI measures for informing CI re-mapping approaches and evaluates research where these experimental ENI interventions are compared to standard care. Subsequently, we conducted a meta-analysis to estimate the clinical efficacy of such interventions.

Electrode-neural interface

In CI hearing, the electrode array is typically inserted into the scala tympani to replace the role of the inner hair cells (IHC). Instead of the trans-synapse interaction initiated by the neural transmitters, a pair of electrodes is activated to drive local potential changes and hence currents initiating the action potentials along the dendrites of the stimulated spiral ganglion cells (SGC). This crucial interface between the CI electrode and the auditory nerve is often termed the Electrode-Neural Interface (ENI).

To represent a sound, a conventional CI mainly performs two essential tasks per timeframe, coding the frequency and intensity of the sound. Over the timeframes, the temporal variations in sounds are represented. Many aspects of the ENI can affect the quality of these two essential tasks. To code the frequency spectrally (it could also be done temporally but with a more limited range), different SGCs need to be stimulated depending on the frequency of the sound. However, if the placement of the electrode array is subpar and/or the impedance is elevated by the fibrotic tissues, it is difficult to achieve precise stimulations. To code the intensity, sufficient SGCs need to be stimulated simultaneously. If the SGCs at the designated region are degenerated and/or lost, this task could be particularly difficult. Figure 1 demonstrates commonly discussed topics in ENI. To further understand the interactions between these factors of ENI, researchers proposed several important attributes alongside methods to measure them.

Fig. 1
Fig. 1The alternative text for this image may have been generated using AI.
Full size image

Common topics explored that affect the quality of the electrode-neural interface. The four sections marked by the rectangles, from top to down, represent the electrode array in the Perilymph, the soft tissues, the bone tissues, and the SGCs in the modiolus, respectively. Each filled circle represents an electrode contact of the array. Each neurone icon represents a population of SGCs. The colour of each icon marks the CF of the population. (a) The electrical impedance is determined by the resistance of the path between the active and ground electrode passing through the stimulated SGCs. (b) Fibrous tissue and newly formed bony tissue, which are much less conductive, occupy space in cochlea instead of the conductive fluids increases electrical resistance. (c) Top represents further away from the modiolus. (d) Layer of SGCs represent the integrity of them. (e) A frequency range without any functioning SGCs represents a dead region. (f) Left side represents the apical side and lower CF. (g) Left side represents a deeper insertion depth. (h) Scala location (scala-media, scala-tympani or scala-vestibuli) of the electrode array. Note that scala-media is filled with Endolympth not Perilymph.

Channel independence

To be able to separate out different spectral regions, different populations of SGCs should be stimulated when different CI channels are activated. For CI users, speech perception improves with the number of independent channels up to seven or eight channels for the older generations15,16 and sometimes even more channels up to twenty for the newer models and recipients with better residual hearing prior to implantation17,18,19. Unfortunately, this can be challenging for CI hearing since the electrode array is surrounded by the highly conductive perilymph in the cochlea. The spread of excitation reduces channel independence between adjacent electrodes; cross-talk between spatially separate electrodes makes stimulation patterns difficult to distinguish despite different input signals20,21. Bierer11 described two main causes for increased spread of excitation, obstructed current path and electrode position (see Fig. 1). Cochlear implantation causes inflammatory responses in the soft and hard tissues in cochlea. Immune cells release cytokines and growth factors, which spread easily in the fluid-filled chamber, that signal fibroblasts to activate22. An analysis of 20 implanted human cochleae using 3D reconstructions reported that about 25% of cochlea’s volume was occupied by fibrous tissue and another 7% was occupied by newly formed bony tissue23. The fibrotic tissue encapsulates the electrode array increasing the resistance between the active and ground electrodes, resulting in the need for a higher potential difference (or current level, following the CI fitting software convention) to achieve sufficient auditory nerve stimulation24. The impedance typically increases shortly after implantation24,25 and stabilise over the course of electrical stimulation26,27,28,29. In some cases, the electrode array pierces the basilar membrane causing some contacts to sit in the scala vestibuli, which results in both fibrosis and sub-optimal electrode position and hence further elevates the resistance30. Similarly, if the distance between the electrode and the target SGCs is increased due to the electrode placement, a higher current level might be needed31. Altogether, the increased spread of excitation reduces the much-needed channel independence in CI hearing.

Neural health

The health of SGCs could be considered in two essential perspectives, the ability to code intensity and frequency. In typical hearing, the dynamic range for encoding loudness is achieved by different SGC populations, with the similar characteristic frequency (CF), having different firing characteristics and firing at different parts of the intensity range32. The frequency information is conveyed by both the CF of the firing population and the spiking rate of it33. However, this is challenging for CI hearing since the degeneration and or loss of SGCs, combined with the broad stimulation, makes precise control over the stimulation pattern and rate unachievable. Thus, the stimulated population is often comprised of many populations that convey different information in a typical hearing setting. The mechanism for loss and degeneration of SGCs is not fully understood but it is associated with ageing, deafness, hearing trauma, and exposure to ototoxic drugs34,35,36. The general trend of degeneration occurs from dendrites to the cell body37,38 and from the base to the apex of the cochlea39. The loss/degeneration of SGCs leads to fewer steps between the lowest and loudest perceivable loudness creating a narrower dynamic range that could be challenging in speech perception40. In some cases, the total loss of SGCs within a frequency band creates a ‘dead region’ (a spectral hole) on the frequency map41,42. Increasing current level would not resolve the dead region but introduce off-frequency hearing. That is, the stimulated SGCs are not the intended recipients determined by the CI coding strategy. In addition, the elevated current level needed due to the loss/degeneration of SGCs can affect the channel independence.

Insertion depth and frequency to place mismatch

The SGCs are organised tonotopically (Fig. 1). The ones with a lower CF are located at the apical part of the cochlea43,44. The electrode array is typically inserted into the Scala Tympani with an insertion depth (commonly measured in mm away from round window or degree angle starting from 0 clockwise) around 360–540 degrees covering only a portion of the whole 2.5 turns of the healthy cochlea45. Thus, the depth of insertion could affect the distance between the electrode and the designated SGC population. The difference between the CF of a channel and the CF of the stimulated SGCs is often referred to as the Frequency-to-Place Mismatch (FTPM). FTPM is often estimated using imaging tools (as discussed below) or electrophysiological methods46.

Viability

In this review, we refer to the term, viability of the ENI, as of the effectiveness of information transmission due to neural, non-neural and mixed factors (e.g., the health of the auditory nerve, electrode placement and the spread of excitation due to both). The more “viable” the ENI is, the better the information transmission will be.

Estimating the viability of ENI

A range of approaches have been proposed to estimate the viability of the ENI. Such approaches involve imaging techniques, psychophysical measures (subjective, requiring patients to be able to do the task and respond), and electrophysiological (objective) measures. It is impractical to directly measure the status of the ENI without causing damage. Thus, there are always compromises and trade-offs involved in the proposed ENI measures. In the following paragraphs, approaches are introduced first and followed by the applications on CI mappings.

Imaging tools

Imaging techniques including X-ray and Computed Tomography (CT) with various resolutions are used to estimate the electrode-to-modiolus distance (EMD), determine the location of the electrode, and diagnose insertion trauma12,31,47,48,49,50,51,52,53,54,55,56. Knowing these attributes, researchers proposed methods of estimating the FTPM and channel interactions.

Psychophysical measures of ENI viability

The psychophysical measures of the ENI use detection and discrimination tasks that can explain and quantify degrees of channel independence and the effectiveness of processing sounds falling within the same channel.

Several electrode discrimination tasks have been used to locate electrodes or regions of electrodes that are not perceptually distinct from one another57,58,59,60,61. The common paradigms often include stimulating different electrodes to test if they are discriminable57,61. Another paradigm involves a pitch ranking task sequentially stimulating a pair of electrodes and ask the participant to indicate the one with a higher pitch. An pair of electrodes is considered discriminable if the correct rate of identification is reached within certain number of trials58,59,60. In this review, we refer both paradigms as to an electrode discrimination task despite the difference between identification (stating high as high is considered correct) and discrimination (stating the difference itself warranties correctness). An indiscriminable electrode potentially suggests significant stimulation overlap with neighbouring electrodes. It is worth noting that discrimination tasks could also be cued by the change of loudness. As a result, the successful discrimination between electrodes could be partially informed by the change of loudness instead of the change of perceived pitch.

(Frequency) tuning curves have been used to estimate channel interactions in several studies62,63,64. The probe-detection thresholds for different masker-probe combinations are plotted to form the tuning curve indicating the change in magnitude of the threshold for each masker. This provides a measure of frequency selectivity and broadness of current spread.

Zhou65 reported that the detection threshold at a lower pulse rate, 80 pulses per second (pps), correlated more closely (adjusted R squared of 0.526) with the spatial selectivity (measured with a forward masking paradigm such as the tuning curve method) compared with the threshold measured at the conventional stimulation rate of 900 or higher pps and hence proposed the use of low rate threshold as a practical option of measuring the spatial selectivity.

Tripolar stimulation was proposed to enhance the spatial selectivity by applying a biphasic pulse to one stimulation electrode, and applying an opposite polarity biphasic pulse (reduced in magnitude) to the two electrodes to the right and left of the stimulation electrode. Bierer66 measured detection thresholds using tripolar stimulation and found a correlation between the performance on a clinical speech test and the variability in tripolar threshold across electrodes. It was suggested that higher tripolar thresholds were indicative of a poorer neural survival or electrode placement.

Changing the pulse-rate of an electric pulse train at one electrode changes the voltage needed to elicit an audible sensation. Multipulse-integration (MPI) is a measure proposed to describe neural health at a particular electrode site (within channel). It is measured by estimating the threshold differences between pulse trains with different rates at a fixed duration. Normally, it is reported in terms of slope (change in dB/ change in rate) and a steeper slope reflects better neural survival when studied with animal models67,68,69. In these animal models, it is hypothesised that a site with better neural health (higher population of SGCs and in better status) should be able to respond to the additional pulses better and hence demonstrate a lower threshold. Zhou and Dong70 measured MPI with CI recipients and found that MPI correlated with the spread of excitation measured as the slope of the tuning curve.

The Polarity Effect, the difference between detection thresholds due to the polarity of biphasic or multi-phasic pulses (e.g., anodic versus cathodic first), has been proposed to reflect the status of the neural health. Based on modelling works, it is predicted that the anodic leading pulses excite more degenerated or demyelinated peripheral processes of the human auditory nerve71,72. Thus, polarity effects could be a marker of neural degeneration. Studies with human CI users reported this finding consistently over recent decades73,74,75. Furthermore, this effect can also be observed objectively through electrophysiological methods76.

Electrophysiological measures of ENI

Electrically-evoked compound potentials (eCAP), are the temporal responses from a group of SGCs recorded directly from the CI electrodes77. A typical eCAP recording shows a valley followed by one peak (a N1-P1 response). With the increasing stimulation level, the amplitude grows correspondingly. To determine if a response is present, the morphology of the eCAP is analysed. However, the correlation between the eCAP threshold and behavioural thresholds is moderate with a Pearson’s r around 0.5–0.678,79,80. The challenge is potentially related to the eCAPs being measured at a much lower stimulation rate (e.g. 30 Hz) than the clinical coding strategy rate (around 900 pps or higher). eCAP could be used as an alternative to the behaviour response. For instance, polarity effects have been explored with eCAP recordings76. In addition to being an objective alternative, eCAP are used in algorithms that estimate the ENI status automatically. For instance, tuning curves constructed by eCAP recordings across the array with an algorithm modelling the effect of neural health and current spread has been demonstrated81,82. In addition, analysis of eCAP recordings was proposed to identify spread of excitation and was reported to correlate with speech perception83.

Several eCAP studies reported that the amplitude grows as a function of the gap between the two phases of a pulse (termed interphase gap effect, IPG). Studies with rodents84,85 have shown a correlation between the IPG effect and neural density. Similarly, studies with CI users showed correlation between speech perception and the IPG effect.

Interventions informed by ENI

If the ENI is extensively damaged it could be a bottleneck for the transmission of sound14. Many of the measures that have been explored have not yet been used to re-map CIs but all ENI measures indicate high variability between participants and, more importantly, within participants across electrodes. However, clinicians often use the same approach to mapping for all patients: eCAP-Thresholds for infants, and uncooperative adults, subjective comfortable and threshold values for the reminder of patients while electrodes causing pain or facial nerve stimulation are reduced in stimulation level or disabled. This approach is efficient clinically, yet, leaving room for improvement. The mapping approaches informed by an ENI measure that have been investigated are typically either electrode deactivation or frequency re-allocation of the channels.

Electrode deactivation

Several publications showed the result of deactivating electrodes with a poor ENI56,58,59,60,86,87,88,89,90,91,92,93,94,95,96. For a typical electrode deactivation intervention (also termed site selection strategy), the researchers estimate the ENI status of each electrode and deactivate the ones with poor ENI status. The tool adopted to estimate ENI and the rules used to deactivate electrodes varies between approaches. Yet, they shared the same objective of reducing channel interactions assuming channels with poor ENI status are more likely to introduce distortion, off-frequency listening and interference.

Adjustments of frequency allocation

Changes of the frequency allocation table were seen in several articles. Two articles tailored the frequency allocation for each CI user based on responses from discrimination tasks97,98. Others aimed at reducing FTPM to have a better match to the CF of the stimulated SGCs48,49,50,52,55,99,100,101. Interventions of this type starts with an estimate of electrode location. Next, the location is transformed into CF frequency using different mapping functions43,44. Lastly, the frequency allocation is changed aiming to reduce the FTPM.

Research objectives

It is unclear which intervention might be most helpful for the different ENI issues. To identify promising interventions, this review evaluated research articles on re-mapping that were informed by ENI measures. A Participant, Interventions, Comparison, Outcome, Study (PICOS102) method was adopted to answer the following question:

Do interventions informed by individual ENI measures improve speech outcomes in adult CI users compared with the clinical settings?

A meta-analysis was conducted to quantify the key findings of the included articles. In addition to the main question, the interaction between ENI measures and interventions was also explored.

Method

Data sources and inclusion criteria

A systematic review registered with PROSPERO (international prospective register of systematic reviews, ID CRD42021292483) was performed in accordance with the PRISMA 2020 statement103. The study protocol was designed and registered before starting to ensure the transparency. Six databases, MEDLINE, EMBASE, TRIP, Scopus, Web of Science, and CINAHL, were searched in April 2024. Inclusion criteria for articles were guided by the PICOS question. Participant refers to the population of interest. Adult CI users were selected because these are the group who can reliably respond in psychophysical tasks; also, most literature on remapping has been conducted with adults. Interventions looked at changes in CI settings and training regimes that were specifically informed by ENI measures, not requiring surgical or pharmaceutical approaches. Comparator was set as the daily-used clinical map to provide a scenario closer to the practice. Outcomes of interest were speech perception in quiet and noise and other correlated tests (e.g., spectral ripple, spectral-temporally modulated test). Studies to be included were restricted to primary human studies so that each experiment contributes as one piece of evidence only. The search term for the databases was defined by the combination of CI, ENI measures, and desired outcome measures (Fig. 2). See appendix A for the exact search terms used in different electronic databases.

Fig. 2
Fig. 2The alternative text for this image may have been generated using AI.
Full size image

Search terms. The asterisk stands for any character in the search bar (e.g., implant* gives results from implant, implantation … etc.). The terms are separated by a comma. Within each box, terms are combined with the Boolean OR. Between boxes, terms are combined with the Boolean AND.

Selection process

Duplicates were removed from the articles arising from the search. The list was processed against the inclusion criteria by the corresponding author using titles and abstracts. The eligible publications then went through the data extraction and Risk of Bias (RoB) assessment. Each reviewer had the opportunity to exclude a publication during the RoB assessment if a previously missed detail justified the removal of a study (four were removed at this stage).

Risk of bias assessment

Three reviewers (JL, BW, DV) conducted the assessment independently. A questionnaire was tailored to consider the strength of evidence from the articles (see appendix B for the questionnaire and marking guideline). Questions tailored to CI intervention studies were created and some questions addressing the research question, confounds, and participant recruitment based on the CASP (Critical Appraisal Skills Programme) checklists were adopted104: Each article was first assessed using nine domains (Study Design, Participant Selection, Presence of Control Group, Learning Effect, Adaptation/Acclimatisation Period, Outcome Measure, Statistical Biases, Conflict of Interest, and Reporting) and a global score was assigned based on the responses to each aspect. The marking rationale was adapted from ROBINS-I (for comparing the intervention study with a target trial (ROBINS-I by Cochrane105). The target trial was an imaginary trial that answers the same research question in the best way possible. If the assessed article answers all questions in a domain comparably to the target trial, it is marked as a low RoB for that domain. Possible scores were low, moderate, and high. Whenever the three reviewers’ responses contained one each of the extremes (high and low), this counted as a conflict and the findings were discussed to reach consensus. For all other cases the most frequently occurring score was used. The global score was determined by the following rules, applied one after the other. Rule 1, if 2 or more domains were marked ‘High’, the global score was marked ‘High’. Rule 2, if the first rule didn’t apply, the global score was marked ‘Low’ when six or more domains were marked ‘Low’. Rule 3, if the two previous rules did not apply, the global score was marked ‘Moderate’.

Data extraction

The corresponding author extracted the following data items: study design, sample size (in number of ears and participants), intervention approach, adaptation period (to a new map), details of the comparator, speech in noise performance, and speech in quiet performance. For speech in noise outcomes, articles commonly reported dB SNR (signal-to-noise-ratio) needed for 50% correct, or the percentage correct at one or more SNRs tested with sentences and/or words. If results were reported for both sentence and word tests, the score of the sentence test was used for analysis. In the case where unilateral and bilateral results were both reported, the unilateral result was extracted. The mean and standard deviation of improvements were extracted. If the article did not report them explicitly, the corresponding author inferred the standard deviation (SD) using the reported statistics when applicable (E.g., using the number of participants, mean difference, and T-statistics to infer the SD). To obtain individual data, a web-based application (WebPlotDigitizer v4) was used to extract data points from the published figures106. The corresponding authors were contacted for raw data if the key statistics could not be computed with the extracted data alone.

Study designs

CI mapping studies are a flexible intervention approach in medical research. Unlike many pharmaceutical or surgical interventions, the change of map is reversible in most cases. The clinical map was typically the control condition, and the experimental map was the intervention. The designs that compared the performance of maps within the same participant were classified as ‘Crossover’ if more than one experimental intervention was tested or ‘Single Intervention’ if only one condition was trialled per participant. ‘RCT’ was added if the study design included a control group, and the group assignment was randomized. ‘Longitudinal Single Intervention’ study measures the outcomes at different times from the same group of participants. ‘Cross-sectional’ study is a type of observational study that measures the outcomes at one point in time on the group of participants meeting the inclusion criteria.

Data synthesis

Included articles differed largely in outcome measures, study design, intervention method and publication year. Therefore, a random-effects model was adopted for the meta-analysis to compute a standardised mean difference (Cohen’s d) and weighting the outcomes using both the within and between study variances107,108,109. In this view, we believe that the included articles were different (in ENI measures, interventions, and precision due to study design) but all fall under the definition of interventions informed by ENI measures. To conduct such synthesis, the mean difference (improvement, positive being better) and standard deviation of the differences (SDd) were needed. For self-controlled studies, the SDd was directly computed from the raw data when possible or inferred from the t statistics. For RCT studies, the SDd was computed as the pooled SD between the control and experimental groups.

Meta-regressions were conducted to further explore the interplays between the interventions and ENI measures. If a study was marked having a high risk of bias or the mean and SD were unavailable, the study was excluded from the meta-analysis. Computations were conducted using a custom script written in Python3 with SciPy version 1.10.1110 and NumPy version 1.26.4111. In addition, publication bias was analysed using a funnel plot.

Results

The search resulted in 7435 articles being identified from the six electronic databases. Duplicated (n = 3518) and ineligible records (n = 1713) were removed leaving 2204 articles for screening at title level. After the title screening, 289 articles remained and were checked at the abstract or full-text level to assess relevance. Thirty-three articles remained of which six were further excluded due to having a small sample size101,112, adopting a comparator that is too different from the modern clinical map113, or attempting several interventions and later probe the relationship of the ENI measure instead of being informed while mapping CI98,114,115. Three additional articles were identified from reference lists resulting in 30 included articles (Fig. 3) in total48,49,50,52,54,55,56,58,59,60,61,62,86,87,88,89,90,91,92,93,94,95,96,97,99,100,116,117,118,119.

Fig. 3
Fig. 3The alternative text for this image may have been generated using AI.
Full size image

Consort diagram for article identification. Gray boxes mark each stage of identification. The flow of articles is indicated by the arrows. The first column of white boxes demonstrates the articles remained in each stage. The second column of white boxes show the removed articles.

Profile of the included articles

The characteristics and key findings of the included articles are summarised in Table 1. If a study experimented on more than one intervention map (excluding the inverse controls), the outcome of each map was reported in separate rows. The outcome measures were typically speech tests and these included word, consonant, vowels, and sentences (in noise and quiet). Sentence tests typically adopted, BKB, CUNY, HINT, and IEEE sentences. Speech in quiet measures often consisted of consonant–vowel-consonant and vowel-consonant–vowel tests. The number of participants varied from 3 to 72 (Mean = 14, SD = 13). Most participants were post-lingually deaf adult CI users while two articles62,63,91 included pre-lingually deaf adults (N = 2 and 26 ears respectively) and one article included EAS users49. All but two studies adopted a within-subject paradigm to compare the performance between pre-intervention and post-intervention. Single blinding was rare, and two studies were randomised control trials. Different adaptation periods were seen ranging from several months to acute testing within one session. Most articles fell into two re-mapping areas, (1) channel deactivation to reduce channel interaction (16 articles), and (2) adjustments of frequency allocation (12 articles). Other interventions included increasing T-level119, using a new coding strategy117, change of stimulation mode62, and dichotic but complementary site selection118.

Table 1 Key findings of the included articles.

Risk of bias results

Table 2 presents the RoB outcomes, including the scores of each sub domains for each article. At global level, among the 30 included articles, 19 were marked moderate, 8 were marked low (good quality), and 3 were marked high. Adaptation to the new map was the sub-domain that was most frequently (10 out of 30) marked having a high risk of bias. This result highlighted the need to consider a randomised control trial or to ensure the compared map has similar exposure as the clinical map. Statistical inaccuracy and learning effect were also marked high more frequently (6 out of 30). Some articles were considered high risk due to not reporting the statistical result in-full. In addition, learning effect could introduce biases if the testing order of maps was not randomised. Figure 4 demonstrates the distributions of scores by sub-domain. Overall, most included articles were considered to have a moderate or low risk of bias, which justifies the inclusion to the meta-analysis.

Table 2 Risk of bias results by domain.
Fig. 4
Fig. 4The alternative text for this image may have been generated using AI.
Full size image

Risk of bias scores by sub-domain. For each sub-domain, the left (green), middle (yellow), and right (orange) bars represent the number of articles marked low, moderate, or high risk of bias, respectively.

Publication bias

The funnel plot demonstrated symmetry between the studies reporting a Cohen’s d higher or lower than average (Fig. 5). An Egger’s test was conducted to test if there’s a relationship between the standardised effect size and the inverse of the standard error of the Cohen’s d. The result revealed a slope that was not significantly different from zero (r = 0.014, p = 0.95) and hence supporting it was more likely than not that the publication bias was not present.

Fig. 5
Fig. 5The alternative text for this image may have been generated using AI.
Full size image

Relationship between the reported outcome and study precision. Y axis represents the precision of the study using the scale of standard error of Cohen’s d (smaller the standard error, higher the precision). X axis represents the reported effect size. Each black dot represents a study (more participants and smaller within-study variance give higher precision). The dashed line shows the combined Cohen’s d (0.56). This plot is often recognised as the funnel plot for assessing publication bias. If articles with a positive finding were favoured at publishing, there should be more studies on the right of the dashed line (combined Cohen’s d at 0.48). If a study is truly precise, the reported outcome should be closer to the ground truth, which should be around the combined effect size.

Meta-analysis

Do re-mapping interventions informed by ENI measures improve speech perception performance?

Speech in Noise Performance. A meta-analysis was conducted to assess the overall effect of re-mapping interventions informed by ENI measures on speech in noise performance (Fig. 6a). The combined effect size, from 21 studies with 305 ears in total, was 0.48 with a 95% confidence interval ranging from 0.1 to 0.87, indicating a moderate and statistically significant (p = 0.014) effect. Heterogeneity was significant (Cochran’s Q = 93.09, df = 20, p < 0.001) indicating differences in effect size across studies.

Fig. 6
Fig. 6The alternative text for this image may have been generated using AI.Fig. 6The alternative text for this image may have been generated using AI.
Full size image

(a) Forest Plot for Speech in Noise Outcomes. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d. (b) Forest Plot for Speech in Quiet Outcomes. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d.

Speech in Quiet Performance. A meta-analysis was conducted to assess the overall effect of re-mapping interventions informed by ENI measures on speech in quiet performance (Fig. 6b). The combined effect size, from 14 studies with 227 ears in total, was 0.43 with a 95% confidence interval ranging from 0.14 to 0.72, indicating a moderate and significant (p < 0.01) effect. Heterogeneity was again significant (Cochran’s Q = 24.91, df = 13, p = 0.023) indicating differences in effect size across studies.

Does speech in noise performance improvement come at the cost of speech in quiet performance?

Clinicians would be interested in understanding if re-mapping results in an improvement in speech in noise perception; if so, are there any negative consequences, such as deterioration in speech in quiet performance. For instance, switching off electrodes with poor ENI viability could reduce channel interactions, but result in fewer active channels. A linear regression was conducted to assess the relationship between the speech in noise and quiet performances (Fig. 7). Among the 12 articles that reported performance for both speech testing conditions52,54,56,58,60,62,86,89,90,92,93,96, while one outlier was removed due to only three out of the seven participants were able to complete speech tests under both conditions, a positive and statistically significant correlation was seen between the average speech in noise and quiet performance (r = 0.89, p < 0.001) indicating that the performances tend to improve concurrently.

Fig. 7
Fig. 7The alternative text for this image may have been generated using AI.
Full size image

Relationship between speech in noise and speech in quiet performances. The axes represent the improvement provided by the intervention in effect size (positive is better); the dashed and solid lines are the line of best fit, with or without the outlier. Each black dot represents an article. The Henshall and Mckay90 paper was a leader in the field and three of the seven participants reported both speech in noise and quiet results while other studies were conducted ranging from 2013 to 2023 with a higher number of participants.

Performance of the site selection interventions

Speech in Noise Performance. A meta-analysis was conducted to assess the overall effect of site selection (deactivation) interventions informed by ENI measures on speech in noise performance (Fig. 8a). The combined effect size, from 14 studies with 217 ears in total, was 0.59 with a 95% confidence interval ranging from 0.18 to 1, indicating a moderate and statistically significant (p = 0.005) effect. Heterogeneity was significant (Cochran’s Q = 45.62, df = 13, p < 0.001) indicating differences in effect size across studies.

Fig. 8
Fig. 8The alternative text for this image may have been generated using AI.Fig. 8The alternative text for this image may have been generated using AI.
Full size image

(a) Forest Plot for Site Selection Interventions, Speech in Noise. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d. (b) Forest Plot for Site Selection Interventions, Speech in Quiet. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d.

Speech in Quiet Performance. A meta-analysis was conducted to assess the overall effect of site selection (deactivation) interventions informed by ENI measures on speech in quiet performance (Fig. 8b). The combined effect size, from 10 studies with 165 ears in total, was 0.29 with a 95% confidence interval ranging from 0.02 to 0.55, indicating a small and significant (p = 0.036) effect. Heterogeneity was not significant (Cochran’s Q = 10.99, df = 9, p = 0.28).

Differences between ENI Measures. To explore whether site selection performs better while being informed by a specific ENI measure, a weighted least square regression was conducted using the effect size as the dependent variable, the ENI measure as the independent variable (categorical) and the inverse of variance (standard error of Cohen’s d squared) as the weight (Table 3). Among CT Image, ECAP IPG Effect, Polarity Effect, Electrode Discrimination, Low-rate Threshold, Modulation Detection Threshold, Partial Tripolar Threshold, and Multi-Dimensional Pitch Scaling, three ENI measures demonstrated significantly better speech in noise performance. Site selection intervention informed by the low-rate threshold had an effect size that was 1.66 above average (t = 3.34, p = 0.016, 97.5% Confidence Interval: 0.44–2.87). Similarly, modulation detection threshold had an effect size that was 1.37 higher than average (t = 2.89, p = 0.028, 97.5% Confidence Interval: 0.21–2.54). Meanwhile, electrode discrimination yielded an effect size that was 0.97 better (t = 3.11, p = 0.021, 97.5% Confidence Interval: 0.21–1.73).

Table 3 Weighted least square regression assessing the more suitable eni measures for site selection intervention.

Performance of the FTPM reduction interventions

Speech in Noise Performance. A meta-analysis was conducted to assess the overall effect of reducing FTPM informed by ENI measures on speech in noise performance (Fig. 9a). The combined effect size, from 6 studies with 77 ears in total, was 0.47 with a 95% confidence interval ranging from − 0.46 to 1.4, indicating a moderate effect size that was not statistically significant (p = 0.32) effect. Heterogeneity was significant (Cochran’s Q = 35.33, df = 5, p < 0.001) indicating large differences in effect size across studies.

Fig. 9
Fig. 9The alternative text for this image may have been generated using AI.
Full size image

(a) Forest Plot for Interventions Adjusted the Frequency Allocation, Speech in Noise. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d. (b) Forest Plot for Interventions Adjusted the Frequency Allocation, Speech in Quiet. Black error bars mark the 95% confidence interval. Blue dots mark the individual Cohen’s d. The pink diamond marks the combined Cohen’s d.

Speech in Quiet Performance. A meta-analysis was conducted to assess the overall effect of reducing FTPM informed by ENI measures on speech in quiet performance (Fig. 9b). The combined effect size, from 4 studies with 62 ears in total, was 0.65 with a 95% confidence interval ranging from − 0.01 to 1.31. Similarly, the combined effect size was large yet with a wide confidence interval indicating large but statistically near significant (p = 0.055) effect.

Discussion

Re-mapping interventions informed by the ENI viability, despite being generally effective, resulted in a wide variety of outcomes. There is no clear and convincing pattern to indicate which interventions are most successful. Here we discuss the explanations of the reported findings and the potential interplay between factors.

ENI measures suitable for site selection interventions

Electrode deactivation informed by measures intended to estimate neural health such as polarity effects & ECAP IPG measures are less successful (Table 3) than methods intended to estimate spread of excitation such as low-rate threshold, modulation detection threshold, and electrode discrimination. A possible explanation for this difference is that deactivating an electrode with a good neural survival and large spread of excitation can reduce channel interactions at the cost of not stimulating the healthy SGCs at low current level. By contrast, if the situation reverses, deactivating a site with poor neural health and small spread of excitation, there is no room of improvement in terms of channel interactions. For sites with poor neural health, increasing the T-level could recruit the neighbouring SGCs, which could result in a more discriminable stimulation pattern. Zhou and Pfingst119 increased the T-level at sites with a poor MDT and indeed reported improved performance.

Imaging tools were proposed to inform the site selection. However, the effect size did not differ significantly from the average of all site selection interventions (Table 3). On the other hand, the best three ENI measures for site selection interventions, low-rate threshold, modulation detection threshold, and electrode discrimination, were all behavioural tasks that focus on an important aspect of CI hearing. Electrode discrimination tests the channel independence. Modulation detection forms the backbone of CI hearing since the envelope information is coded as the modulation of a pulse train.

Some challenges arise when deactivating channels. If deactivation results in an increase in pulse rate this could influence performance and should be checked, usually pulse width can be adjusted to avoid a pulse rate increase while holding the current level and loudness similar. Following deactivation, default behaviour of the clinical software often splits the frequency band to adjacent electrodes. This setting was used across most included deactivation studies. Other possible approaches are dropping the band (leaving a spectral hole), assigning to the basal electrode, or assigning to the apical electrode. It was reported that the four approaches did not differ significantly, but the location and number of holes are important to speech understanding120. Simulations also suggested that the warped frequency representation can be adapted to after trainings121. Further work is required to probe the mechanisms behind the observed interactions between ENI measures and the most suitable mapping rationale.

Effect of reducing FTPM on speech perception

The effect of reducing FTPM was less robust when tested acutely and demonstrated a wide range of performance (Fig. 9a). The main source of positive effect size originated from a randomised control trial52 with tonal language speakers. This contrast between studies might indicate that the benefit of reduced FTPM is only perceivable after a long adaptation period and/or such benefit is more salient for the tonal language speakers. In addition, the insertion depth and electrode design could also play vital roles. One study, with Advanced Bionics CI users, was not included in the meta-analysis despite the low risk of bias due to lacking ways of computing a standardised mean difference from the published information55. The study reported significantly worse speech in noise performance while re-mapping to reduce FTPM. Further research is needed to clarify the interplays between the reduced FTPM and the improvements in speech perception. Another challenge presenting with this approach is that, since the electrode array doesn’t always cover the whole cochlea, reducing FTPM often involves truncating certain degree of the lower frequencies.

Statistical power

In this review, many interventions significantly improved performance for a subset of individual’s speech in noise perception without being effective at group level. This could be partly explained by studies being underpowered. For a paired t-test, to reach a statistical power of 0.8 at 0.05 significance level, two-tail, and assuming an effect size of 0.5, 33 participants are needed (computed in Python version 3.10.12 using Statsmodels version 0.14.0122;). Very few articles had a sample size larger than 33. In addition, the mean sample size for the articles reviewed here is 15 and hence a large effect size of 0.7 or more is required to reach significance. As a result, among the included articles, many false negatives could be expected due to the experiments being underpowered.

Confounding factors in CI studies

Different coding strategies could interact with the interventions differently. For instance, electrode deactivation informed by electrode discrimination showed improvement at group level58. However, the effect disappeared when the users were all ACE users59. As ACE is only stimulating N electrodes within a cycle, deactivating electrodes could have a smaller influence, than when all channels are active. Individual differences such as cognitive function, listening experience, hearing loss etiology, and other demographics could also confound with the performance. Changes to CI mapping may improve one aspect of sound delivery but could be detrimental to other aspects (e.g., detection of sound could be improved but the frequency allocations could distort speech cues). Adjustment of frequency allocation could introduce different degrees of frequency compression or expansion at different sites along the array which could distort speech cues. The adaptation period after introducing a new map could also affect the outcome. Many experiments were conducted under an acute setting, which might not show the full potential of an intervention.

Using randomised controlled trials with good sample sizes could help disentangle the confounding factors. Alternatively, cross-over designs while carefully controlling the order of experimental maps could serve as a viable option if RCT is not feasible. Overall, the same intervention could have very different outcomes across individuals and hence it is important to consider individual factors while generalising the experimental findings to the clinical practice.

Limitations of the review

Initial screening of the articles was conducted by a single reviewer following very clear criteria. The reviewer took caution to avoid false exclusion by making use of abstract and full-text level. The screened list of articles was crossed checked against reference lists to ensure that key articles were not omitted, and the two additional reviewers also checked the reference lists to ensure that relevant articles were captured. Many of the specific interventions were from one lab group so replication of findings would be beneficial. We found unexplained missing data or incomplete statistical reports in approximately one quarter of the publications.

Conclusion

Re-mapping interventions informed by ENI measures yielded a moderate and significant effect size of 0.48 for improvements in speech in noise perception. Reducing FTPM by matching the frequency allocation to the estimated characteristic frequencies informed by imaging tools showed a similar effect size of 0.47 but was not statistically significant due to the large variability across studies reflected by a wide confidence interval covering zero effect size. Site selection interventions yielded a moderate and significant (p = 0.005) effect size of 0.59 on speech in noise performance. Site selection interventions were more successful when informed by the low-rate threshold, modulation detection threshold, and electrode discrimination, yielding large effect sizes around 1–1.5. Replications of studies are much needed since the same ENI measure is often reported by the same research group. Further work is required to customise the approaches to meet the needs of individual CI recipients. Overall, the viability of ENI can serve as a great information source for guiding mapping interventions.