Table 5 Zero-shot transfer test on naturalistic speech mixtures spatialized with back-microphone HRTFs. Each entry gives the score for the metric, followed in parentheses by its difference from the score for the mixtures spatialized with the T-microphone HRTFs (that is, the scores reported in Table 1; here, difference = T-microphone score − back-microphone score).

From: Leveraging spatial cues from cochlear implant microphones to efficiently enhance speech separation in naturalistic listening scenes

| Input | Spatial cues | SI-SDRi | STOI | STOIi | PESQ | PESQi |
| --- | --- | --- | --- | --- | --- | --- |
| One-channel | Latent | 7.15 (+0.09) | 0.77 (-0.01) | 0.14 (+0.01) | 2.29 (+0.06) | 0.46 (+0.00) |
| Two-channel, bilateral | Latent | 7.86 (+0.07) | 0.79 (+0.00) | 0.16 (+0.00) | 2.36 (+0.00) | 0.53 (+0.00) |
| Two-channel, bilateral, IPD | Latent & pre-computed | 9.19 (+0.00) | 0.82 (+0.00) | 0.19 (+0.00) | 2.59 (+0.00) | 0.75 (+0.00) |
| Two-channel, bilateral, ILD | Latent & pre-computed | 8.40 (-0.40) | 0.80 (-0.01) | 0.17 (+0.00) | 2.46 (-0.06) | 0.62 (-0.06) |
| Two-channel, bilateral, IPD, ILD | Latent & pre-computed | 8.63 (+0.11) | 0.81 (+0.00) | 0.18 (+0.01) | 2.52 (+0.03) | 0.69 (+0.02) |
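
The caption's definition (difference = T-microphone score minus back-microphone score) implies that the corresponding T-microphone score can be approximately recovered from any entry by adding the parenthesized difference to the reported score. The following is a minimal sketch of that arithmetic; the function name `t_microphone_score` and the parsing pattern are illustrative assumptions, not part of the paper, and because the published values are rounded to two decimals, the result may differ from Table 1 in the last digit.

```python
import re

# Matches an entry such as "7.15 (+0.09)" or "8.40 (-0.40)".
# Assumption: entries use an ASCII hyphen for negative differences.
_ENTRY = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*\(([+-]\d+(?:\.\d+)?)\)\s*")


def t_microphone_score(entry: str) -> float:
    """Recover the approximate T-microphone score from a table entry.

    Uses the caption's definition:
        difference = T-microphone score - back-microphone score
    so  T-microphone score = back-microphone score + difference.
    """
    match = _ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"unrecognized table entry: {entry!r}")
    back_score = float(match.group(1))   # score with back-microphone HRTFs
    difference = float(match.group(2))   # signed value in parentheses
    return round(back_score + difference, 2)


if __name__ == "__main__":
    # SI-SDRi entry for the one-channel model.
    print(t_microphone_score("7.15 (+0.09)"))
```

Under this definition, the one-channel SI-SDRi entry `7.15 (+0.09)` corresponds to a T-microphone score of about 7.24, and the ILD entry `8.40 (-0.40)` to about 8.00.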