Extended Data Fig. 1: Microelectrode array placement and brain-to-voice synthesis latencies. | Nature

From: An instantaneous voice-synthesis neuroprosthesis

a. The estimated resting-state language network from Human Connectome Project data, overlaid on T15’s brain anatomy. b. Intraoperative photograph showing the four microelectrode arrays placed on T15’s precentral gyrus. Images in a and b are adapted from ref. 1 (Copyright © 2024 Massachusetts Medical Society, reprinted with permission from Massachusetts Medical Society). c. Cumulative closed-loop latencies across the stages of the voice synthesis and audio playback pipeline. Voice samples were synthesized from raw neural activity measurements within 10 ms, and the resulting audio was played aloud continuously to provide closed-loop feedback. Note that the linear horizontal axis is split to expand the visual dynamic range. We focused our engineering primarily on reducing the brain-to-voice inference latency, which fundamentally bounds the speech synthesis latency. As a result, the largest remaining contribution to the latency occurred after voice synthesis decoding, during the (comparatively more mundane) step of audio playback through a sound driver. The cumulative latencies with the audio driver settings used for T15 closed-loop synthesis in earlier experiments are shown in dark grey. Audio playback latencies were substantially lowered through software optimizations in later sessions (light grey), and we predict that further reductions will be possible with additional computer engineering.
