Extended Data Fig. 6: Encoding of paralinguistic features in neural activity. | Nature

From: An instantaneous voice-synthesis neuroprosthesis

a. Neural modulation during question intonation. Trial-averaged normalized spike-band power (each row in a group is one electrode) during trials in which the participant modulated his intonation to say the cued sentence as a question. Trials with the same cue sentence (n = 16) were aligned using dynamic time warping, and the mean activity across trials spoken as statements was subtracted to better show the increased neural activity around the intonation modulation at the end of the sentence. The onset of the word that was pitch-modulated in closed loop is indicated by the arrowhead at the bottom of each example. b. Paralinguistic feature encoding recorded from individual arrays. Trial-averaged spike-band power (mean ± s.e.m.), averaged across all electrodes within each array, for words spoken as statements and as questions. At every time point, the spike-band power for statement words and question words was compared using the Wilcoxon rank-sum test. The blue line at the bottom indicates the time points at which the spike-band power for statement words and question words differed significantly (P < 0.001, n1 = 970 words, n2 = 184 words). c. Trial-averaged spike-band power across each array for non-emphasized and emphasized words. The spike-band power differed significantly between non-emphasized and emphasized words at the time points shown in blue (P < 0.001, n1 = 1269 words, n2 = 333 words). d. Trial-averaged spike-band power across each array for words without pitch modulation and words with pitch modulation (from the three-pitch melodies singing task). Words with low and high pitch targets are grouped together as the ‘pitch modulation’ category (medium pitch target words, for which the participant used his normal pitch, were excluded). The spike-band power differed significantly between words without and with pitch modulation at the time points shown in blue (P < 0.001, n1 = 486 words, n2 = 916 words). e. Confusion matrix showing offline accuracies for decoding the question intonation and word emphasis paralinguistic features together using a single combined 3-class classifier.
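The per-time-point comparison described for panels b to d can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names (`average_ranks`, `rank_sum_p`, `significant_time_points`), the data layout (one list of per-time-bin values per word) and the synthetic data are assumptions made for illustration. The test uses the normal approximation to the Wilcoxon rank-sum statistic without a tie correction; whether the original analysis used this variant is not stated in the legend.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def significant_time_points(group_a, group_b, alpha=0.001):
    """Test each time bin separately; return one boolean per bin.

    group_a, group_b: lists of words, each word a list of per-bin power values
    (hypothetical layout, assumed for this sketch).
    """
    n_bins = len(group_a[0])
    return [
        rank_sum_p([w[t] for w in group_a], [w[t] for w in group_b]) < alpha
        for t in range(n_bins)
    ]


if __name__ == "__main__":
    # Synthetic data only: "question" words get extra power in the last 4 bins.
    rng = random.Random(0)
    statements = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
    questions = [
        [rng.gauss(2 if t >= 6 else 0, 1) for t in range(10)] for _ in range(60)
    ]
    print(significant_time_points(statements, questions))
```

The boolean list returned by `significant_time_points` plays the role of the blue line in the panels: it marks the bins where the two word groups differ at the chosen threshold. Note that this sketch applies no correction for testing many time points; the legend does not say whether one was used.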
