Fig. 1: Feature spaces used for voxelwise modeling.
From: Phonemic segmentation of narrative speech in human cerebral cortex

To identify the cortical representation of features important for speech comprehension, the sound waveforms of the stimuli were first transformed into six different feature spaces. Acoustic features were extracted by creating separate spectrum power and phoneme count feature spaces; these two feature spaces capture brain activity that can be explained by the mere presence or absence of speech sounds. Phoneme-related features were extracted by creating separate feature spaces for single phonemes, diphones and triphones. Semantic features were extracted by creating a feature space based on a 985-dimensional word embedding space (ref. 4).

Cortical Processing: Six distinct feature spaces were used to investigate Acoustic, Phonemic and Semantic processing in the cortical speech network.

Feature Representations: Illustration of the time course of a single feature from each feature space: the spectral power of the sound signal in the frequency band centered at 2801 Hz, the phoneme /ai/, the diphone /m.ai/, the triphone /w.ah.s/, and the semantic co-occurrence with "I". These signals are low-pass filtered to generate continuous values that are then discretized at the TR (bold colored lines).

Feature Vectors: Illustration of the matrix for each feature space, shown for 5 TRs and a subset of the features in each space.

Models: Venn diagrams illustrating the features used in the nested voxelwise models (VMs). The Acoustic Baseline VM used the spectrum power and phoneme count features. After its predictions were subtracted, nested models using phonemic and semantic features (green and pink circles at bottom) were fitted to localize phonemic regions and phonemic-to-semantic cortical boundaries; nested models using single phonemes, diphones and triphones were fitted to assess the granularity of phonemic segmentation within those regions.
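The spectrum power feature space described above can be sketched as band-limited power extracted from the stimulus waveform over time. The following is a minimal illustration using SciPy; the number of bands, log-spaced band layout, and window length are assumptions for demonstration, not the paper's exact filter bank (the 2801 Hz band in the figure would be one column of this matrix).

```python
import numpy as np
from scipy.signal import spectrogram

def spectrum_power(waveform, sr, n_bands=32, fmin=100.0, fmax=8000.0):
    """Power of the sound signal in log-spaced frequency bands over time.

    Returns an (n_frames x n_bands) band-power matrix and the band center
    frequencies. Band layout and window length are illustrative assumptions.
    """
    freqs, times, Sxx = spectrogram(waveform, fs=sr, nperseg=1024, noverlap=512)
    edges = np.geomspace(fmin, fmax, n_bands + 1)   # log-spaced band edges
    centers = np.sqrt(edges[:-1] * edges[1:])       # geometric band centers
    power = np.zeros((times.size, n_bands))
    for b in range(n_bands):
        in_band = (freqs >= edges[b]) & (freqs < edges[b + 1])
        if in_band.any():
            power[:, b] = Sxx[in_band].mean(axis=0)  # mean power in the band
    return power, centers
```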
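The single-phoneme, diphone and triphone feature spaces are count features built from a time-aligned phonemic transcription (e.g. the diphone /m.ai/ shown in the figure). Below is a minimal sketch, assuming the transcript is given as (onset, phoneme) pairs; the assignment rule used here (counting an n-gram in the TR containing its first phoneme's onset) is an illustrative assumption.

```python
import numpy as np

def ngram_counts(phoneme_onsets, tr, n_trs, order=2):
    """Count phoneme n-grams per TR (order 1 = phonemes, 2 = diphones,
    3 = triphones). `phoneme_onsets` is a list of (onset_s, label) pairs
    sorted by onset; returns an (n_trs x vocabulary) count matrix."""
    onsets = [t for t, _ in phoneme_onsets]
    labels = [p for _, p in phoneme_onsets]
    grams = ['.'.join(labels[i:i + order])          # e.g. 'm.ai'
             for i in range(len(labels) - order + 1)]
    vocab = {g: j for j, g in enumerate(sorted(set(grams)))}
    X = np.zeros((n_trs, len(vocab)))
    for i, g in enumerate(grams):
        row = int(onsets[i] // tr)                  # TR of the first onset
        if row < n_trs:
            X[row, vocab[g]] += 1
    return X, vocab

# Example: diphone counts with a 2 s TR
transcript = [(0.10, 'w'), (0.22, 'ah'), (0.35, 's'), (0.80, 'm'), (0.95, 'ai')]
X, vocab = ngram_counts(transcript, tr=2.0, n_trs=3, order=2)
```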
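The bold colored lines in the Feature Representations panel correspond to low-pass filtering a densely sampled feature time course and then sampling it once per TR. A sketch of that step follows, assuming a Butterworth filter with a cutoff at the Nyquist frequency of the TR sampling rate; the caption does not specify the paper's actual filter.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def discretize_at_tr(feature, fs, tr, cutoff_hz=None):
    """Low-pass filter a feature sampled at `fs` Hz, then sample every TR.

    The cutoff defaults to 1 / (2 * TR), the Nyquist frequency of the TR
    sampling rate, so the downsampled signal is free of aliasing.
    """
    cutoff_hz = cutoff_hz or 1.0 / (2.0 * tr)
    b, a = butter(4, cutoff_hz / (fs / 2.0))   # 4th-order Butterworth LPF
    smoothed = filtfilt(b, a, feature)         # zero-phase filtering
    step = int(round(tr * fs))
    return smoothed[::step]                    # one value per TR

# Example: a feature sampled at 100 Hz, discretized at TR = 2 s
feature = np.random.randn(100 * 60)                     # 60 s of samples
per_tr = discretize_at_tr(feature, fs=100.0, tr=2.0)    # 30 values
```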
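The Models panel describes a nested fitting scheme: fit the Acoustic Baseline VM, subtract its predictions, then fit phonemic or semantic models on what remains. Here is a minimal sketch assuming ridge regression (a common estimator for voxelwise encoding models); the paper's regularization and cross-validation details are not reproduced, and all matrix sizes below are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_nested_vm(X_baseline, X_nested, Y, alpha=1.0):
    """Fit a nested voxelwise model on the residuals of the acoustic baseline.

    X_baseline: TRs x acoustic features (spectrum power + phoneme count).
    X_nested:   TRs x phonemic or semantic features.
    Y:          TRs x voxels BOLD responses.
    """
    baseline = Ridge(alpha=alpha).fit(X_baseline, Y)
    residual = Y - baseline.predict(X_baseline)   # variance beyond acoustics
    nested = Ridge(alpha=alpha).fit(X_nested, residual)
    return baseline, nested

# Illustrative shapes only: 300 TRs, 50 voxels
rng = np.random.default_rng(0)
Y = rng.standard_normal((300, 50))
X_ac = rng.standard_normal((300, 33))   # e.g. 32 spectrum bands + phoneme count
X_ph = rng.standard_normal((300, 39))   # e.g. single-phoneme counts
baseline, phonemic = fit_nested_vm(X_ac, X_ph, Y)
```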