Fig. 1 | Scientific Data

Fig. 1

From: The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension

Schematic depiction of the naturalistic story-listening paradigm and data provenance. (a) At bottom, the full auditory story stimulus “Pie Man” by Jim O’Grady is plotted as a waveform of varying amplitude (y-axis) over the duration of 450 seconds (x-axis), corresponding to 300 fMRI volumes sampled at a TR of 1.5 seconds. An example clip (marked by vertical orange lines) is expanded and accompanied by the time-stamped word annotation (“I began my illustrious career in journalism in the Bronx, where I worked as a hard-boiled reporter for the Ram…”). The story stimulus can be described according to a variety of models; for example, acoustic, semantic, or narrative features can be extracted from or assigned to the stimulus. In a prediction or model-comparison framework, these models serve as formal hypotheses linking the stimulus to brain responses. At top, preprocessed fMRI response time-series from three example voxels for an example subject are plotted for the full duration of the story stimulus (x-axis: scan duration in TRs; y-axis: fMRI signal magnitude; red: early auditory cortex; orange: auditory association cortex; purple: temporoparietal junction). See the plot_stim.py script in the code/ directory for details. (b) At bottom, MRI data, metadata, and stimuli are formatted according to the BIDS standard and publicly available via OpenNeuro. All derivative data are version-controlled and publicly available via DataLad. The schematic preprocessing workflow includes the following steps: realignment, susceptibility distortion correction, and spatial normalization with fMRIPrep; parallel unsmoothed and spatially smoothed workflows; confound regression to mitigate artifacts from head motion and physiological noise; and intersubject correlation (ISC) analyses used for quality control in this manuscript.
Each stage of the processing workflow is publicly available and indexed by a commit hash (left) providing a full, interactive history of the data collection. This schematic is intended to provide a high-level summary and does not capture the full provenance in detail; for example, derivatives from MRIQC are also included in the public release alongside other derivatives (but are not depicted here).
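
The ISC quality-control step named in panel (b) can be illustrated with a minimal sketch. The caption does not say how ISC is computed, so the leave-one-out variant below is an assumption: each subject's time series is correlated with the average of all other subjects. The function name leave_one_out_isc and the simulated data are hypothetical and are not taken from the dataset's code/ directory; only the 300-volume length (450 seconds at a TR of 1.5 seconds) comes from the caption.

```python
import numpy as np


def leave_one_out_isc(data):
    """Leave-one-out intersubject correlation for a single voxel.

    data: array of shape (n_subjects, n_trs).
    Returns one Pearson correlation per subject, computed between that
    subject's time series and the mean time series of all other subjects.
    """
    data = np.asarray(data, dtype=float)
    n_subjects = data.shape[0]
    if n_subjects < 2:
        raise ValueError("ISC needs at least two subjects")
    iscs = np.empty(n_subjects)
    for s in range(n_subjects):
        # Average every subject except the one left out
        others = np.delete(data, s, axis=0).mean(axis=0)
        iscs[s] = np.corrcoef(data[s], others)[0, 1]
    return iscs


# Simulated example (not real data): 5 subjects share one signal plus
# independent noise, over 300 volumes (450 s / 1.5 s TR, as in the caption).
N_TRS = 300
rng = np.random.default_rng(0)
shared = rng.standard_normal(N_TRS)
simulated = shared + rng.standard_normal((5, N_TRS))
iscs = leave_one_out_isc(simulated)
print(iscs.round(2))
```

With real data, the same function would be applied per voxel or per region to time series that have already passed through the preprocessing and confound regression steps; low ISC in regions expected to respond to the story (for example, auditory cortex) can flag problematic subjects or runs.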
