Fig. 2

Typical data quality of 2D real-time speech imaging, shown in mid-sagittal image frames from three example participants: (a) sub35 (male, 21 yrs, native American English speaker), (b) sub51 (male, 33 yrs, non-native speaker), (c) sub58 (female, 28 yrs, non-native speaker). Each frame captures the articulation of the fricative consonant [θ] in the word “uthu” (stimulus “vcv2”), at the moment the tongue tip contacts the upper teeth. Frames (a) and (b) are considered very high quality, based on high SNR and no noticeable artifacts. Frame (c) is considered moderate quality, based on good SNR and mild image artifacts; the white arrows point to blurring artifacts due to off-resonance, while the yellow arrows point to ringing artifacts due to aliasing.