Fig. 1: Speech-recognition approaches. | Nature

From: Analogue speech recognition based on physical computing

a, When two frequencies f1 and f2 (f2 > f1) enter the human ear, distortion products, such as 2f1 − f2, are generated owing to nonlinear active feedback in the cochlea (ref. 52). Hair cells connected to the auditory nerve endings (1 to n) convert the incoming time-domain acoustic signal into frequency-domain, spike-encoded electrical information (features) to be further processed (classified) by the brain. b, Time-frequency digital processing. After analogue-to-digital conversion (ADC), frequency decomposition by a feature-extracting model F(f(t)), such as Lyon’s artificial cochlea model (ref. 53), is required before classification. c, Time-domain digital processing. Top, in addition to classification, a neural network performs feature extraction by learning (band-pass) filters in the time domain. Bottom, an analogue filterbank extracts frequency features directly from the time-domain analogue signal before classification. d, Reservoir computing. After feature extraction in the time (top) or frequency (bottom) domain, the preprocessed data are used as inputs (in1, …, inN) to a reservoir that increases the dimensionality of the data (represented by φ), simplifying classification. e, Time-domain analogue processing (this work). Reconfigurable nonlinear-processing units (RNPUs) extract temporal features, simplifying classification without the need for extra preprocessing. An analogue in-memory computing (AIMC) chip based on a memristive crossbar array performs the classification.
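The distortion product named in panel a can be reproduced with any memoryless odd-order nonlinearity: cubing a two-tone signal generates intermodulation terms, including one at 2f1 − f2 that is absent from the input. The sketch below is illustrative only; the tone frequencies, sampling rate, and cubic coefficient are arbitrary choices, not values from the paper, and a simple polynomial stands in for the cochlea’s active feedback.

```python
import numpy as np

# Illustrative sketch: a weak cubic nonlinearity applied to a two-tone input
# creates intermodulation (distortion) products, including 2*f1 - f2,
# analogous to the cochlear distortion products described in panel a.
# All numeric values here are assumptions chosen for a clean demonstration.
fs = 8000                        # sampling rate (Hz); 1 s of signal -> 1 Hz FFT bins
t = np.arange(0, 1.0, 1.0 / fs)
f1, f2 = 1000.0, 1200.0          # two input tones, f2 > f1

x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
y = x + 0.3 * x**3               # memoryless cubic distortion

spectrum = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1.0 / fs)

def power_at(f):
    """Spectral magnitude at the FFT bin nearest frequency f (Hz)."""
    return spectrum[np.argmin(np.abs(freqs - f))]

# Energy appears at 2*f1 - f2 = 800 Hz even though the input contains
# only 1000 Hz and 1200 Hz; a nearby non-product bin (850 Hz) stays empty.
print(power_at(2 * f1 - f2) > 10 * power_at(850.0))
```

Because the tones complete an integer number of cycles in the 1 s window, the FFT bins align exactly with the tone and distortion frequencies, so the 800 Hz component stands out cleanly against neighbouring bins.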
