Speech Processing in the Auditory Pathway
Project summary
One of the most critical steps in encoding sound for neuronal processing occurs when the analog pressure wave is coded into discrete nerve action potentials. Recent pool models of the inner hair cell synapse do not reproduce the dead time that follows an intense stimulus, so we used visual inspection and automatic speech recognition (ASR) to investigate a model with enhanced offset adaptation. We found that offset adaptation improved phase locking in the auditory nerve and raised ASR accuracy for features derived from auditory nerve fibers (ANFs). We also found that offset adaptation is crucial for auditory processing by onset neurons (ONs) in the next neuronal stage, the auditory brainstem. A second important finding was that multi-layer perceptrons (MLPs) performed much better than standard Gaussian mixture models (GMMs) for both our ANF-based and ON-based auditory features. Because MLPs are also very easy to use in a multi-stream approach, they will facilitate combining features derived from different groups of neurons, which we hope to exploit in the next steps.
The electrophysiological experiments in this project provided first answers to the question of how our auditory system copes with the fact that, under natural conditions, the statistical properties of the sound entering the ear change dramatically over time (for example, due to changes in intensity). We showed that a model with an intensity-dependent spectro-temporal receptive field (STRF) could predict responses to stimuli with varying intensity. Despite the complexity of auditory feature selectivity in the inferior colliculus (IC), our results provided encouraging evidence that modeling nonlinear responses to complex stimuli is a tractable problem.
Offset adaptation enhances speech recognition scores considerably