Acoustic Model

Also known as: AM

An acoustic model is the component of an automatic speech recognition (ASR) system that maps short segments of audio (typically 10–25 ms frames of spectral features) to the linguistic units that produced them — most commonly phonemes or sub-phonetic states. Classical acoustic models use Hidden Markov Models with Gaussian mixture output densities; modern systems use deep neural networks, recurrent or transformer architectures, and end-to-end models that fold acoustic and language modelling together. For accessibility, the acoustic model is the part of an ASR system most sensitive to atypical speech — dysarthria, deaf speech, accented speech, child speech, and so on — and is the usual target of speaker adaptation, fine-tuning, and personalised-voice efforts intended to make mainstream voice assistants and dictation tools usable by disabled speakers.

Category: Speech Technology · Machine Learning · Speech Recognition · Assistive Technology

Related: Automatic speech recognition · Hidden Markov Model · Phoneme · Speech Recognition · Dysarthria

Sources

https://en.wikipedia.org/wiki/Acoustic_model