Downes.ca ~ Hello World in Speech Recognition

Hello World in Speech Recognition

Apoorv Nandan, Medium, Aug 30, 2019
Commentary by Stephen Downes

This article is a bit complex (since it shows you how to write your own speech recognition system) but at the same time it's simple enough to allow us to grasp the essentials of one of the most intractable problems in learning: language. Let's take the simplest case possible: did a person say 'cat'? We solve this in two stages: first we convert the audio into a 'spectrogram', a feature matrix of numbers recording how much energy the sound has at each frequency over time; then we feed this as input into a neural network that calculates the probability that we did indeed hear the word 'cat'. That's the gist of it. Everything else - how we prepare the audio, how we train the neural network, how we interpret the results - is detail. Messy, messy detail.
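
To make the two stages concrete, here is a minimal sketch in Python using only NumPy. It is not the code from the article: the function names (`spectrogram`, `cat_probability`), the frame sizes, the synthetic test tone and the tiny untrained network are all assumptions made for illustration. With random weights the probability it prints means nothing; a real system would have to be trained on labelled recordings first.

```python
import numpy as np

def spectrogram(audio, frame_len=256, hop=128):
    """Stage 1: cut the waveform into overlapping frames and take the
    log-magnitude FFT of each one, giving a (frames x frequencies) matrix."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(audio) - frame_len) // hop
    frames = np.stack([
        audio[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def cat_probability(spec, params):
    """Stage 2: a tiny one-hidden-layer network that maps the flattened
    spectrogram to a single number between 0 and 1, read as P('cat')."""
    w1, b1, w2, b2 = params
    hidden = np.tanh(spec.ravel() @ w1 + b1)
    logit = hidden @ w2 + b2
    return float(1.0 / (1.0 + np.exp(-logit)))

# Assumption: one second of synthetic audio (a 440 Hz tone plus noise)
# stands in for a real recording of someone speaking.
rng = np.random.default_rng(0)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(sample_rate)

spec = spectrogram(audio)

# Assumption: untrained random weights, so the output is not meaningful yet.
n_hidden = 16
params = (
    rng.standard_normal((spec.size, n_hidden)) * 0.01,
    np.zeros(n_hidden),
    rng.standard_normal(n_hidden) * 0.01,
    0.0,
)
p = cat_probability(spec, params)
print("spectrogram shape:", spec.shape)
print("P('cat') from untrained network:", p)
```

The 'messy detail' lives in everything this sketch leaves out: cleaning and normalizing the audio, choosing a better feature representation, and training the weights so that the final number actually tracks whether 'cat' was spoken.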
