Last time I checked, the Google thing was a hack that mimics what the browser's voice recognition does. I'm not sure about the Amazon and Google online APIs, but I'll definitely check them out.
About a year ago I started a project where I wanted voice recognition on Linux using only an offline engine; that's how I ended up with PocketSphinx.
It is not a hack: the browser interfaces are built on top of it. It is separate, and FPC comes with the necessary interfaces by default. These work on Linux too.
If you mean that you need to be online: true. PocketSphinx is limited, and if your platform is Windows, you can use the MS APIs offline.
Also note there was already crude but working SR software for the Commodore 64...
It is not rocket science to experiment with a DFT using fixed-length math (an FFT is not necessary and too slow on a 6510), plus a look-up table and a trainer that fills the look-up with averaged weights so it matches correctly.
It becomes rocket science to me when you try to match current speech recognition... I have not even been able to comprehend that.
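To make the DFT-plus-look-up idea a bit more concrete, here is a rough sketch in Python. It is not how any C64 program actually did it, just one way to read the description. I am assuming "fixed-length math" means integer-only arithmetic with a precomputed sine table, and everything else (the frame length `N`, the number of bins `BINS`, the `|re| + |im|` magnitude shortcut, one frame per word, and the names `dft_features`, `Lookup`, `tone`) is made up for illustration. Python integers never overflow, so a real 8-bit version would also need careful scaling.

```python
import math

N = 64       # samples per frame (assumed value)
BINS = 8     # low-frequency bins kept as features (assumed value)
SCALE = 127  # sine table scaled to a signed 8-bit range

# Precomputed integer tables; on an 8-bit machine these would be data tables.
SIN = [round(SCALE * math.sin(2 * math.pi * i / N)) for i in range(N)]
COS = [SIN[(i + N // 4) % N] for i in range(N)]  # cos(x) = sin(x + pi/2)


def dft_features(frame):
    """Plain DFT for bins 1..BINS using only integer multiply/add."""
    feats = []
    for k in range(1, BINS + 1):
        re = im = 0
        for n, x in enumerate(frame):
            idx = (k * n) % N
            re += x * COS[idx]
            im -= x * SIN[idx]
        # |re| + |im| is a cheap stand-in for the true magnitude (no sqrt).
        feats.append((abs(re) + abs(im)) // N)
    return feats


class Lookup:
    """Look-up table of per-word templates, filled by averaging."""

    def __init__(self):
        self.templates = {}  # word -> (feature list, training count)

    def train(self, word, frame):
        feats = dft_features(frame)
        if word not in self.templates:
            self.templates[word] = (feats, 1)
            return
        old, count = self.templates[word]
        # Integer running average of all training frames seen so far.
        avg = [(o * count + f) // (count + 1) for o, f in zip(old, feats)]
        self.templates[word] = (avg, count + 1)

    def match(self, frame):
        """Return the word whose template is nearest (sum of absolute differences)."""
        feats = dft_features(frame)
        best_word, best_dist = None, None
        for word, (tmpl, _) in self.templates.items():
            dist = sum(abs(a - b) for a, b in zip(tmpl, feats))
            if best_dist is None or dist < best_dist:
                best_word, best_dist = word, dist
        return best_word


def tone(bin_index, amplitude=100):
    """Synthetic test frame: a sine wave that lands exactly on one DFT bin."""
    return [round(amplitude * math.sin(2 * math.pi * bin_index * n / N))
            for n in range(N)]


if __name__ == "__main__":
    table = Lookup()
    table.train("low", tone(2))
    table.train("high", tone(6))
    print(table.match(tone(2, amplitude=80)))
```

Real speech would need many frames per word and some kind of time alignment, which is where the comparison with current recognizers stops.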