I want to build an app that responds to the sound you make when blowing out birthday candles. This is not speech recognition per se (that sound isn't a word in English), and the very kind Halle over at OpenEars told me that it's not possible using that framework. (Thanks for your quick response, Halle!)
Is there a way to "teach" an app a sound such that the app can subsequently recognize it?
How would I go about this? Is it even doable? Am I crazy or taking on a problem that is much more difficult than I think it is? What should my homework be?
The good news is that it's achievable and you don't need any third-party frameworks: AVFoundation is all you really need.
There's a good article from Mobile Orchard that covers the details but, somewhat inevitably for a four-year-old article, there are some gotchas you need to be aware of.
Before recording on a real device, I needed to set the audio session category, like so:
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord error:nil];
Play around with the threshold in this line:
if (lowPassResults > 0.95)
I found 0.95 to be too high and got better results setting it somewhere between 0.55 and 0.75. Similarly, I played around with the 0.05 multiplier in this line:
double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));