Google released the Speech Commands dataset. I see that all the audio files have a sampling rate of 16kHz, which means that any information from 8kHz and up is unreliable (the human hearing range is 20Hz to 20kHz). This seems critical for voice recognition, because a lot of important data (not most, but a lot) is within the range of 8kHz to 20kHz, and losing it means less accuracy and reliability in your voice recognition.
Why did Google choose 16kHz? Am I missing something?
Thank you.
This is extremely critical regarding voice recognition, because (not most but) a lot of important data is within the range of 8kHz to 20kHz
Actually not. Many experiments demonstrate that there is almost no improvement from using a higher sample rate. That is why everyone uses 16kHz.
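For reference, the 8kHz figure in the question comes from the Nyquist limit: a signal sampled at 16kHz can only represent frequencies up to half that rate. Here is a minimal Python sketch of that arithmetic. The function names `nyquist_hz` and `aliased_hz` are made up for illustration and are not from any library, and the aliasing formula assumes a pure tone sampled with no anti-aliasing filter.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2.0


def aliased_hz(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling.

    Assumes no anti-aliasing filter: tones above the Nyquist
    limit fold back into the 0..Nyquist band.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    rate = 16000
    print("Nyquist limit:", nyquist_hz(rate))               # 8000.0
    print("3 kHz tone appears at:", aliased_hz(3000, rate))   # 3000
    print("10 kHz tone appears at:", aliased_hz(10000, rate)) # 6000
```

In practice, audio is normally low-pass filtered before it is sampled or downsampled, so content above 8kHz is simply removed rather than folded back into the lower band.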