speech-recognition speech-to-text tensorflow-datasets

Why does Google's Speech Commands dataset have a sampling rate of 16 kHz?

Google released the Speech Commands dataset. I see that all the audio files have a sampling rate of 16 kHz, which means that any information from 8 kHz and up is unreliable (the human hearing range is 20 Hz to 20 kHz). This is extremely critical regarding voice recognition, because (not most, but) a lot of important data is within the range of 8 kHz to 20 kHz, and losing that means less accuracy and reliability in your voice recognition.

Why did Google choose 16 kHz? Am I missing something?

Thank you.

Solution

"This is extremely critical regarding voice recognition, because (not most, but) a lot of important data is within the range of 8 kHz to 20 kHz"

Actually, no. Many experiments demonstrate that there is almost no improvement in recognition accuracy from using a higher sample rate. That is why everyone uses 16 kHz.
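For reference, the 8 kHz figure in the question comes from the Nyquist limit: a signal sampled at 16 kHz can only represent frequencies up to half that rate. Below is a minimal sketch of that relationship using only the Python standard library. The function names are made up for illustration and are not part of any dataset or library API, and the example assumes a pure tone sampled without an anti-aliasing filter.

```python
import math


def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def aliased_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling with no anti-aliasing filter."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sample_cosine(tone_hz, sample_rate_hz, num_samples):
    """Sample a unit-amplitude cosine at the given rate."""
    return [math.cos(2.0 * math.pi * tone_hz * n / sample_rate_hz)
            for n in range(num_samples)]


SAMPLE_RATE_HZ = 16_000  # rate used by the Speech Commands audio files

print(nyquist_frequency(SAMPLE_RATE_HZ))          # 8000.0
print(aliased_frequency(3_000, SAMPLE_RATE_HZ))   # 3000: below the limit, unchanged
print(aliased_frequency(10_000, SAMPLE_RATE_HZ))  # 6000: folded back below 8 kHz

# A 10 kHz tone and a 6 kHz tone yield the same samples at 16 kHz,
# so content above 8 kHz cannot be told apart from content below it.
high = sample_cosine(10_000, SAMPLE_RATE_HZ, 64)
low = sample_cosine(6_000, SAMPLE_RATE_HZ, 64)
indistinguishable = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low))
print(indistinguishable)
```

In practice, audio is low-pass filtered before being resampled to 16 kHz, so the content above 8 kHz is removed rather than folded back. The sketch only shows why 8 kHz is the upper bound for that sample rate.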
