I have a Python speech recognition assistant that plays MP3 audio files it downloads. I play the MP3 on a separate thread in the background.
The issue is that the speech recognition picks up what the MP3 audio is saying and responds to it.
How can I make the speech recognition stay silent until I say a specific phrase to wake it up?
Here is my function file for retrieving and playing the MP3:
    import json
    import os
    from urllib import request

    import requests
    import speech_recognition as sr

    def play_quran():
        speak("Ready to play Quran. Tell me which Surah number you want to hear.")
        # qari_num = input("Enter Surah Number: ")
        qari_num = recordAudio()
        url = "https://api.quran.com/api/v4/chapter_recitations/9/" + str(qari_num)
        print(url)
        # Fetch the recitation metadata once and reuse the parsed JSON
        my_dictionary = requests.get(url).json()
        print(json.dumps(my_dictionary, indent=4))
        surah_to_play = my_dictionary['audio_file']['audio_url']
        print(surah_to_play)
        # Download the MP3, then play it (os.system blocks until mpg123 exits)
        filename = str(qari_num) + ".mp3"
        request.urlretrieve(surah_to_play, filename)
        os.system("mpg123 -q " + filename)
        stop_listening = sr.Recognizer().listen_in_background(sr.Microphone(), recordAudio)
        # time.sleep(2)
        # exit()
Here is the code that calls the function above:
    if "play Quran" in data:
        speak("opening Quran. One moment please")
        # Note that I did not actually call the function, but instead sent it as a parameter
        t = threading.Thread(target=play_quran)
        t.daemon = True
        t.start()  # This actually starts the thread execution in the background
Thanks.
TL;DR: Google's algorithm goes over the audio ahead of time and disables wake-word detection at the specific times where the audio itself contains the wake word.
Essentially, the fancy solution mentioned is the way to go. It is the process that Google uses, for example (check the US patent here).
This is all done before the device starts playing the content, so that it knows when to disable the wake-word detection. The algorithm is mainly meant for larger systems, where you would probably disable only one microphone.
The final product therefore knows when to turn off wake-word detection, based on the audio played from the same device. In your code, the check could look like this:
if check_if_microphone_enabled(time_variable) and ("play Quran" in data):
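The check above could be backed by a list of time windows computed before playback starts. The following is only a minimal sketch of that idea, not Google's actual algorithm: `build_mute_windows`, the `margin` and `duration` values, and the second `mute_windows` parameter are assumptions made for illustration, and the timestamps at which the wake word occurs in the audio are assumed to come from some offline pass over the file.

```python
def build_mute_windows(wake_word_times, margin=0.5, duration=1.0):
    """Turn wake-word timestamps (seconds into the audio) into merged
    (start, end) windows during which detection should be disabled.
    margin and duration are hypothetical tuning values."""
    windows = []
    for t in sorted(wake_word_times):
        start = max(0.0, t - margin)
        end = t + duration + margin
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window, so extend it instead
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def check_if_microphone_enabled(time_variable, mute_windows=()):
    """Return False while the playback position falls inside a mute window."""
    for start, end in mute_windows:
        if start <= time_variable <= end:
            return False
    return True
```

Here `time_variable` would be the number of seconds elapsed since playback started, so the caller has to record the start time when it launches the player.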
There were a few questions related to listening to other applications' audio streams, but I didn't find an easy solution, as it depends heavily on the platform (OS) and the software used for playback. The easiest solution I see, if you also control the playback of the audio (with pygame or another library), is to use the fact that you already have access to the audio the device is playing.
Since your assistant plays the music itself, I don't see this being a problem.
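Because the assistant starts the playback itself, a simpler variant is to have it remember that audio is playing and ignore anything it hears in that period unless it contains a wake word. This is a rough sketch under assumptions: `playback_active`, `play_blocking`, `handle_transcript` and the wake word `"assistant"` are hypothetical names, and `play_fn` stands in for whatever actually plays the file (for example the `mpg123` call or a pygame-based player).

```python
import threading

# Set while the assistant itself is playing audio (hypothetical flag)
playback_active = threading.Event()


def play_blocking(play_fn, path):
    """Run a blocking playback function and mark the playback period."""
    playback_active.set()
    try:
        play_fn(path)
    finally:
        # Always clear the flag, even if playback raises
        playback_active.clear()


def handle_transcript(text, wake_word="assistant"):
    """Return the normalized command to act on, or None to ignore it.

    While playback is active, only speech containing the wake word is
    accepted, so words recognized from the MP3 are dropped."""
    text = text.lower()
    if playback_active.is_set() and wake_word not in text:
        return None
    return text
```

The playback thread would call `play_blocking(...)`, and the recognition loop would pass each recognized string through `handle_transcript` before matching commands such as "play Quran". Note that the returned text is lowercased, so comparisons should be lowercase as well.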