Tags: python, python-3.x, speech-recognition, alsa

The speech_recognition module on Linux keeps listening and never advances


The speech_recognition module in Python 3 keeps listening and never moves on to the rest of the code. Here is the code:

import speech_recognition as sr

def takeVoiceInp():
    # Input Voice, Output Text (String)

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening")
        audio = r.listen(source)
    print("Listened!")
    try:
        print("Recognising!")
        query = r.recognize_google(audio)
        print(f"\033[1m  YOU:  \033[0m {query}\n")

    except Exception as e:
        print("Try Again!")
        # Print the caught exception instance, not the Exception class
        print("Error:", e)

        return "None"
    
    return query


print(takeVoiceInp())

When I run this code, it prints this to the console:

ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_route.c:867:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:867:(find_matching_chmap) Found no matching channel map
ALSA lib pcm_route.c:867:(find_matching_chmap) Found no matching channel map
Listening

And that's it! It stays there forever!

I opened Settings and saw that Ubuntu was detecting my microphone just fine. Moreover, while the program was running, Settings also listed ALSA plug-in [python3.6] among the applications using my microphone or speaker (it was the only one using my mic).

What can I do to get this working? The "Listened!" line is never printed by the code above. If you are able to help, or even if you are just reading this, thanks in advance!


Solution

  • Okay, so all I did was add r.adjust_for_ambient_noise(source) before r.listen(source), and it worked just fine. It looks like Windows has some automatic or default threshold for ambient noise. The problem now is that the first few seconds of the audio are glitched or distorted, so if I start speaking immediately, it throws a 0 index error (no audio received). And if I speak for long enough, it misses the first few words I said. This is probably because of the ALSA microphone driver that ships by default with Ubuntu/Linux. So a crude workaround is:

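    from time import sleep

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise
        r.adjust_for_ambient_noise(source)
        # Give the microphone a moment to settle before prompting
        sleep(1.5)
        print("Say something...")
        audio = r.listen(source)

    query = r.recognize_google(audio)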
    

    This fixes it! The ALSA warnings still appear, though. The glitch could also be down to my poor-quality microphone. Well, that's about it! :)
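
If the call still hangs now and then, it may help to put a bound on it. Below is a minimal sketch, not a definitive fix: `listen_once` and its default values are names and numbers I made up for illustration, while `adjust_for_ambient_noise(source, duration=...)` and `listen(source, timeout=..., phrase_time_limit=...)` are the library's own calls. It assumes you pass in an `sr.Recognizer` and an already opened `sr.Microphone`.

```python
def listen_once(recognizer, source, calibrate_seconds=1.0,
                timeout=5.0, phrase_time_limit=10.0):
    """Calibrate against background noise, then record one bounded phrase.

    `recognizer` is expected to be a speech_recognition.Recognizer and
    `source` an already opened speech_recognition.Microphone.
    (Hypothetical helper; the defaults are arbitrary starting points.)
    """
    # Sample the room briefly so the energy threshold sits above the
    # background noise level.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)

    # timeout: seconds to wait for speech to start; if nothing is heard,
    #          listen() raises speech_recognition.WaitTimeoutError.
    # phrase_time_limit: maximum seconds recorded once speech has started,
    #          so a noisy room cannot keep the recording open forever.
    return recognizer.listen(source, timeout=timeout,
                             phrase_time_limit=phrase_time_limit)


# Typical use (needs the SpeechRecognition package and a working microphone):
#   import speech_recognition as sr
#   r = sr.Recognizer()
#   with sr.Microphone() as source:
#       audio = listen_once(r, source)
#   print(r.recognize_google(audio))
```

With both limits set, the program either gets audio back or fails with an exception you can catch, instead of sitting at "Listening" indefinitely.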