python text-to-speech ibm-watson watson-text-to-speech

IBM Watson Text to Speech audio file cannot be played after synthesizing


I write the synthesized audio to an output file, wait until the file exists and its size isn't 0, and then play it. I have tried many playback approaches (subprocess, playsound, pygame, vlc, etc.) and several file types (mp3, wav, etc.), but I keep getting an error saying the file can't be closed or is corrupted. Once in a while a file plays, but as soon as another Watson-generated mp3 is played, the error comes back. Does anyone know a solution?

...
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
...
authenticator = IAMAuthenticator(ibmApiKey);
textToSpeech = TextToSpeechV1(authenticator = authenticator);
textToSpeech.set_service_url(ibmServiceUrl);
...
    file = str(int(random.random() * 100000)) + ".mp3";
    with open(file, "wb") as audioFile:
        audioFile.write(textToSpeech.synthesize(text, voice = "en-GB_JamesV3Voice", accept = "audio/mp3").get_result().content);

    fileExists = False;

    while (fileExists == False):
        if (os.path.isfile(file)):
            fileExists = os.stat(file).st_size != 0;
            playsound(file);
            os.remove(file);

    Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

    Error 263 for command:
        close temp/77451.mp3
    The specified device is not open or is not recognized by MCI.
Failed to close the file: temp/77451.mp3
Traceback (most recent call last):
  File "main.py", line 457, in <module>
    runMain(name, config.get("main", "callName"), voice);
  File "main.py", line 156, in runMain
    speak("The time is: " + datetime.now().strptime(datetime.now().time().strftime("%H:%M"), "%H:%M").strftime("%I:%M %p"), voice);
  File "main.py", line 123, in speak
    playsound(file);
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 72, in _playsoundWin
    winCommand(u'open {}'.format(sound))
  File "C:\Users\turtsis\AppData\Local\Programs\Python\Python35-32\lib\site-packages\playsound.py", line 64, in winCommand
    raise PlaysoundException(exceptionMessage)
playsound.PlaysoundException:
    Error 263 for command:
        open temp/77451.mp3
    The specified device is not open or is not recognized by MCI.

Solution

  • The bug could reside in various places.

    First I would try this:

    from ibm_watson import ApiException

    try:
        # Random file name, as in the question
        file = str(int(random.random() * 100000)) + ".mp3"
        with open(file, "wb") as audioFile:
            # synthesize() raises ApiException if the service rejects the request
            audioFile.write(textToSpeech.synthesize(text, voice="en-GB_JamesV3Voice", accept="audio/mp3").get_result().content)
    except ApiException as ex:
        print("Method failed with status code " + str(ex.code) + ": " + ex.message)
    

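    If the call to Watson fails, the SDK raises an ApiException. Left uncaught, that exception ends your program, and because the file was already opened for writing, it also leaves an empty .mp3 behind.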
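
    One way to avoid leaving an empty file behind is to fetch the audio bytes first and open the file only once they have arrived. The sketch below assumes a hypothetical helper, save_audio, and takes the Watson call as a zero-argument function (fetch_audio) so the file handling can be shown on its own; neither name comes from the Watson SDK.

```python
import os


def save_audio(fetch_audio, path):
    """Fetch audio bytes, then write them to `path`; return the absolute path.

    `fetch_audio` is a hypothetical zero-argument callable that returns bytes.
    """
    # Run the request before touching the disk: if it raises, no file is created.
    data = fetch_audio()
    if not data:
        raise ValueError("the service returned no audio data")
    with open(path, "wb") as audio_file:
        audio_file.write(data)
    # Leaving the with block closes the file, so it is complete at this point.
    return os.path.abspath(path)
```

    With the question's client, fetch_audio could be a lambda that wraps textToSpeech.synthesize(...).get_result().content, with the save_audio call still placed inside the try/except for ApiException.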

    However, if the issue is with playsound, I would suggest this route. Note that pyttsx3 is an offline text-to-speech engine: engine.say() takes the text to speak, not the path of an audio file, so this bypasses the Watson mp3 (and the Watson voice) and speaks with a voice installed on your machine:

    import pyttsx3

    engine = pyttsx3.init()
    # Queue the text itself; pyttsx3 does not play audio files
    engine.say(text)
    # Block until the queued speech has finished
    engine.runAndWait()
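
    If you stay with file playback, the polling loop from the question may not be needed: once the with block has finished, the file is closed and fully written, so a single size check is enough. The sketch below is one way to arrange that, assuming a hypothetical helper play_then_remove that receives the player as a function (for example playsound). Passing an absolute path is an assumption worth testing, since the failing command used the relative path temp/77451.mp3.

```python
import os


def play_then_remove(path, play):
    """Play the file at `path` with `play`, then delete it.

    `play` is any callable taking a file path (playsound, for instance).
    Returns True if the file was played, False if it was missing or empty.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    try:
        # Hand the player an absolute path rather than a relative one.
        play(os.path.abspath(path))
    finally:
        # Delete the file only after the player has returned (or failed).
        os.remove(path)
    return True
```

    Because the file is removed in the finally clause, a failed playback does not leave stale files in the temp directory, and the exception still reaches the caller.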
    

    If neither of those works, I would call the service with curl to see whether you can reproduce the problem outside Python:

    Replace {apikey} and {url} with your API key and URL.
    
    
    curl -X POST -u "apikey:{apikey}" --header "Content-Type: application/json" --data "{\"text\":\"hello world\"}" --output hello_world.ogg "{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
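
    Before blaming the player, it can help to look at the first bytes of whatever file you got back, whether from curl or from Python. The sketch below uses a hypothetical helper, sniff_audio_format, that recognizes the standard signatures of Ogg, WAV and MP3 data; treating a leading { as a JSON error body is an assumption about how the service reports failures.

```python
def sniff_audio_format(data):
    """Guess the container format from the first bytes of `data`."""
    if data.startswith(b"OggS"):
        return "ogg"
    # WAV files are RIFF containers with "WAVE" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 data starts with an ID3 tag or an MPEG frame sync (11 set bits).
    if data.startswith(b"ID3"):
        return "mp3"
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"
    # Assumption: a failed request may have saved a JSON error message instead.
    if data.lstrip().startswith(b"{"):
        return "json"
    return "unknown"
```

    Reading the first 16 bytes of hello_world.ogg in binary mode and passing them to sniff_audio_format should give "ogg" if the curl request succeeded. A result of "json" or "unknown" for a file that was meant to be audio would point at the request rather than at the player.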
    

    Best of luck.