
Recording synthesized text-to-speech to a file in Python


I am attempting to find a way to take synthesized speech and record it to an audio file. I am currently using pyttsx as my text-to-speech library, but there isn't a mechanism for saving the output to a file, only playing it directly from the speakers. I've looked into detecting and recording audio as well as PyAudio, but these seem to take input from a microphone rather than redirecting outgoing audio to a file. Is there a known way to do this?


Solution

  • You can call espeak with the -w argument using subprocess.

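    import subprocess

    def text_to_wav(text, file_name):
        # -w makes espeak write a WAV file instead of playing the audio
        subprocess.call(["espeak", "-w", file_name + ".wav", text])

    # Writes hello.wav in the current directory
    text_to_wav('hello world', 'hello')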
    

    This writes file_name.wav (hello.wav in the example above) without playing the audio aloud. If your text is in a file (e.g. text.txt), pass the file's path with the -f parameter in place of the text argument, for example ["espeak", "-w", "hello.wav", "-f", "text.txt"]. I'd recommend reading the espeak man page to see all the options you have.

    Hope this helps.
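
For reference, here is a rough sketch of how the two cases above (text passed directly, or read from a file with -f) could be wrapped in one helper. The names build_espeak_command and synthesize are made up for illustration and are not part of espeak or pyttsx; the sketch assumes the espeak executable is installed and on your PATH, and uses only the -w and -f options mentioned above.

```python
import shutil
import subprocess


def build_espeak_command(wav_path, text=None, text_file=None):
    """Return the espeak argument list that writes speech to wav_path.

    Hypothetical helper: exactly one of text or text_file must be given.
    """
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")
    command = ["espeak", "-w", wav_path]
    if text_file is not None:
        # -f tells espeak to read the text to speak from a file
        command += ["-f", text_file]
    else:
        # otherwise the text is passed directly as an argument
        command.append(text)
    return command


def synthesize(wav_path, text=None, text_file=None):
    """Run espeak; raises if espeak is missing or exits with an error."""
    if shutil.which("espeak") is None:
        raise FileNotFoundError("espeak executable not found on PATH")
    subprocess.run(build_espeak_command(wav_path, text, text_file), check=True)
```

Calling synthesize("hello.wav", text="hello world") should behave like the function in the answer, and synthesize("hello.wav", text_file="text.txt") covers the -f case. Keeping the command construction separate from the subprocess call makes it easy to inspect the arguments without actually running espeak.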