
Can we copy someone else's voice and use it in a speak function in Python?


Hey there, I am wondering about this code:

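import pyttsx3

engine = pyttsx3.init('sapi5')  # use the SAPI5 driver (Windows)

voices = engine.getProperty('voices')  # list of voices available for this driver

engine.setProperty('voice', voices[0].id)  # select the first voice in the list

def speak(audio):
    engine.say(audio)     # queue the text to be spoken
    engine.runAndWait()   # block until the queued text has been spoken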

In this code, the line engine.setProperty('voice', voices[0].id) sets the voice for us. So is it possible to use our own audio instead, something taken from a clip or similar?


Solution

  • Well, let's have a look at the documentation:

    Supported synthesizers

    Version 2.6 of pyttsx3 includes drivers for the following text-to-speech synthesizers. Only operating systems on which a driver is tested and known to work are listed. The drivers may work on other systems.

    • SAPI5 on Windows XP, Windows Vista, and Windows 8, 8.1, 10
    • NSSpeechSynthesizer on Mac OS X 10.5 (Leopard) and 10.6 (Snow Leopard)
    • espeak on Ubuntu Desktop Edition 8.10 (Intrepid), 9.04 (Jaunty), and 9.10 (Karmic)

    The pyttsx3.init() documentation explains how to select a specific synthesizer by name as well as the default for each platform.

    And here:

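    import pyttsx3

    engine = pyttsx3.init()
    voices = engine.getProperty('voices')
    # Queue the same sentence once for each available voice
    for voice in voices:
        engine.setProperty('voice', voice.id)
        engine.say('The quick brown fox jumped over the lazy dog.')
    engine.runAndWait()  # speak everything that was queued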
    

    So the answer is no: unfortunately, you can't use your own audio with pyttsx3. Speech synthesizers are complex programs. A voice is built from many, many samples, and you can't just create a new voice from a single recording.

    pyttsx3 is a framework, a Python wrapper that adapts three already existing speech synthesizers for use in Python. getProperty('voices') gives you a list of the voices (for example Armenian, or female British English) that are supported by the synthesizer you previously selected.

    You can just print the list to get a better idea what voices are supported for your chosen engine.
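
Printing the list could look roughly like the sketch below. The helper names format_voice, pick_voice and print_installed_voices are made up for this example and are not part of pyttsx3. It assumes each voice object exposes the id, name and languages attributes, and which voices you actually get depends on your system.

```python
def format_voice(index, voice):
    """Return a one-line description of a voice object (hypothetical helper)."""
    return f"{index}: {voice.name} | id={voice.id} | languages={voice.languages}"


def pick_voice(voices, name_fragment):
    """Return the first voice whose name contains name_fragment, or None.

    Hypothetical helper: it only chooses among voices that are already
    installed, it cannot create a new voice.
    """
    wanted = name_fragment.lower()
    for voice in voices:
        if wanted in (voice.name or "").lower():
            return voice
    return None


def print_installed_voices():
    """Print every voice offered by the default driver on this machine."""
    import pyttsx3  # imported here so the helpers above work without it

    engine = pyttsx3.init()
    voices = engine.getProperty('voices')
    for index, voice in enumerate(voices):
        print(format_voice(index, voice))
    return voices
```

Calling print_installed_voices() shows what your engine offers. You can then pass the id of one of those voices to engine.setProperty('voice', ...), for example the voice returned by pick_voice(voices, 'some name').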