I have a machine with a Quadro P5000 graphics card, running Windows 10. I'd like to train a TTS voice on this system. What do I need to install to make this work?
Here's what to install/do:
Copy the contents of the cuda folder into the CUDA 10.1 installation directory, C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1.
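If you want to double-check that the copy landed where it should, something like the sketch below may help. It assumes the cuda folder in question comes from an extracted cuDNN archive, so the copied files have names starting with "cudnn"; that naming, the `CUDA_ROOT` constant, and the `find_cudnn_files` helper are my own assumptions for illustration, not something the steps above spell out.

```python
from pathlib import Path

# Installation directory taken from the step above; adjust if yours differs.
CUDA_ROOT = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1")


def find_cudnn_files(cuda_root):
    """Return sorted relative paths of files under cuda_root named 'cudnn*'.

    Assumption: the copied files follow cuDNN's usual 'cudnn...' naming.
    Returns an empty list if the directory does not exist.
    """
    root = Path(cuda_root)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("cudnn*")
        if p.is_file()
    )


if __name__ == "__main__":
    found = find_cudnn_files(CUDA_ROOT)
    if found:
        print("Found:", ", ".join(found))
    else:
        print("No cudnn* files found under", CUDA_ROOT)
```

An empty result most likely means the folder was copied one level too deep (for example into v10.1\cuda) or into a different CUDA version's directory.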
Run the following commands in order:
    git clone https://github.com/coqui-ai/TTS.git
    cd TTS
    python -m venv .
    .\Scripts\pip install -e .
    .\Scripts\pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
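The "+cu101" suffix in the pinned versions is what selects the CUDA 10.1 build, so it can be worth confirming that this is the build pip actually installed. Here is a rough sketch; `cuda_tag` and `matches_cuda` are hypothetical helper names of my own, and the check only looks at the version string rather than at the GPU itself.

```python
def cuda_tag(version):
    """Return the part after '+' in a version string (e.g. 'cu101'), or None."""
    _, sep, local = str(version).partition("+")
    return local if sep else None


def matches_cuda(version, expected="cu101"):
    """True if the version string carries the expected build tag."""
    return cuda_tag(version) == expected


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print("CUDA 10.1 build:", matches_cuda(torch.__version__))
```

Save it in the TTS folder and run it with .\Scripts\python so that it sees the packages installed in the virtual environment.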
Put the following into a script called test_cuda.py in the TTS folder:
    import torch
    x = torch.rand(5, 3)
    print(x)
    print(torch.cuda.is_available())
Run the following command:
    .\Scripts\python ./test_cuda.py
and confirm the output looks like this (the first part should just be random numbers, but the last line must read True; if it does not, CUDA is not installed properly):
    tensor([[0.2141, 0.7808, 0.9298],
            [0.3107, 0.8569, 0.9562],
            [0.2878, 0.7515, 0.5547],
            [0.5007, 0.6904, 0.4136],
            [0.2443, 0.4158, 0.4245]])
    True
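If the last line reads False, a slightly more detailed check can help narrow down why. The sketch below is one way to do that; `cuda_report` is a name I made up, and it only uses `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.get_device_name()`.

```python
def cuda_report():
    """Collect basic facts about the installed torch build and visible GPUs."""
    report = {"torch_installed": False, "cuda_available": False, "devices": []}
    try:
        import torch
    except ImportError:
        return report  # torch is missing from this environment

    report["torch_installed"] = True
    report["torch_version"] = str(torch.__version__)
    # None here means a CPU-only build of torch was installed.
    report["built_for_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Roughly speaking, a built_for_cuda of None points at the pip install step, while a CUDA build with cuda_available False points at the driver or the CUDA installation.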
Put the following into a script called train.bat in the TTS folder, and then customize it for your configuration file:
    set PYTHONIOENCODING=UTF-8
    set PYTHONLEGACYWINDOWSSTDIO=UTF-8
    set PHONEMIZER_ESPEAK_PATH=C:/Program Files/eSpeak NG/espeak-ng.exe
    .\Scripts\python.exe ./TTS/bin/train_tacotron.py --config_path "C:/path/to/your/config.json"
Then run:
    .\train.bat
If you are using a different model than Tacotron or need to pass other parameters into the training script, feel free to further customize train.bat.
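If you'd rather drive training from Python than from a batch file, the same environment variables and command line can be put together roughly like this. It is only a sketch mirroring train.bat above: `TRAIN_ENV`, `build_train_command` and `run_training` are names I invented, and the paths are the same placeholders as in the batch file.

```python
import os
import subprocess

# Same variables that train.bat sets before launching the training script.
TRAIN_ENV = {
    "PYTHONIOENCODING": "UTF-8",
    "PYTHONLEGACYWINDOWSSTDIO": "UTF-8",
    "PHONEMIZER_ESPEAK_PATH": "C:/Program Files/eSpeak NG/espeak-ng.exe",
}


def build_train_command(config_path,
                        script="./TTS/bin/train_tacotron.py",
                        python=r".\Scripts\python.exe"):
    """Return the argument list equivalent to the last line of train.bat."""
    return [python, script, "--config_path", config_path]


def run_training(config_path, **kwargs):
    """Launch training from the TTS folder with the extra variables set."""
    env = dict(os.environ, **TRAIN_ENV)
    return subprocess.run(
        build_train_command(config_path, **kwargs), env=env, check=True
    )


if __name__ == "__main__":
    # Only print the command here; call run_training() to actually start.
    print(build_train_command("C:/path/to/your/config.json"))
```

Passing the arguments as a list means you don't need to worry about quoting a config path that contains spaces, and the script argument is where you would swap in a different training script.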
If you are just getting started with TTS training in general, take a peek at "How do I get started training a custom voice model with Mozilla TTS on Ubuntu 20.04?"