
How to install torch audio on Windows 10 conda?


In Anaconda Python 3.6.7 with PyTorch installed, on Windows 10, I do this sequence:

conda install -c conda-forge librosa
conda install -c groakat sox

then in a fresh download from https://github.com/pytorch/audio I do

python setup.py install

and it runs for a while and ends like this:

torchaudio/torch_sox.cpp(3): fatal error C1083: Cannot open include file: 'sox.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.15.26726\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2

I am trying to reproduce this OpenNMT-py speech training demo on Windows: http://opennmt.net/OpenNMT-py/speech2text.html


Solution

  • I managed to compile torchaudio with sox on Windows 10, but it is a bit tricky.

    Unfortunately, the sox_effects are not usable; this error shows up:

    RuntimeError: Error opening output memstream/temporary file

    But you can use the other torchaudio functionalities.

    The steps I followed for Windows 10 64bit are:

    TORCHAUDIO WINDOWS10 64bit

    Note: I mix in some Unix-like command-line syntax below; you can use File Explorer or any equivalent tool instead.

    Preliminary arrangements

    1. Download sox sources

    $ git clone git://git.code.sf.net/p/sox/code sox

    2. Download another sox source tree to get lpc10

    $ git clone https://github.com/chirlu/sox sox2
    $ cp -R sox2/lpc10 sox
    
    3. IMPORTANT: get Visual Studio 2019 and the Build Tools installed

    lpc10 lib

    4.0. Create a VisualStudio CMake project for lpc10 and build it

    Start window -> open local folder -> sox/lpc10
    (it reads CMakeLists.txt automatically)
    Build->build All
    

    4.1. Copy lpc10.lib to sox

    $ mkdir -p sox/src/out/build/x64-Debug
    $ cp sox/lpc10/out/build/x64-Debug/lpc10.lib sox/src/out/build/x64-Debug
    

    gsm lib

    5.0. Create a CMake project for libgsm and build it the same way as lpc10

    5.1. Copy gsm.lib to sox

    $ mkdir -p sox/src/out/build/x64-Debug
    $ cp sox/libgsm/out/build/x64-Debug/gsm.lib sox/src/out/build/x64-Debug
    

    sox lib

    6.0. Create a CMake project for sox in VS

    6.1. Edit some files:

    CMakeLists.txt: (add at the very beginning)

    project(sox)

    sox_i.h: (add under stdlib.h include line)

    #include <wchar.h> /* For off_t not found in stdio.h */
    /* Fallback limits; values match the standard <stdint.h> definitions */
    #define UINT16_MAX  ((uint16_t)-1)  /* 65535 */
    #define INT32_MAX  2147483647
    

    sox.c: (add under time.h include line)

    #include <sys/timeb.h>
    

    6.2. Build sox with VisualStudio

    6.3. Copy the libraries to where Python will find them. I use a conda environment:

    $ cp sox/src/out/build/x64-Debug/libsox.lib envs\<envname>\libs\sox.lib
    $ cp sox/src/out/build/x64-Debug/gsm.lib envs\<envname>\libs
    $ cp sox/src/out/build/x64-Debug/lpc10.lib envs\<envname>\libs
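
Since the `cp` commands above use Unix-like syntax, the same copy step can be sketched in Python, which behaves the same in a Windows prompt. This is only a minimal sketch: the helper name `copy_libs` and the `LIB_NAMES` mapping are hypothetical, and the build and environment paths are the ones assumed in the steps above.

```python
import shutil
from pathlib import Path

# Source file name in the build folder -> name expected in the environment's libs folder.
# libsox.lib is renamed to sox.lib, as in the cp commands above.
LIB_NAMES = {
    "libsox.lib": "sox.lib",
    "gsm.lib": "gsm.lib",
    "lpc10.lib": "lpc10.lib",
}


def copy_libs(build_dir, libs_dir, name_map=LIB_NAMES):
    """Copy each library from build_dir into libs_dir, renaming per name_map.

    Raises FileNotFoundError if an expected library is missing, so a failed
    build is noticed before torchaudio is compiled.
    """
    build_dir = Path(build_dir)
    libs_dir = Path(libs_dir)
    libs_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src_name, dst_name in name_map.items():
        src = build_dir / src_name
        if not src.is_file():
            raise FileNotFoundError(f"expected library not found: {src}")
        dst = libs_dir / dst_name
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call would look like `copy_libs("sox/src/out/build/x64-Debug", r"envs\<envname>\libs")`, with `<envname>` replaced by the name of your own conda environment.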
    

    torchaudio

    $ activate <envname>

    7.0. Download torchaudio from github

    $ git clone https://github.com/pytorch/audio thaudio

    7.1. Update setup.py, after the "else:" statement of "if IS_WHEEL..."

    $ vi thaudio/setup.py

    if IS_WHEEL...

    else:
        audio_path = os.path.dirname(os.path.abspath(__file__))
    
        # Add include path for sox.h, I tried both with the same outcome
        include_dirs += [os.path.join(audio_path, '../sox/src')]
        #include_dirs += [os.path.join(audio_path, 'torchaudio/sox')]
    
        # Add more libraries
    
        #libraries += ['sox']
        libraries += ['sox','gsm','lpc10']
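
The original C1083 error means the compiler could not find sox.h in any include directory, so it may be worth checking the candidate paths before building. The following is a minimal sketch, not part of setup.py; the helper name `find_header` is hypothetical, and the directories listed are the two tried above, relative to the thaudio folder.

```python
import os


def find_header(header_name, include_dirs):
    """Return the first directory in include_dirs that contains header_name, or None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header_name)):
            return directory
    return None


if __name__ == "__main__":
    # Assumed layout: run from inside thaudio, with the sox checkout next to it.
    candidates = [
        os.path.join("..", "sox", "src"),
        os.path.join("torchaudio", "sox"),
    ]
    found = find_header("sox.h", candidates)
    if found is None:
        print("sox.h not found in any candidate directory")
    else:
        print("sox.h found in", os.path.abspath(found))
```

If this reports that sox.h is missing, the `include_dirs` entry added in setup.py points at the wrong place and the build will most likely fail with the same C1083 error.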
    

    7.2. Edit torch_sox.cpp from torchaudio, because MSVC does not allow variable-length arrays:

    $ vi thaudio/torchaudio/torch_sox.cpp
    
     //char* sox_args[max_num_eopts];
     char* sox_args[20]; //Value of MAX_EFFECT_OPTS
    

    7.3. Build and install

    $ cd thaudio
    $ python setup.py install
    

    The build prints tons of warnings about type conversions and a library conflict with MSVCRTD, but it "works".
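
A quick way to confirm that the install step registered the package in the active environment is to ask Python whether the module can be located. This is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical, and it only checks that the module can be found, not that the compiled sox extension loads correctly.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("torch", "torchaudio"):
        status = "found" if is_importable(name) else "NOT found"
        print(f"{name}: {status}")
```

Run it with the conda environment activated; if torchaudio is reported as not found, the install went into a different environment or failed.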

    And that's all.