language-model, kenlm, make-scorer

Set up kenlm for Windows


The official website makes it pretty clear that kenlm does not support Windows. The GitHub repository has a Windows tag, but it seems to be maintained only now and then by a few contributors.

How can I set up kenlm on Windows, then?


Solution

  • The solution is to use Ubuntu in Windows through Windows Subsystem for Linux

    1. Get WSL for Windows
    2. From your Ubuntu bash shell, navigate to the folder where you want to do the setup. The Windows file system is accessible under /mnt/c/ (for the C: drive), which you can find at the root directory.
    3. From there, follow the official instructions: after installing the necessary packages in your Ubuntu system, clone the git repo, create a build directory inside it, and from that directory run cmake .. && make -j2 to build the project.
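The steps above can be sketched as a small bash script to run inside the Ubuntu (WSL) shell. This is only a sketch under assumptions: the apt package list is my reading of the usual kenlm build requirements and should be checked against the repository's build notes, the `run` helper and `DRY_RUN` switch are hypothetical conveniences for previewing the commands, and the working folder is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch of the kenlm build inside Ubuntu on WSL (not an official script).
set -euo pipefail

# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# build_kenlm WORKDIR
# WORKDIR is a placeholder, e.g. a folder under /mnt/c/ on the Windows drive.
build_kenlm() {
  local workdir="$1"
  # Assumed dependency list; verify against the kenlm repository before use.
  run sudo apt-get update
  run sudo apt-get install -y build-essential cmake libboost-all-dev \
      zlib1g-dev libbz2-dev liblzma-dev
  run git clone https://github.com/kpu/kenlm.git "$workdir/kenlm"
  run mkdir -p "$workdir/kenlm/build"
  run cd "$workdir/kenlm/build"
  # '&&'-style chaining: with 'set -e', make only runs if cmake succeeded.
  run cmake ..
  run make -j2
}

# Example (placeholder path); prefix with DRY_RUN=1 to only print the commands:
# build_kenlm /mnt/c/kenlm-setup
```

Running it with `DRY_RUN=1` first is a cheap way to review what would be executed before anything is installed or cloned.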

    You must train the models or build the scorers from the Ubuntu bash shell. The resulting models can then also be used from Windows, for example through the kenlm Python library.
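As an illustration of training a model from the Ubuntu shell, here is a minimal sketch using kenlm's `lmplz` and `build_binary` tools, which the build places in `build/bin`. The `KENLM_BIN` variable, the `train_lm` function, and the file names are hypothetical and only serve the example.

```shell
#!/usr/bin/env bash
# Sketch: train an n-gram model with kenlm from the Ubuntu (WSL) shell.
set -euo pipefail

# Hypothetical variable: folder holding the binaries produced by the build.
KENLM_BIN="${KENLM_BIN:-$HOME/kenlm/build/bin}"

# train_lm CORPUS ORDER PREFIX
train_lm() {
  local corpus="$1" order="$2" prefix="$3"
  # Estimate an ARPA-format model from plain text (one sentence per line).
  "$KENLM_BIN/lmplz" -o "$order" < "$corpus" > "$prefix.arpa"
  # Convert the ARPA file into kenlm's binary format for faster loading.
  "$KENLM_BIN/build_binary" "$prefix.arpa" "$prefix.binary"
}

# Example call (file names are placeholders):
# train_lm corpus.txt 5 lm
```

On very small corpora `lmplz` may stop with a discount estimation error; its `--discount_fallback` option exists for that case.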

    E.g.

    The two steps for building a scorer for the DeepSpeech model, as described here, should be executed from your Ubuntu system. Once you have the scorer, you should be able to run the command

    deepspeech --model deepspeech-0.9.3-models.pbmm --scorer kenlm.scorer --audio audio.wav

    from Windows. However, once you have WSL there's no need to do this work from Windows. Things will work nicely in your Ubuntu system.