
How to dump the phonetic dictionary with espeak


I have been trying to create a grapheme-to-phoneme dictionary for cmusphinx using espeak. When I choose Compile from the menu and then Compile dictionary, it reports that compilation succeeded, but I can't find the .dic file anywhere.

Please advise on where to find my compiled files.

Thanks in advance


Solution

  • Dictionary compilation is unrelated to dumping the phonetic dictionary. Instead, use the -x option to display the phonemes for a list of input words.

    First create a list of words in your language. Then install espeak-ng and run

     echo "tuần" | espeak -v vi -x --sep=" "
    

    It will output the entry for the word:

     tuần t[ w '@2 n _|
    

    You need to strip special symbols such as ' or _ from this entry and leave just the phonemes, which gives:

     tuần t w @2 n
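
The cleanup step can be scripted once the raw entries are collected. Below is a minimal sketch in Python, assuming each raw line has the form shown above (the word followed by space-separated phones). The function name `clean_entry` and the set of stripped symbols are assumptions for illustration: the set contains only the symbols that disappear in the example above, so extend it if your language's output contains other markers.

```python
# Symbols treated as markup rather than phonemes. This set is an assumption
# taken from the example entry: ' _ | and [ all disappear in the cleaned form.
STRIP_CHARS = "'_|["


def clean_entry(line: str) -> str:
    """Turn a raw 'word phone phone ...' line into a cleaned dictionary entry."""
    tokens = line.split()
    if not tokens:
        return ""
    word, phones = tokens[0], tokens[1:]
    table = str.maketrans("", "", STRIP_CHARS)
    # Remove markup characters from each phone, then drop phones that
    # consisted only of markup (for example the trailing "_|").
    cleaned = [p.translate(table) for p in phones]
    return " ".join([word] + [p for p in cleaned if p])


print(clean_entry("tuần t[ w '@2 n _|"))
```

Only the phones are filtered; the word itself is passed through unchanged, so a word that happens to contain one of these characters is not altered.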