python azure text-to-speech lexicon

Issues with using a lexicon on Azure Cognitive Services (text-to-speech) from Python


I have been using Azure Cognitive Services TTS from Python for quite some time, based on their examples from the web, and it works just fine. I ran into an issue that required external lexicons, so I created them and referenced them in my SSML. It looks like this:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <lexicon uri="https://www.something.net/get_lexicons_for_ms/lexicon-test.xml"/>
    <mstts:express-as style="newscast-formal">
      <prosody pitch="+0Hz" rate="+0%">Our CEO has resigned</prosody>
    </mstts:express-as>
  </voice>
</speak>
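For context, here is a minimal sketch of how SSML like the above might be built and submitted from Python. It assumes the `azure-cognitiveservices-speech` SDK (the question does not say which client is used); `build_ssml`, `synthesize`, `key`, and `region` are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# URI taken from the question; replace with wherever the lexicon is hosted.
LEXICON_URI = "https://www.something.net/get_lexicons_for_ms/lexicon-test.xml"


def build_ssml(text, lexicon_uri=LEXICON_URI, voice="en-US-JennyNeural"):
    """Return the SSML document from the question with `text` as the spoken content."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f'<voice name={quoteattr(voice)}>'
        f'<lexicon uri={quoteattr(lexicon_uri)}/>'
        '<mstts:express-as style="newscast-formal">'
        f'<prosody pitch="+0Hz" rate="+0%">{escape(text)}</prosody>'
        '</mstts:express-as>'
        '</voice></speak>'
    )


def synthesize(ssml, key, region):
    """Send SSML to the Speech service and return the audio bytes (hypothetical helper)."""
    # Imported lazily so build_ssml works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    # audio_config=None keeps the result in memory instead of playing it.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=config, audio_config=None)
    result = synthesizer.speak_ssml_async(ssml).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Synthesis failed: {result.reason}")
    return result.audio_data


if __name__ == "__main__":
    ssml = build_ssml("Our CEO has resigned")
    ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed
    print(ssml)
```

Parsing the generated SSML locally only confirms that it is well-formed XML; it says nothing about whether the service can load or apply the lexicon.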

And the lexicon is defined like this:

<lexicon version="1.0" 
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
         http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
         alphabet="sapi" xml:lang="en-US">

   <lexeme>
      <grapheme>CEO</grapheme>
      <alias>Chief Executive Officer</alias>
   </lexeme>

   <lexeme> 
      <grapheme>CTO</grapheme>
      <alias>Chief Technology Officer</alias>
   </lexeme>
</lexicon>

I get audio for the text, and I can see that Azure fetches my lexicon from the web, but the substitutions defined in the lexicon are not applied to the spoken text.

Am I doing something wrong?


Solution

  • Well, I found the issue. My XML was missing its first line, the XML declaration:

    <?xml version="1.0" encoding="UTF-8"?>

    Everything works now.
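A small pre-flight check can catch this before the file is uploaded. The sketch below assumes the missing line belonged in the lexicon file (the answer only says "my XML") and uses a hypothetical helper, `check_lexicon`; it is a local sanity check, not a reproduction of whatever validation Azure performs.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C Pronunciation Lexicon Specification, as used in the question.
PLS_NS = "{http://www.w3.org/2005/01/pronunciation-lexicon}"


def check_lexicon(data: bytes) -> dict:
    """Verify the XML declaration is present and return {grapheme: alias}."""
    # Tolerate a UTF-8 byte order mark, but nothing else before the declaration.
    body = data[3:] if data.startswith(b"\xef\xbb\xbf") else data
    if not body.startswith(b"<?xml"):
        raise ValueError("lexicon must start with an XML declaration")

    root = ET.fromstring(body)  # raises ParseError on malformed XML
    if root.tag != PLS_NS + "lexicon":
        raise ValueError(f"unexpected root element: {root.tag}")

    aliases = {}
    for lexeme in root.iter(PLS_NS + "lexeme"):
        grapheme = lexeme.findtext(PLS_NS + "grapheme")
        alias = lexeme.findtext(PLS_NS + "alias")
        if grapheme and alias:
            aliases[grapheme] = alias
    return aliases


# Shortened version of the lexicon from the question, with the declaration added.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="sapi" xml:lang="en-US">
   <lexeme>
      <grapheme>CEO</grapheme>
      <alias>Chief Executive Officer</alias>
   </lexeme>
</lexicon>
"""

if __name__ == "__main__":
    print(check_lexicon(SAMPLE))
```

Running the check on the file exactly as it will be served (for example, the bytes downloaded from the lexicon URI) is more useful than checking a local copy, since a web framework or template could strip or prepend content.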