Can you configure the way SAPI.spVoice reads text?
In my situation I am reading the current clipboard using an AutoHotKey script. The script makes a COM call to SAPI.spVoice passing the text from the clipboard.
;;;;;;;;;;;;;;;;;;;;TTS;;;;;;;;;;;;;;;;;;;;;;
#^!D:: ; Win + Ctrl + D + Alt
ClipSaved := ClipboardAll
clipboard = ; Start off empty to allow ClipWait to detect when the text has arrived.
Send ^c
ClipWait ; Wait for the clipboard to contain text.
ComObjCreate("SAPI.SpVoice").Speak(clipboard)
Clipboard := ClipSaved
ClipSaved = ; Free the memory
return
The problem is that SAPI reads some text incorrectly.
For Example:
You can experiment with this by doing the following:
If you are running Windows 7.
So my question is:
Is it possible to change/configure the way "Microsoft Anna" reads text so it doesn't make these mistakes?
Is this a bug in the Anna voice only or all voices?
How can I make it read the text the way I want it read?
"Every problem (except the problem of too many levels of indirection) can be solved with another level of indirection."
The SAPI.spVoice object can be passed text (as I was doing) or SSML.
By converting the text to be spoken into SSML, you gain control over how words are spoken. You get a chance to pre-process the text and replace misread words with the specific pronunciation you want.
For example: "Yes it is. Ours is complex." becomes "Yes it <sub alias="is">is</sub>. Ours is complex."
The sub and say-as elements seem to work. The phoneme element seems to be ignored, but I may have something configured wrongly.
Note: If you want XML read aloud, XML escape the text before converting it to SSML, otherwise it will be assumed to be part of the SSML.
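As a rough illustration of that escaping step, here is a minimal Ruby sketch. The helper name to_ssml and the default xml:lang value are my own assumptions for the example, not anything SAPI requires; CGI.escapeHTML is the standard library call that turns <, > and & into entities so they are treated as text rather than as SSML markup.

```ruby
require 'cgi'

# Hypothetical helper (name and default language are assumptions for this sketch):
# escape the raw text first, then wrap it in a minimal SSML document.
def to_ssml(raw_text, lang: 'en-US')
  escaped = CGI.escapeHTML(raw_text) # "<b>" becomes "&lt;b&gt;", "&" becomes "&amp;"
  <<~SSML
    <?xml version="1.0"?>
    <speak xmlns="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="#{lang}">
    #{escaped}
    </speak>
  SSML
end

puts to_ssml('Use <b> & </b> for bold.')
```

Any SSML tags you want interpreted (such as sub) have to be added after this escaping step, otherwise they get escaped too.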
So, in code:
;;;;;;;;;;;;;;;;;;;;TTS;;;;;;;;;;;;;;;;;;;;;;
#^D:: ; Win + Ctrl + D
ClipSaved := ClipboardAll
Clipboard = ; Start off empty to allow ClipWait to detect when the text has arrived.
Send ^c
ClipWait ; Wait for the clipboard to contain text.
FileDelete , c:\tmp\tmp_ahk_tts_clip.txt
FileAppend , %Clipboard% , c:\tmp\tmp_ahk_tts_clip.txt
RunWait, %comspec% /c ""F:\bin\tools\speakit.rb" c:\tmp\tmp_ahk_tts_clip.txt > c:\tmp\tmp_ahk_clip_tts_out.txt" ,,Hide
FileRead, Clipboard, c:\tmp\tmp_ahk_clip_tts_out.txt
ComObjCreate("SAPI.SpVoice").Speak(Clipboard)
Clipboard := ClipSaved
ClipSaved = ; Free the memory
return
and F:\bin\tools\speakit.rb is something like this:
#!/usr/bin/env ruby
substitutions = [
[/[A-Z][A-Z][A-Z][A-Z]+((?=[^A-Za-z])|(?!.))/, lambda{|x|x.downcase}], #All caps becomes word
[/\.exe(?=[^a-z])/i, " executable "],
[/\.txt(?=[^a-z])/i, " text file "],
[/rebranded/, "re-branded"],
[/App(?=[\s\.])/, " application "],
['GUI' , " gooee "],
[/localhost/, "local host"],
[/(?<word>[A-Z][a-z]*)(?=[A-Z ,\.;:\t\/])/, "'\\k<word>' "], # CamelCaseWords should be split by spaces
['\\', '<sub alias="slash">\\</sub>'],
]
require 'cgi'
puts <<-eos
<?xml version="1.0"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-UK">
<voice xml:lang="en-UK">
#{substitutions.reduce(CGI::escapeHTML(ARGF.read)){|o, (r,s)| s.is_a?(Proc) ? o.gsub(r, &s) : o.gsub(r,s) }}
</voice>
</speak>
eos
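The reduce over the substitution table is fairly dense, so here is a smaller, self-contained sketch of the same idea. The rule list and the names RULES and apply_rules are illustrative only; each rule is a [pattern, replacement] pair, and a Proc replacement is passed to gsub as a block while a String replacement is passed directly.

```ruby
# Illustrative rules only (a cut-down table, not the full one from the script above).
RULES = [
  [/\b[A-Z]{4,}\b/, ->(m) { m.downcase }], # Proc: replacement computed from the match
  [/localhost/, 'local host'],             # Regexp pattern with a fixed String replacement
  ['GUI', ' gooee ']                       # String pattern with a fixed String replacement
].freeze

# Apply every rule in order, feeding each rule the output of the previous one.
def apply_rules(text, rules = RULES)
  rules.reduce(text) do |acc, (pattern, replacement)|
    if replacement.is_a?(Proc)
      acc.gsub(pattern, &replacement)
    else
      acc.gsub(pattern, replacement)
    end
  end
end

puts apply_rules('Open the GUI on localhost, PLEASE.')
```

Because the rules run in sequence, order matters: a later rule sees the text as already modified by the earlier ones.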