
iOS text-to-speech: how to get the correct pronunciation of abbreviations and/or Roman numerals for the "it-IT" language?


I'm trying to implement a TTS feature (in the "it-IT" language) in my iOS app, which is easy in itself. However, I have some problems with the input text that AVSpeechSynthesizer has to read.

To be more precise, the text contains abbreviations such as "a.C." and "d.C." and Roman numerals such as XI, XV and so on, which are pronounced incorrectly.

a.C. stands for "avanti Cristo" (before Christ), but is read as "a c". The same happens with "XI" which should be "Undicesimo" (Eleventh) but is read as "XI".

For reference, this is my sample code:

AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:@"130 a.C."];
AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"it-IT"];

AVSpeechSynthesizer *synth = [[AVSpeechSynthesizer alloc] init];
synth.delegate = self;
utterance.voice = voice;
utterance.rate = 0.20;


[synth speakUtterance:utterance];

Can anyone help me? Thanks.


Solution

  • Text to speech is never perfect.

    I would create an NSString category

    - (NSString *)toItalian;
    

    which takes a string, assumes it is Italian text, and converts it to text that AVSpeechUtterance will speak the way you want.

    In that method, make a mutable copy of self, then replace "a.C." with "avanti Cristo", for example, "XI" with "undicesimo", and so on, until you are happy with the results. It will never be perfect.

    As an alternative, you'd have to edit the source text. If this is, say, a guide for a museum with a limited amount of text that doesn't change too often, editing the original text wouldn't be too much work.

    (As an anecdote: the Mac text-to-speech conversion recognises "Elizabeth II" and changes "II" to "the second". I think that's the only case where it does that.)
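
The category approach described in the answer could look roughly like the sketch below. Only the method name toItalian comes from the answer; the category name ItalianSpeech, the rule table, and the spoken forms chosen for "d.C." and "XV" are assumptions for illustration. The sketch uses the regular-expression search option of NSMutableString so that word boundaries keep "XI" from matching inside "XIV" or "TAXI".

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical category: only the method name toItalian is taken from the
// answer; the category name and the rule table are illustrative assumptions.
@interface NSString (ItalianSpeech)
- (NSString *)toItalian;
@end

@implementation NSString (ItalianSpeech)

- (NSString *)toItalian {
    // Each rule is (regular expression, spoken form).
    // \b is a word boundary, so "XI" is not replaced inside "XIV" or "TAXI".
    // Matching is case-sensitive, so "a.C." does not match "A.C.".
    NSArray<NSArray<NSString *> *> *rules = @[
        @[@"\\ba\\.C\\.", @"avanti Cristo"],
        @[@"\\bd\\.C\\.", @"dopo Cristo"],
        @[@"\\bXI\\b",    @"undicesimo"],
        @[@"\\bXV\\b",    @"quindicesimo"],
    ];

    // Work on a mutable copy of self, as the answer suggests.
    NSMutableString *result = [self mutableCopy];
    for (NSArray<NSString *> *rule in rules) {
        // The range is recomputed for every rule because earlier
        // replacements change the length of the string.
        [result replaceOccurrencesOfString:rule[0]
                                withString:rule[1]
                                   options:NSRegularExpressionSearch
                                     range:NSMakeRange(0, result.length)];
    }
    return [result copy];
}

@end
```

With this in place, the utterance in the question would be created from [@"130 a.C." toItalian] instead of the raw string. The table will need tuning by ear: Italian ordinals agree in gender ("undicesimo" versus "undicesima"), so a fixed replacement for "XI" will be wrong in some sentences, which is the "never perfect" caveat from the answer.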