c# winforms accessibility

Speech Synthesizer "SpeakAsyncCancelAll" runs in user interface thread


I have a Windows Forms application that I am trying to make accessible, and I have run into an issue with the speech synthesizer: SpeakAsyncCancelAll appears to run on the user interface thread, so performance depends heavily on the power of the PC. This can be reproduced with a very simple Windows Forms application. Create a form, add a NumericUpDown control, and use this code:

using System;
using System.Windows.Forms;
using System.Speech;
using System.Speech.Synthesis;

namespace WindowsFormsApp8
{
    public partial class Form1 : Form
    {
        SpeechSynthesizer _speech = new SpeechSynthesizer();
        public Form1()
        {
            InitializeComponent();
        }


        private void numericUpDown1_ValueChanged(object sender, EventArgs e)
        {
            _speech.SpeakAsyncCancelAll();
            _speech.SpeakAsync(numericUpDown1.Value.ToString());
        }
    }
}

On my development machine, which is very powerful, this runs quickly and without problems when you hold down the up arrow. Each announcement is cancelled, so you hear nothing while the control increments, and when you release the up arrow it announces the last value properly. However, as soon as this runs on a lesser PC, even a Core i9 hexa-core machine, the key repeat on the increment slows to a crawl. It looks to me as if this is running on the user interface thread. Any suggestions? Thanks


Solution

  • Don't be misled by the "Async" in the name of the SpeakAsyncCancelAll() method. As the source code of the SpeechSynthesizer and VoiceSynthesis classes shows, a fair amount of synchronous code runs on the calling thread in order to communicate with the background thread that does the actual voice synthesis. This code is quite heavy, in that it uses multiple lock statements.

    A common practice for this situation (several successive user interactions would each trigger a reaction, but in the end only the last one matters) is not to start the reaction directly, but to start a timer and perform the reaction only if no further user interaction occurred in the meantime.

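    public partial class Form1 : Form
    {
        private SpeechSynthesizer _speech = new SpeechSynthesizer();
    
        public Form1()
        {
            InitializeComponent();
            // timer1 is assumed to be a System.Windows.Forms.Timer added in the
            // designer, with its Tick event wired to timer1_Tick.
            timer1.Interval = 500; // quiet period in milliseconds
        }
    
        private void numericUpDown1_ValueChanged(object sender, EventArgs e)
        {
            // Restart the countdown on every change, so that only the value
            // the control finally settles on gets spoken.
            timer1.Stop();
            timer1.Start();
        }
    
        private void timer1_Tick(object sender, EventArgs e)
        {
            // Fire only once per quiet period.
            timer1.Stop();
    
            // Cancel anything still queued, then announce the current value.
            _speech.SpeakAsyncCancelAll();
            _speech.SpeakAsync(numericUpDown1.Value.ToString());
        }
    }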
    

    You should allow the user to configure the timer interval so they can choose a good compromise based on their system performance and their individual usage patterns. People who need audio assistance often, and for good reason, regard too long a delay between their action and the audio response as a waste of time. So it is important that users can adjust this delay to fit their individual needs.
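
One way to make the delay configurable is to run whatever value the user picks through a small range check before assigning it to the timer. The sketch below is only an illustration: the SpeechDelay class, its Clamp method, and the 50 to 2000 ms bounds are hypothetical names and values, not part of System.Speech or Windows Forms. Math.Max and Math.Min are used so that it also compiles on .NET Framework.

```csharp
using System;

// Hypothetical helper; not part of System.Speech or Windows Forms.
public static class SpeechDelay
{
    // Assumed bounds in milliseconds; adjust them to suit your users.
    public const int MinMs = 50;
    public const int MaxMs = 2000;

    // Limits the requested delay to the range [MinMs, MaxMs], so the
    // result is always a positive value that a timer interval can accept.
    public static int Clamp(int requestedMs)
    {
        return Math.Max(MinMs, Math.Min(MaxMs, requestedMs));
    }
}
```

In the form, the result could be applied with something like timer1.Interval = SpeechDelay.Clamp((int)delayUpDown.Value); where delayUpDown is an assumed second NumericUpDown on a settings page. A low minimum lets users with fast machines get a near-immediate announcement, while the upper bound stops an accidental entry from making the application seem silent.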