c#  text-to-speech  speech-synthesis  microsoft-speech-api  microsoft-speech-platform

What is the difference between System.Speech.Synthesis and Microsoft.Speech.Synthesis?


I am currently developing a small program in C# implementing Text-To-Speech. However, I found out that there are two namespaces which can be used:

  • System.Speech.Synthesis
  • Microsoft.Speech.Synthesis

I googled for the differences and found this post, but it is about speech recognition, so it doesn't really answer my question. I also switched between the two namespaces and saw no difference: the program worked with all the languages in the code below.

using System;
using System.Speech.Synthesis;
//using Microsoft.Speech.Synthesis;

namespace TTS_TEST
{
class Program
{

    static void Main(string[] args)
    {
          SpeechSynthesizer synth = new SpeechSynthesizer();

          int num;
          string userChoice;

          do
          {
             Console.WriteLine("1 - " + "Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");
             Console.WriteLine("2 - " + "Microsoft Server Speech Text to Speech Voice (en-GB, Hazel)");
             Console.WriteLine("3 - " + "Microsoft Server Speech Text to Speech Voice (es-ES, Helena)");
             Console.WriteLine("4 - " + "Microsoft Server Speech Text to Speech Voice (fr-FR, Hortense)");
             Console.WriteLine("5 - " + "Exit");
             Console.Write("Enter the number of your choice: ");     //the user chooses a number
             userChoice = Console.ReadLine();

             if (!Int32.TryParse(userChoice, out num)) continue;

             Console.WriteLine("Choice = " + userChoice);

             if (userChoice == "1")    //Option 1 will use the voice en-US, ZiraPro
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");
             }

             if (userChoice == "2")   //Option 2 will use the voice en-GB, Hazel
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-GB, Hazel)");
             }

             if (userChoice == "3")   //Option 3 will use the voice es-ES, Helena
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (es-ES, Helena)");
             }

             if (userChoice == "4")   //Option 4 will use the voice fr-FR, Hortense
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (fr-FR, Hortense)");
             }

             if (userChoice == "5")   //Option 5 will exit application
             {
                Environment.Exit(0);
             }

             synth.SetOutputToDefaultAudioDevice();   //set the default audio output

             foreach (InstalledVoice voice in synth.GetInstalledVoices())   //list the installed voices details
             {
                VoiceInfo info = voice.VoiceInfo;

                Console.WriteLine(" Name:          " + info.Name);
                synth.Speak("Name: " + info.Name);
                Console.WriteLine(" Culture:       " + info.Culture);
                synth.Speak("Culture: " + info.Culture);
                Console.WriteLine(" Age:           " + info.Age);
                synth.Speak("Age: " + info.Age);
                Console.WriteLine(" Gender:        " + info.Gender);
                synth.Speak("Gender: " + info.Gender);
                Console.WriteLine(" Description:   " + info.Description);
                Console.WriteLine(" ID:            " + info.Id + "\n");
                synth.Speak("ID: " + info.Id);
             }

             Console.ReadKey();

          }
          while (true);
    }
  }
}

Could somebody explain the differences between the two of them?


Solution

  • The difference really is pretty much as outlined in the linked answer: System.Speech.Synthesis uses the desktop TTS engines, while Microsoft.Speech.Synthesis uses the server TTS engines. From a programming perspective the two are nearly identical, but they differ considerably in licensing, because the server TTS engines are separately licensed.

    However, both System.Speech.Synthesis and Microsoft.Speech.Synthesis are deprecated APIs, and new development should be based on the Windows.Media.SpeechSynthesis API.
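
    To give a rough idea of what moving to Windows.Media.SpeechSynthesis could look like, here is a minimal sketch, not a drop-in replacement for the program in the question. It assumes a console project that targets a Windows 10 framework moniker (for example net6.0-windows10.0.19041.0) so that the WinRT types are available. The class name WinRtSpeechSketch and the spoken sentence are made up for illustration. Unlike the older APIs, this synthesizer produces an audio stream rather than speaking directly, so the sketch plays the stream through a MediaPlayer.

```csharp
// Sketch only: assumes a Windows 10+ target with WinRT APIs available,
// e.g. <TargetFramework>net6.0-windows10.0.19041.0</TargetFramework>.
using System;
using System.Threading.Tasks;
using Windows.Media.Core;
using Windows.Media.Playback;
using Windows.Media.SpeechSynthesis;

namespace TTS_TEST
{
    // Hypothetical class name, used only for this illustration.
    class WinRtSpeechSketch
    {
        static async Task Main()
        {
            // List the voices this API exposes on the current machine.
            foreach (VoiceInformation voice in SpeechSynthesizer.AllVoices)
            {
                Console.WriteLine($"{voice.DisplayName} ({voice.Language}, {voice.Gender})");
            }

            using (var synth = new SpeechSynthesizer())
            using (var player = new MediaPlayer())
            {
                // Use the system default voice; any entry of AllVoices could be assigned instead.
                synth.Voice = SpeechSynthesizer.DefaultVoice;

                // Synthesis yields an in-memory audio stream instead of playing directly.
                SpeechSynthesisStream stream =
                    await synth.SynthesizeTextToStreamAsync("Hello from Windows.Media.SpeechSynthesis.");

                // Complete this task when playback ends or fails, so Main does not exit early.
                var finished = new TaskCompletionSource<bool>();
                player.MediaEnded += (sender, e) => finished.TrySetResult(true);
                player.MediaFailed += (sender, e) => finished.TrySetResult(false);

                player.Source = MediaSource.CreateFromStream(stream, stream.ContentType);
                player.Play();
                await finished.Task;
            }
        }
    }
}
```

    The voice list printed here may differ from what GetInstalledVoices() returns under the two older namespaces, since each API draws on its own set of installed engines.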