flutter google-cloud-platform speech-to-text

Flutter: Can I use the Speech-to-Text API without pronunciation correction?

I am building a Flutter application that uses the Google Cloud Speech-to-Text API. From what I have seen, the API does not transcribe the exact pronunciation; it corrects the pronunciation and then converts it to text.

For example, if I pronounce 'opple', the text is automatically converted to 'apple'. I want the text as 'opple'.

Is there any way to use the Speech-to-Text API with pronunciation correction turned off?

Solution

There is no option to use the Speech-to-Text API without pronunciation correction. When it transcribes audio, the API tries to identify known words. Saying words that don't exist, such as "opple", "epple", "ipple", or "upple", will produce a similar-sounding known word such as "apple". Unless you are using a language in which one of those words exists, the API will autocorrect the pronunciation.

As a workaround, you can use the speech adaptation feature to help Speech-to-Text recognize specific words or phrases more often than other options that it might otherwise suggest.

For example, suppose that your audio data often includes the word "weather". When Speech-to-Text encounters that word, you want it transcribed as "weather" more often than as "whether". In this case, you can use speech adaptation to bias Speech-to-Text toward recognizing "weather".

To do so, pass "weather" in the phrases field of a SpeechContext object, and assign that object to the speechContexts field of the RecognitionConfig object in your request to the Speech-to-Text API. See the speech adaptation documentation for more information.

The following snippet shows the part of a JSON payload that provides the word "weather" for speech adaptation:

"config": {
    "encoding":"LINEAR16",
    "sampleRateHertz": 8000,
    "languageCode":"en-US",
    "speechContexts": [{
      "phrases": ["weather"]
    }]
}
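
As a rough sketch of how such a payload could be assembled in code, the Python function below builds a complete request body around the config shown above. The function name `build_recognize_request` and its parameters are hypothetical, chosen for illustration; the sketch assumes the request is sent as JSON over REST, where inline audio goes in `audio.content` as a base64 string. It only builds the body and does not call the API.

```python
import base64
import json


def build_recognize_request(audio_bytes, phrases,
                            language_code="en-US", sample_rate_hertz=8000):
    """Build a request body that passes `phrases` as speech adaptation hints.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
            # One SpeechContext holding every phrase we want to bias toward.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Inline audio is sent as a base64-encoded string in a JSON body.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for real 16-bit PCM audio.
    body = build_recognize_request(b"\x00\x00" * 4, ["weather"])
    print(json.dumps(body["config"], indent=2))
```

In a Flutter app the same structure would be a Dart map encoded to JSON; only the shape of the payload matters here.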

By default, speech adaptation has a relatively small effect, especially for one-word phrases. The speech adaptation boost feature lets you strengthen the recognition model's bias by assigning more weight to some phrases than to others: a higher boost value gives more importance to the specified phrases. Note that speech adaptation boost is a Beta feature. For more information, refer to the speech adaptation boost documentation.

The following snippet shows an example of a JSON payload. It includes a RecognitionConfig object that uses boost values to weight the words "fare" and "fair" differently:

"config": {
    "encoding":"LINEAR16",
    "sampleRateHertz": 8000,
    "languageCode":"en-US",
    "speechContexts": [{
      "phrases": ["fare"],
      "boost": 18
    }, {
      "phrases": ["fair"],
      "boost": 2
    }]
}
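
When there are many phrases with different weights, the speechContexts list can be generated rather than written by hand. The sketch below is one possible way to do that, assuming a plain mapping from phrase to boost value; the helper name `build_boosted_contexts` is hypothetical and not part of any Google client library. Phrases that share a boost value are grouped into a single entry.

```python
import json


def build_boosted_contexts(boosts):
    """Turn a {phrase: boost} mapping into a list of speechContexts entries.

    Hypothetical helper for illustration; phrases with the same boost value
    end up in the same entry.
    """
    grouped = {}
    for phrase, boost in boosts.items():
        grouped.setdefault(boost, []).append(phrase)
    # Emit the most strongly boosted group first, purely for readability.
    return [
        {"phrases": phrases, "boost": boost}
        for boost, phrases in sorted(grouped.items(), reverse=True)
    ]


if __name__ == "__main__":
    config = {
        "encoding": "LINEAR16",
        "sampleRateHertz": 8000,
        "languageCode": "en-US",
        "speechContexts": build_boosted_contexts({"fare": 18, "fair": 2}),
    }
    print(json.dumps({"config": config}, indent=2))
```

With the mapping `{"fare": 18, "fair": 2}` this yields the same two entries as the JSON snippet above.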