
How to specify phonetic keywords for IBM Watson speech2text service?


While we have had good success with Bluemix Java SDK in the general case, we've bumped into problems while trying to recognize occasional non-English words (e.g., foreign last names). Our hope was that one could specify the keyword list using SPR phonetic notation (which works great for text2speech), but that does not seem to be supported for speech2text. Any suggestions/workarounds?

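SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("USERNAME", "PASSWORD");

File audio = new File("C:\\Users\\AudioFiles\\euler.wav");

// Recognition options: WAV input, plus keyword spotting for the names we care about
RecognizeOptions options = new RecognizeOptions.Builder()
  .contentType(HttpMediaType.AUDIO_WAV)
  .continuous(true)
  .inactivityTimeout(500)
  .keywords(new String[] {"Agarwal", "Euler", "Qin"})
  .keywordsThreshold(0.5)  // minimum confidence for a keyword match to be reported
  .build();

// Run the request synchronously (assuming SDK 3.x, where recognize returns a ServiceCall)
SpeechResults transcript = service.recognize(audio, options).execute();
System.out.println(transcript);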

The objective is to be able to say "My name is John Euler." and for the transcript not to return something like "My name is John Oyler." (which is what it does currently).

Thx.


Solution

  • Hmm, the three words you pass are actually in the vocabulary, but they may not be found because they carry very little weight in the language model. Have you tried relaxing the threshold (a lower keywordsThreshold)? If the task is name-focused, you can also try the Watson STT customization service to boost the probabilities of names.
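
    As a rough sketch of the customization route: the service's REST interface accepts custom words as JSON, and each entry can carry a sounds_like list of ordinary spellings that approximate the pronunciation. The helper below only builds such a request body in plain Java. The class and method names are made up for illustration, the pronunciation hints are guesses, and the actual upload (endpoint, credentials, training the custom model) is left out and should be checked against the current documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the Watson SDK): builds a JSON body
// describing custom words for a Speech to Text custom language model.
class CustomWordsPayload {

    /** One custom word: its spelling plus "sounds like" pronunciation hints. */
    static final class CustomWord {
        final String word;
        final List<String> soundsLike;

        CustomWord(String word, List<String> soundsLike) {
            this.word = word;
            this.soundsLike = soundsLike;
        }
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes the words as {"words":[{"word":...,"sounds_like":[...],"display_as":...}]}. */
    static String toJson(List<CustomWord> words) {
        StringBuilder sb = new StringBuilder("{\"words\":[");
        for (int i = 0; i < words.size(); i++) {
            CustomWord w = words.get(i);
            if (i > 0) sb.append(',');
            sb.append("{\"word\":").append(quote(w.word));
            sb.append(",\"sounds_like\":[");
            for (int j = 0; j < w.soundsLike.size(); j++) {
                if (j > 0) sb.append(',');
                sb.append(quote(w.soundsLike.get(j)));
            }
            // display_as controls how the word is written in the transcript
            sb.append("],\"display_as\":").append(quote(w.word)).append('}');
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        List<CustomWord> words = new ArrayList<>();
        // Pronunciation hints are illustrative assumptions, not verified values.
        words.add(new CustomWord("Euler", Arrays.asList("oiler")));
        words.add(new CustomWord("Qin", Arrays.asList("chin")));
        System.out.println(toJson(words));
    }
}
```

    The printed JSON would then be sent as the body of the "add custom words" request for your custom model. After the model is trained, you pass its customization ID when calling recognize, so that the added names carry more weight than they do in the base model.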