IBM Watson Speech to Text Only Returning First Word With Java SDK


I'm using the IBM Watson Speech to Text Java SDK. When I upload a .wav file, the response JSON contains only the first transcribed word. When I upload the same file to the web demo, I get the full transcription.

My implementation with the SDK is very simple:

SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("<username>", "<password>");

File audio = new File("src/test/resources/sample1.wav");

SpeechResults transcript = service.recognize(audio, HttpMediaType.AUDIO_WAV);
System.out.println(transcript);

Solution

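  • The recognize() overload you are calling returns after the first pause in the audio, so you only get the first utterance. To get results for the whole file, pass a RecognizeOptions object with continuous set to true: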

    // continuous(true) keeps transcribing past pauses instead of
    // returning after the first one.
    RecognizeOptions options = new RecognizeOptions()
              .continuous(true)
              .contentType(HttpMediaType.AUDIO_WAV)
              .interimResults(false)
              .inactivityTimeout(10)   // seconds of silence before the request is closed
              .maxAlternatives(1)
              .wordConfidence(false)
              .timestamps(true)
              .model("en-US_BroadbandModel");
    SpeechResults transcript = service.recognize(audio, options);
    

    This works for me using the following Maven dependency:

    <dependency>
      <groupId>com.ibm.watson.developer_cloud</groupId>
      <artifactId>java-sdk</artifactId>
      <version>2.8.0</version>
    </dependency>
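
With continuous set to true, the response holds one result per utterance instead of a single result, so the full text has to be stitched together from the pieces. The sketch below only illustrates that stitching step. It does not use the SDK's types: the nested lists are a hypothetical stand-in for the results in the response (one inner list of alternatives per utterance, best first), and the class name, method name, and sample strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TranscriptJoiner {

    // Hypothetical stand-in for the service response: each inner list holds
    // the alternative transcripts for one utterance, best guess first.
    // Takes the first alternative of every utterance and joins them with spaces.
    static String joinBestAlternatives(List<List<String>> results) {
        List<String> parts = new ArrayList<>();
        for (List<String> alternatives : results) {
            if (alternatives == null || alternatives.isEmpty()) {
                continue; // skip an utterance that produced no transcript
            }
            String best = alternatives.get(0).trim();
            if (!best.isEmpty()) {
                parts.add(best);
            }
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        // Made-up sample data: three utterances separated by pauses.
        List<List<String>> results = Arrays.asList(
            Arrays.asList("thunderstorms could produce "),
            Arrays.asList("large hail ", "large hale "),
            Arrays.asList("and isolated tornadoes "));
        System.out.println(joinBestAlternatives(results));
    }
}
```

In real code the same loop would run over the results inside the SpeechResults object returned by recognize() rather than over plain lists.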