I'm using the IBM Watson Speech to Text Java SDK. When I upload a .wav file, the response JSON contains only the first transcribed word. When I upload the same file to the web demo, I get the full transcription.
My implementation with the SDK is very simple:
SpeechToText service = new SpeechToText();
service.setUsernameAndPassword("<username>", "<password>");
File audio = new File("src/test/resources/sample1.wav");
SpeechResults transcript = service.recognize(audio, HttpMediaType.AUDIO_WAV);
System.out.println(transcript);
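The recognize() signature you are using returns after the first pause in the audio, so you only get the first part of the transcription. To get all of the results, pass a RecognizeOptions with continuous set to true: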
RecognizeOptions options = new RecognizeOptions()
    .continuous(true) // keep transcribing past the first pause
    .contentType(HttpMediaType.AUDIO_WAV)
    .interimResults(false)
    .inactivityTimeout(10)
    .maxAlternatives(1)
    .wordConfidence(false)
    .timestamps(true)
    .model("en-US_BroadbandModel");
SpeechResults transcript = service.recognize(audio, options);
This works for me with the following Maven dependency:
<dependency>
    <groupId>com.ibm.watson.developer_cloud</groupId>
    <artifactId>java-sdk</artifactId>
    <version>2.8.0</version>
</dependency>
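With continuous recognition enabled, the response JSON should contain one entry in its results array per stretch of speech between pauses, each with its own transcript, so the full text has to be stitched together from those pieces. Below is a minimal sketch of that joining step. It deliberately uses plain strings instead of the SDK's model classes: `TranscriptJoiner` and `joinTranscripts` are hypothetical names for illustration, and it assumes you have already pulled the top transcript string out of each result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TranscriptJoiner {

    // Hypothetical helper (not part of the Watson SDK): joins the per-segment
    // transcript strings into one line of text.
    static String joinTranscripts(List<String> segments) {
        return segments.stream()
                .map(String::trim)           // segments may carry trailing spaces
                .filter(s -> !s.isEmpty())   // skip empty segments
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Made-up sample segments standing in for the transcripts of each result.
        List<String> segments = Arrays.asList(
                "several tornadoes touch down ",
                "",
                "as a line of severe thunderstorms swept through ");
        System.out.println(joinTranscripts(segments));
    }
}
```

If the joined text still stops after the first phrase, the request most likely went out without continuous set to true.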