python-2.7 speech-to-text watson

Why do I get the same result no matter which file I pass?


Hello, I am trying to use the following SDK example:

https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/speech_to_text_v1.py

I want to get the text transcription of a WAV file, as follows:

import json
from os.path import join, dirname
from watson_developer_cloud import SpeechToTextV1


speech_to_text = SpeechToTextV1(
    username='XXXXXXXXX',
    password='XXXXXXXXX',
    x_watson_learning_opt_out=False
)
print(json.dumps(speech_to_text.models(), indent=2))
print('I am using the spanish model for this test')
print(json.dumps(speech_to_text.get_model('es-ES_NarrowbandModel'), indent=2))

with open(join(dirname(__file__), '/Users/Downloads/python-sdk-master/examples/test.wav'),
          'rb') as audio_file:
    print(json.dumps(speech_to_text.recognize(
        audio_file, content_type='audio/wav', timestamps=True,
        word_confidence=True),
        indent=2))

The problem is that every time I run the script as follows:

python speech.py 

I get the same result. No matter which file name I put in the parameters, I always get:

I am using the spanish model for this test
{
  "name": "es-ES_NarrowbandModel", 
  "language": "es-ES", 
  "sessions": "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions?model=es-ES_NarrowbandModel", 
  "url": "https://stream.watsonplatform.net/speech-to-text/api/v1/models/es-ES_NarrowbandModel", 
  "rate": 8000, 
  "supported_features": {
    "custom_language_model": false, 
    "speaker_labels": true
  }, 
  "description": "Spanish narrowband model."
}
{
  "results": [
    {
      "alternatives": [
        {
          "word_confidence": [
            [
              "yeah", 
              0.361
            ], 
            [
              "and", 
              0.867
            ], 
            [
              "on", 
              0.448
            ], 
            [
              "the", 
              0.243
            ], 
            [
              "loss", 
              0.172
            ], 
            [
              "of", 
              0.68
            ], 
            [
              "my", 
              0.953
            ], 
            [
              "honor", 
              0.131
            ], 
            [
              "and", 
              0.12
            ], 
            [
              "sometimes", 
              0.23
            ], 
            [
              "platter", 
              0.659
            ], 
            [
              "and", 
              0.339
            ], 
            [
              "also", 
              0.337
            ], 
            [
              "got", 
              0.227
            ], 
            [
              "asking", 
              0.383
            ], 
            [
              "about", 
              0.1
            ], 
            [
              "someone", 
              0.571
            ], 
            [
              "economies", 
              0.144
            ], 
            [
              "on", 
              0.146
            ], 
            [
              "both", 
              0.093
            ]
          ], 
          "confidence": 0.368, 
          "transcript": "yeah and on the loss of my honor and sometimes platter and also got asking about someone economies on both ", 
          "timestamps": [
            [
              "yeah", 
              0.18, 
              0.47
            ], 
            [
              "and", 
              0.72, 
              1.28
            ], 
            [
              "on", 
              1.28, 
              1.41
            ], 
            [
              "the", 
              1.41, 
              1.48
            ], 
            [
              "loss", 
              1.48, 
              1.78
            ], 
            [
              "of", 
              1.78, 
              1.89
            ], 
            [
              "my", 
              1.89, 
              2.04
            ], 
            [
              "honor", 
              2.04, 
              2.37
            ], 
            [
              "and", 
              2.37, 
              2.53
            ], 
            [
              "sometimes", 
              2.56, 
              3.17
            ], 
            [
              "platter", 
              3.17, 
              3.53
            ], 
            [
              "and", 
              4.04, 
              4.17
            ], 
            [
              "also", 
              4.17, 
              4.45
            ], 
            [
              "got", 
              4.45, 
              4.63
            ], 
            [
              "asking", 
              4.63, 
              4.97
            ], 
            [
              "about", 
              4.97, 
              5.18
            ], 
            [
              "someone", 
              5.18, 
              5.45
            ], 
            [
              "economies", 
              5.45, 
              5.97
            ], 
            [
              "on", 
              5.97, 
              6.12
            ], 
            [
              "both", 
              6.12, 
              6.34
            ]
          ]
        }
      ], 
      "final": true
    }, 
    {
      "alternatives": [
        {
          "word_confidence": [
            [
              "even", 
              0.547
            ], 
            [
              "in", 
              0.586
            ], 
            [
              "the", 
              0.766
            ], 
            [
              "planet", 
              0.276
            ], 
            [
              "of", 
              0.131
            ], 
            [
              "my", 
              0.188
            ], 
            [

I would appreciate any help with resolving this.


Solution

  • You shouldn't need join(dirname(__file__), ...) if you're specifying a fully qualified file path. If your audio file isn't at a path relative to the script, I'd try removing the join, although if that were the problem I'd expect a file-not-found error.
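To illustrate the point about the join, here is a minimal sketch of how os.path.join treats an absolute second argument on a POSIX system. The helper name resolve_audio_path and the script location /home/me/examples/speech.py are hypothetical, chosen only for the demonstration. The audio path is the one from the question. When the second argument is absolute, os.path.join discards the directory that came before it, which would be consistent with the script not raising a file-not-found error.

```python
import os.path


def resolve_audio_path(script_file, audio_path):
    """Return the path that open() receives after the join used in the question.

    resolve_audio_path is a hypothetical helper, not part of the Watson SDK.
    """
    return os.path.join(os.path.dirname(script_file), audio_path)


# Hypothetical script location, used only for this demonstration.
script = '/home/me/examples/speech.py'

# Absolute audio path: the script directory is discarded by os.path.join.
absolute = resolve_audio_path(
    script, '/Users/Downloads/python-sdk-master/examples/test.wav')
print(absolute)

# Relative audio path: the script directory is prepended.
relative = resolve_audio_path(script, 'test.wav')
print(relative)
```

If the audio file always lives at a fixed absolute location, passing that path straight to open() does the same thing and makes it easier to see which file is being sent.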