
Speech-to-Text: Cannot transcribe long audio files: "google.api_core.future.polling._OperationNotComplete"


I am using the Google Speech-to-Text API to transcribe an audio file that is 25 minutes long. I used the transcribe_async.py sample for this task, as it is meant for long audio files.

I am using Ubuntu 16.04 and Python 3.5.2. The code works on 1-minute audio files.

The error message is shown below. I can't identify the source of the problem.

Waiting for operation to complete...
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 177, in retry_target
    return target()
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 74, in _done_or_raise
    raise _OperationNotComplete()
google.api_core.future.polling._OperationNotComplete

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 94, in _blocking_poll
    retry_(self._done_or_raise)()
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
    on_error=on_error,
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/retry.py", line 195, in retry_target
    last_exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.RetryError: Deadline of 90.0s exceeded while calling functools.partial(<bound method PollingFuture._done_or_raise of <google.api_core.operation.Operation object at 0x7f10bdf7bef0>>), last exception:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "transcribe_async.py", line 110, in <module>
    transcribe_gcs(args.path, args.outpath)
  File "transcribe_async.py", line 85, in transcribe_gcs
    response = operation.result(timeout=90)
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 115, in result
    self._blocking_poll(timeout=timeout)
  File "/usr/local/lib/python3.5/dist-packages/google/api_core/future/polling.py", line 97, in _blocking_poll
    'Operation did not complete within the designated '
concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

Solution

  • This issue appears to occur because the transcription takes longer than the 90 seconds allowed by the timeout. Try increasing the timeout argument to a larger value, scaled to the length of the audio file, so that the service has enough time to finish the transcription.

    Code to be modified (line 81 in transcribe_async.py, which appears as line 85 in the traceback above):

    response = operation.result(timeout=90)
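
As a rough sketch of the change suggested above, the timeout can be derived from the audio length instead of being hard-coded. The helper names `estimate_timeout` and `wait_for_transcript` are hypothetical, and the scaling factor of 1.0 (one second of waiting per second of audio) is only an assumption; actual processing time is not guaranteed, so adjust it if the operation still times out.

```python
def estimate_timeout(audio_seconds, factor=1.0, floor=90):
    """Return a polling timeout in seconds, scaled to the audio length.

    factor and floor are illustrative assumptions, not documented limits.
    """
    return max(floor, int(audio_seconds * factor))


def wait_for_transcript(operation, audio_seconds):
    """Block until the long-running operation finishes or the timeout expires.

    `operation` is expected to be the object that transcribe_async.py
    calls .result() on.
    """
    return operation.result(timeout=estimate_timeout(audio_seconds))
```

For the 25-minute file in the question, `estimate_timeout(25 * 60)` gives 1500 seconds, so the call in transcribe_async.py would become `response = wait_for_transcript(operation, 25 * 60)`.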