I'm trying to transcribe a German podcast, which I have both on my PC and in my Google Storage bucket. I'm using this tutorial as a reference.
Here's my code:
frame_rate, channels = frame_rate_channel('pod.wav')
gcs_uri = 'gs://callsaudiofiles21/pod.wav'
client = speech.SpeechClient()
audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=frame_rate,
    language_code='de-DE')
transcript = ''
operation = client.long_running_recognize(config, audio)
response = operation.result(timeout=10000)
for result in response.results:
    transcript += result.alternatives[0].transcript
But it stops at the operation line, outputting TypeError: long_running_recognize() takes from 1 to 2 positional arguments but 3 were given. The tutorial is from a year ago, so something must have changed in the API since. I'm not sure what to modify though.
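Looks like your code follows the calling convention of an older library version. As the error message says, the method no longer accepts config and audio as positional arguments.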
According to Google's async recognition example, these two options seem to be equivalent:
operation = client.long_running_recognize(
    request={"config": config, "audio": audio}
)
or
operation = client.long_running_recognize(config=config, audio=audio)
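To see why the positional call fails while both forms above work, here is a minimal sketch that needs neither the Google library nor credentials. FakeSpeechClient is a hypothetical stand-in, and its signature is only an assumption modeled on the error message (one optional positional request, everything else keyword-only), not copied from the real client.

```python
class FakeSpeechClient:
    """Hypothetical stand-in for the real client; it makes no network calls."""

    def long_running_recognize(self, request=None, *, config=None, audio=None):
        # Everything after the bare * is keyword-only, so only `self` and
        # `request` can be passed positionally (1 to 2 positional arguments).
        if request is not None:
            config, audio = request["config"], request["audio"]
        return config, audio


client = FakeSpeechClient()
# Plain dicts stand in for RecognitionConfig and RecognitionAudio here.
config = {"language_code": "de-DE"}
audio = {"uri": "gs://callsaudiofiles21/pod.wav"}

# Old positional style: raises a TypeError like the one in the question.
try:
    client.long_running_recognize(config, audio)
except TypeError as exc:
    positional_error = str(exc)
    print(positional_error)

# Both newer styles are accepted and carry the same values.
via_request = client.long_running_recognize(
    request={"config": config, "audio": audio}
)
via_keywords = client.long_running_recognize(config=config, audio=audio)
print(via_request == via_keywords)
```

In the positional call, config ends up in the request slot and audio has nowhere to go, which is what Python reports as 3 positional arguments given (self, config, audio).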
BTW, also take a look at the official Google Codelab for Speech-to-Text; they always have up-to-date examples.