I am trying to transcribe an audio file that is about 3 minutes long using SpeechRecognition. However, it seems to be unable to transcribe anything longer than 20 seconds. This is the code that I'm using:
import speech_recognition as sr
from mutagen.flac import FLAC

r = sr.Recognizer()
audio = FLAC(output_name + '.' + output_format)
audio_length = audio.info.length  # length in seconds, read with mutagen
file = sr.AudioFile(output_name + '.' + output_format)
with file as source:
    audio = r.record(source, duration=20)  # reads only the first 20 seconds
google = r.recognize_google(audio, language='ru-RU')
print(google)
How can I loop this so that it transcribes 0s - 20s, then 20s - 40s, and so on until the audio file ends?
I would like to avoid splitting the file into separate 20-second files if at all possible.
So I figured it out. My bad for not reading the documentation of the SpeechRecognition module carefully enough: r.record() has an offset parameter!
import math
import speech_recognition as sr
from mutagen.flac import FLAC  # n.b. mutagen is used only to get the audio length

r = sr.Recognizer()
chunk_length = 20  # seconds per recognition request

for audio_name in audio_list:
    path = audio_name + '.' + output_format
    audio_length = FLAC(path).info.length  # length of the audio file in seconds
    # round up so the last partial chunk is transcribed too
    number_of_iterations = max(1, math.ceil(audio_length / chunk_length))
    file = sr.AudioFile(path)
    transcript = []
    for i in range(number_of_iterations):
        with file as source:
            audio = r.record(source, offset=i * chunk_length, duration=chunk_length)
        try:
            transcript.append(r.recognize_google(audio, language='ru-RU'))
        except sr.UnknownValueError:
            pass  # nothing intelligible in this chunk, skip it
    print(' '.join(transcript))
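
The chunk arithmetic is the part that is easiest to get wrong: rounding the number of chunks down would silently drop the last partial chunk of the file. Here is a minimal sketch of just that step, using only the standard library. The helper name chunk_windows and its parameters are my own (hypothetical), not part of SpeechRecognition or mutagen.

```python
import math


def chunk_windows(total_seconds, chunk_seconds=20):
    """Return (offset, duration) pairs that together cover the whole file.

    Hypothetical helper for illustration; total_seconds would come from
    something like mutagen's audio.info.length.
    """
    if total_seconds <= 0 or chunk_seconds <= 0:
        return []
    # Round up so a trailing partial chunk still gets its own window.
    count = math.ceil(total_seconds / chunk_seconds)
    windows = []
    for i in range(count):
        offset = i * chunk_seconds
        # The last window is shortened to whatever audio is left.
        duration = min(chunk_seconds, total_seconds - offset)
        windows.append((offset, duration))
    return windows


if __name__ == "__main__":
    # Example: a file of a bit over 3 minutes.
    for offset, duration in chunk_windows(185.0):
        print(offset, duration)
```

Each pair can then be passed on as r.record(source, offset=offset, duration=duration).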