I’m trying to set up Whisper speech-to-text, but I'm having some trouble. After running the script I get a traceback that does not really give me a clue. It ends with:
FileNotFoundError: [WinError 2] The system cannot find the file specified
I tried a number of path combinations, checked that the file exists, and installed ffmpeg, but nothing works. I don't have a lot of experience with Python, and this seems to be a common problem online, but I have not found a solution so far.
Script:
import os
import whisper

# Several spellings of the same path that I tried
file_path = os.path.normcase(r'jfk.wav')
file_path2 = 'C:/Users/me/Downloads/jfk.wav'
file_path3 = 'C:\\Users\\me\\Downloads\\jfk.wav'
# Commented out: without an r prefix, '\U' starts a Unicode escape and is a SyntaxError
#file_path4 = 'C:\Users\me\Downloads\jfk.wav'
file_path5 = 'jfk.wav'

print("Hello Whisper")

def speech_to_text(audio_file):
    if os.path.isfile(audio_file):
        print('File exists')
        model = whisper.load_model("base")
        result = model.transcribe(audio_file, verbose=True)
        print(result['text'])
    else:
        print('File does not exist')

speech_to_text(file_path5)
This is the full traceback:
C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
Hello Whisper
File exists
C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
  File "c:\Users\me\Local\Cookbook\CookbookCpp\EmbedPython\python\Whisper.py", line 21, in <module>
    speech_to_text(file_path5)
  File "c:\Users\me\Local\Cookbook\CookbookCpp\EmbedPython\python\Whisper.py", line 16, in speech_to_text
    result = model.transcribe(audio_file, verbose=True)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\transcribe.py", line 121, in transcribe
    mel = log_mel_spectrogram(audio, padding=N_SAMPLES)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\audio.py", line 130, in log_mel_spectrogram
    audio = load_audio(audio)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\audio.py", line 46, in load_audio
    ffmpeg.input(file, threads=0)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\ffmpeg\_run.py", line 313, in run
    process = run_async(
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\ffmpeg\_run.py", line 284, in run_async
    return subprocess.Popen(
  File "C:\Program Files\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
I had this problem, where the audio file was not found, only in my IDE (Visual Studio Code).
While I was writing this post the error stopped occurring. I can no longer reproduce it, and I do not know why it works now.
In my IDE only, I also got this error:
PS G:\Dropbox\Pyhton> & C:/Python311/python.exe g:/Dropbox/Pyhton/AI/openaiwhisper.py
...
Traceback (most recent call last):
File "C:\Python311\Lib\site-packages\whisper\audio.py", line 48, in load_audio
.run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
In VS Code I clicked "Run Python File in Dedicated Terminal"; maybe that is why it then also worked in my IDE?
I also tried the command-line tool in a plain cmd.exe to check whether it works there. It does: the audio file is found and the error does not occur:
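A fresh terminal can plausibly make a difference because each terminal hands its own interpreter, working directory, and PATH to Python, and a terminal opened before ffmpeg was installed usually still carries the old PATH. The following is a minimal diagnostic sketch, not something from my original test; the helper name `runtime_snapshot` is hypothetical. Running it in both the IDE terminal and cmd.exe and comparing the output should show what differs:

```python
import os
import shutil
import sys


def runtime_snapshot(tool="ffmpeg"):
    """Collect the settings that commonly differ between terminals."""
    return {
        "interpreter": sys.executable,    # which Python executable is running
        "cwd": os.getcwd(),               # relative paths resolve against this
        "tool_path": shutil.which(tool),  # None if the tool is not on PATH
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }


if __name__ == "__main__":
    for key, value in runtime_snapshot().items():
        print(f"{key}: {value}")
```

If `tool_path` is None in one terminal but not in the other, the two terminals see different PATH values.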
whisper C:/text.mp3 --model tiny
It also does not occur when I start my script directly with python:
python openaiwhisper.py
openaiwhisper.py:
import whisper
import os

model = whisper.load_model("tiny")
audiofile = "C:/text.mp3"

# Only transcribe if the audio file actually exists
fileexists = os.path.isfile(audiofile)
if fileexists:
    result = model.transcribe(audiofile)
    print(result["text"])
Maybe the problem is caused by using a relative path for the audio file on Windows, or by not restarting the terminal.
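Both guesses can be checked before Whisper is called. Note that the traceback in the question prints "File exists" and then fails inside subprocess.Popen / CreateProcess, so the file Windows could not find may well have been the ffmpeg executable rather than the audio file. A rough sketch under that assumption follows; the function name `preflight` is hypothetical and not part of Whisper. It resolves the audio path to an absolute one and fails early with a clearer message if either the file or ffmpeg is missing:

```python
import os
import shutil


def preflight(audio_file, tool="ffmpeg"):
    """Return an absolute audio path, or raise if a prerequisite is missing."""
    # An absolute path no longer depends on the terminal's working directory.
    audio_path = os.path.abspath(audio_file)
    if not os.path.isfile(audio_path):
        raise FileNotFoundError(f"audio file not found: {audio_path}")
    # shutil.which searches PATH the way the shell would when launching the tool.
    if shutil.which(tool) is None:
        raise RuntimeError(f"{tool} executable not found on PATH")
    return audio_path
```

The returned path can then be passed to model.transcribe(...) in place of the original string. If the RuntimeError is raised, restarting the terminal (or the IDE) after installing ffmpeg is the first thing I would try.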