pytorch openai-whisper speaker-diarization

RuntimeError: Library cublas64_12.dll is not found or cannot be loaded. While using WhisperX diarization

I am trying to use WhisperX for speaker diarization. It worked on Google Colab, but locally I get this error while trying to transcribe the audio file.

Traceback (most recent call last):
  File "D:\Programming\Python\Projects\Conversation-Analyser\Conversation Analyser\Classes\diarization.py", line 42, in <module>
    diarize()
  File "D:\Programming\Python\Projects\Conversation-Analyser\Conversation Analyser\Classes\diarization.py", line 40, in diarize
    result = model.transcribe(audio, batch_size=batch_size)
  File "D:\Programming\Python\Projects\Conversation-Analyser\.venv\lib\site-packages\whisperx\asr.py", line 194, in transcribe
    language = language or self.detect_language(audio)
  File "D:\Programming\Python\Projects\Conversation-Analyser\.venv\lib\site-packages\whisperx\asr.py", line 252, in detect_language
    encoder_output = self.model.encode(segment)
  File "D:\Programming\Python\Projects\Conversation-Analyser\.venv\lib\site-packages\whisperx\asr.py", line 86, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library cublas64_12.dll is not found or cannot be loaded

I installed PyTorch with this command:

pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

The error occurs during the transcription phase, before diarization starts. This is the code:

model = whisperx.load_model("large-v2", device, compute_type=compute_type, download_root=model_dir)
result = model.transcribe(audio, batch_size=batch_size)

Solution

I had the same problem with faster-whisper, which WhisperX uses as its transcription backend. After a lot of searching, I found that faster-whisper is a reimplementation of OpenAI's Whisper model built on CTranslate2, and the problem lies in CTranslate2: its current default version requires CUDA 12. See the faster-whisper README: https://github.com/SYSTRAN/faster-whisper

Google Colab works because its CUDA version is 12.2, which includes the cuBLAS 12 library that "cublas64_12.dll" refers to; you can check the version there with "!nvidia-smi". I use CUDA 11.8, which ships "cublas64_11.dll" instead, and that is why "cublas64_12.dll" is missing.
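To see which cuBLAS DLL is visible on your own machine, a small check like the one below may help. It is only a sketch: the helper name find_on_path is made up for this example, and it only scans the directories listed in PATH, so a DLL that is picked up from some other location would not be reported.

```python
import os
from pathlib import Path


def find_on_path(filename, search_path=None):
    """Return every directory on the search path that contains `filename`.

    `search_path` defaults to the PATH environment variable. This only
    looks at PATH entries; it does not emulate the full Windows DLL
    search order.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # DLL names taken from the error message and the explanation above.
    for dll in ("cublas64_11.dll", "cublas64_12.dll"):
        locations = find_on_path(dll)
        print(dll, "->", locations if locations else "not found on PATH")
```

If only "cublas64_11.dll" shows up, the installed CUDA toolkit is an 11.x release and a CTranslate2 build that asks for "cublas64_12.dll" will fail as in the question.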

What I did to solve this problem was to downgrade CTranslate2 to version "3.24.0" with this command:

pip install --upgrade --force-reinstall ctranslate2==3.24.0
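After the reinstall, you can confirm which CTranslate2 version the environment actually has. The sketch below rests on an assumption taken from the faster-whisper README: 4.x releases of CTranslate2 target CUDA 12, while 3.24.0 is the suggested fallback for CUDA 11. The helper names expected_cuda_major and installed_ctranslate2_version are invented for this example.

```python
from importlib import metadata


def expected_cuda_major(ct2_version):
    """Map a CTranslate2 version string to the CUDA major version it needs.

    Assumption (from the faster-whisper README): 4.x needs CUDA 12,
    3.x (for example 3.24.0) works with CUDA 11.
    """
    major = int(ct2_version.split(".")[0])
    return 12 if major >= 4 else 11


def installed_ctranslate2_version():
    """Return the installed ctranslate2 version, or None if it is absent."""
    try:
        return metadata.version("ctranslate2")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_ctranslate2_version()
    if version is None:
        print("ctranslate2 is not installed in this environment")
    else:
        cuda = expected_cuda_major(version)
        print(f"ctranslate2 {version} expects cublas64_{cuda}.dll")
```

Run it inside the same virtual environment that WhisperX uses, otherwise it reports the version of a different installation.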