I have a working video transcription pipeline that uses a local OpenAI Whisper model. I would like to switch to the equivalent distilled model ("distil-small.en"), which is smaller and faster.
import whisper

def transcribe(self):
    file = "/path/to/video"
    model = whisper.load_model("small.en")  # WORKS
    model = whisper.load_model("distil-small.en")  # DOES NOT WORK
    transcript = model.transcribe(word_timestamps=True, audio=file)
    print(transcript["text"])
However, I get an error that the model was not found:
RuntimeError: Model distil-small.en not found; available models = ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large-v2', 'large-v3', 'large']
I installed my dependencies with Poetry (which uses pip under the hood) as follows:
[tool.poetry.dependencies]
python = "^3.11"
openai-whisper = "*"
transformers = "*" # distilled whisper models
accelerate = "*" # distilled whisper models
datasets = { version = "*", extras = ["audio"] } # distilled whisper models
The Distil-Whisper documentation on GitHub appears to use a different approach to installing and using these models.
Is it possible to use a distilled model as a drop-in replacement for a regular Whisper model?
load_model with a string argument only works for OpenAI's known list of model names. If you want to use your own model, you first need to download the checkpoint from the Hugging Face Hub (or elsewhere) and pass load_model the local file path instead.
See: https://huggingface.co/distil-whisper/distil-small.en#running-whisper-in-openai-whisper
import torch
from datasets import load_dataset
from huggingface_hub import hf_hub_download
from whisper import load_model, transcribe

# Download the OpenAI-format checkpoint from the Hugging Face Hub and
# load it by file path rather than by model name.
distil_small_en = hf_hub_download(repo_id="distil-whisper/distil-small.en", filename="original-model.bin")
model = load_model(distil_small_en)

# Fetch a short validation sample and convert it to a float tensor.
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = dataset[0]["audio"]["array"]
sample = torch.from_numpy(sample).float()

pred_out = transcribe(model, audio=sample)
print(pred_out["text"])
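Adapted to your original method, the only real change is loading the model from the downloaded checkpoint path rather than by name; a minimal sketch, assuming your video path and word_timestamps settings stay as they were:

import whisper
from huggingface_hub import hf_hub_download

def transcribe(self):
    file = "/path/to/video"
    # Downloaded once; huggingface_hub caches the file on subsequent calls.
    checkpoint = hf_hub_download(repo_id="distil-whisper/distil-small.en", filename="original-model.bin")
    model = whisper.load_model(checkpoint)  # path instead of a model name
    transcript = model.transcribe(word_timestamps=True, audio=file)
    print(transcript["text"])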
You can also see in the source of OpenAI's load_model that a string argument is only checked against the known model names (the same list in the error you showed) or treated as a path to a local checkpoint file.
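The relevant branch looks roughly like this (paraphrased from whisper/__init__.py, not the verbatim source):

# Paraphrased from whisper.load_model; names may differ slightly from the actual source.
if name in _MODELS:           # one of the known model names listed in the error
    checkpoint_file = _download(_MODELS[name], download_root, in_memory)
elif os.path.isfile(name):    # a local checkpoint path, as in the example above
    checkpoint_file = open(name, "rb").read() if in_memory else name
else:
    raise RuntimeError(f"Model {name} not found; available models = {available_models()}")

This is why passing the downloaded file path works while the name "distil-small.en" does not.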