I use OpenAI's Whisper Python library for speech recognition. I have some training data: either text only, or audio with the corresponding transcriptions. How can I fine-tune one of OpenAI's Whisper ASR models on my own training data?
According to https://github.com/openai/whisper/discussions/64, the released code doesn't include the training/fine-tuning part. One would therefore have to write that code oneself to train or fine-tune a Whisper model on custom data.
Also, from https://openai.com/blog/whisper/:
We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.
No training code mentioned.
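Since the training loop has to be written by hand, here is a minimal sketch of one piece of it: turning a tokenized transcription into a teacher-forcing pair (decoder input plus next-token labels). It is plain Python and not taken from the Whisper repository. The function name `make_training_pair` and the token ids in the usage lines are hypothetical placeholders; in a real setup the prefix and end-of-text ids would come from Whisper's tokenizer. The value -100 is assumed because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
from typing import List, Tuple


def make_training_pair(
    prefix_ids: List[int],
    text_ids: List[int],
    eot_id: int,
    ignore_index: int = -100,
) -> Tuple[List[int], List[int]]:
    """Build (decoder_input, labels) for one transcription.

    prefix_ids: special tokens that start every sequence (hypothetical ids here)
    text_ids:   token ids of the reference transcription
    eot_id:     end-of-text token id
    """
    full = prefix_ids + text_ids + [eot_id]

    # Teacher forcing: the decoder sees tokens 0..n-2 and predicts tokens 1..n-1.
    decoder_input = full[:-1]
    labels = full[1:]

    # Do not penalize the model for "predicting" the fixed prefix tokens.
    n_masked = len(prefix_ids) - 1
    labels = [ignore_index] * n_masked + labels[n_masked:]
    return decoder_input, labels


# Usage with made-up ids: 4 prefix tokens, 2 text tokens, end-of-text id 0.
decoder_input, labels = make_training_pair([1, 2, 3, 4], [10, 11], eot_id=0)
print(decoder_input)  # [1, 2, 3, 4, 10, 11]
print(labels)         # [-100, -100, -100, 10, 11, 0]
```

In a full loop, the decoder input would be fed to the model together with the audio's log-mel spectrogram, and the resulting logits compared against the labels with a cross-entropy loss; that part depends on the model code and is left out here.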
William Castrillon and nizata pointed to the following fine-tuning code written by third-party developers: