python, fastapi, openai-api, openai-whisper

OpenAI Whisper API (InvalidRequestError)


I'm trying to use the OpenAI Whisper API to transcribe my audio files. When I open a local audio file from disk and pass it to the API, it works perfectly. Now I'm developing a FastAPI endpoint that receives an audio file from the client and transcribes it.

However, when I pass the file received by the FastAPI endpoint directly to the API, the API rejects it, claiming the file is in an invalid format.

As a test, I wrote the received file to disk from inside the endpoint, then opened that file from disk and passed it to the Whisper API. That works without any issues. The code below shows both attempts.

import openai
from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/audio")
async def summarise_audio(file: UploadFile):
    audio = await file.read()

    # Write the upload to disk, then reopen it from disk
    with open("testy.wav", "wb") as f:
        f.write(audio)
    with open("testy.wav", "rb") as x:
        transcript = openai.Audio.transcribe("whisper-1", x)  # worked
    # transcript = openai.Audio.transcribe("whisper-1", file.file)  # did not work
    return transcript

How can I solve this problem? Could there be an issue with the format of the file received by the FastAPI endpoint?


Solution

  • After spending about 12 hours on this problem, I found a workaround that gets the OpenAI Whisper API to accept the file.

    Granted, I am not well versed in file reading and binary content, so if anyone has a better solution, I would love to see it.

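    import io

    @app.post("/audio")
    async def summarise_audio(file: UploadFile):
        audio = await file.read()

        # Wrap the raw bytes in an in-memory, file-like object
        buffer = io.BytesIO(audio)
        # Give the buffer a file name; without one the API rejected it
        buffer.name = "testy.wav"
        transcript = openai.Audio.transcribe("whisper-1", buffer)  # worked

        return transcript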
    

    I have to read the file contents and wrap them in a file-like buffer using io.BytesIO. Passing that buffer to the OpenAI Whisper API as-is does not work, because the buffer has no file name. So we have to set a name on the buffer before passing it to the API.
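
The naming step can be pulled into a small helper. This is only a sketch: `to_named_buffer` and the `upload` base name are made up for illustration, and it assumes the API mainly cares about the extension in the buffer's name, which the answer above suggests but does not confirm. Unlike the hard-coded 'testy.wav', it reuses the extension of the client's file name, so an uploaded .mp3 is not labelled as .wav.

```python
import io
from pathlib import Path


def to_named_buffer(data: bytes, filename: str, default_ext: str = ".wav") -> io.BytesIO:
    """Wrap raw bytes in a BytesIO whose .name carries a file extension.

    Hypothetical helper, not part of FastAPI or the OpenAI library.
    """
    # Take the extension from the client's file name; fall back to a
    # default when the name is missing or has no extension.
    ext = Path(filename or "").suffix or default_ext
    buffer = io.BytesIO(data)
    # BytesIO has no name by default, so one is attached by hand.
    buffer.name = f"upload{ext}"
    return buffer
```

Inside the endpoint this would be used as `buffer = to_named_buffer(await file.read(), file.filename)`, and `buffer` is then passed to `openai.Audio.transcribe` as before. One more thing that may be worth checking: the commented-out `file.file` attempt runs after `await file.read()`, which leaves the underlying stream positioned at its end, so that may have contributed to the failure alongside the missing name.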