javascript html5-canvas web-audio-api codec webapi

Bulk exporting image, audio and text streams to a video in browser


Let's say I have a poster image, an audio track and a subtitle file (SRT, WebVTT or any other suitable format) on the client side, in the browser, and I would like to export them as a video.

For example, I am looking to create a video like this one on the client side (once I have the requisite image, audio and text data): https://twitter.com/yayalexisgay/status/1440779723219423238

I have the audio and image data in their respective buffers, and I have a subtitle file (obtained from, say, the Google Speech-to-Text API). How do I create a video like the one in the link above?

I understand that this can be achieved on the backend with a tool like FFmpeg, or with its WebAssembly port on the client side. However, FFmpeg is not an option. Alternatively, I could draw all the streams onto a canvas frame by frame and export the canvas stream to a video. However, that would require me to actually play through the entire audio track, which could take some time.

Is there not a way to combine the individual streams/buffers into a single video using the browser's built-in codecs?

Thanks!


Solution

  • I think that WebCodecs is what you're looking for. It's the only way to produce a video file faster than real time with built-in browser APIs.

    It's still very new and, as of this writing, only available in Chrome (since v94). Firefox intends to support it as well.
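
To make that a bit more concrete, here is a rough sketch of how the video track could be encoded without playing anything back in real time. It is only an illustration under some assumptions, not a complete solution: `frameTimestamps`, `activeCue` and `encodeVideoTrack` are names made up for this example, the cues are assumed to be already parsed into `{ start, end, text }` objects (times in seconds), `poster` is assumed to be something drawable such as an `ImageBitmap`, and the codec string, frame rate and subtitle styling are arbitrary choices.

```javascript
// Hypothetical helper: one timestamp (in microseconds) per video frame.
function frameTimestamps(durationSeconds, fps) {
  const frameCount = Math.ceil(durationSeconds * fps);
  const timestamps = [];
  for (let i = 0; i < frameCount; i++) {
    timestamps.push(Math.round((i * 1_000_000) / fps));
  }
  return timestamps;
}

// Hypothetical helper: the subtitle cue visible at a given time, or null.
// Assumes cues look like { start, end, text } with times in seconds.
function activeCue(cues, timeSeconds) {
  return cues.find((c) => timeSeconds >= c.start && timeSeconds < c.end) ?? null;
}

// Sketch: draw poster + subtitle for every frame and feed it to VideoEncoder.
// Frames are produced as fast as the encoder accepts them, not in real time.
async function encodeVideoTrack({ poster, cues, durationSeconds, fps = 30, width = 1280, height = 720 }) {
  const chunks = [];
  let encoderError = null;

  const encoder = new VideoEncoder({
    output: (chunk, metadata) => chunks.push({ chunk, metadata }),
    error: (e) => { encoderError = e; },
  });
  // "vp8" is an assumption; check VideoEncoder.isConfigSupported() first.
  encoder.configure({ codec: "vp8", width, height, framerate: fps });

  const canvas = new OffscreenCanvas(width, height);
  const ctx = canvas.getContext("2d");
  const frameDuration = Math.round(1_000_000 / fps);
  const keyFrameInterval = Math.max(1, Math.round(fps * 2)); // roughly every 2 s

  for (const [i, timestamp] of frameTimestamps(durationSeconds, fps).entries()) {
    if (encoderError) throw encoderError;

    ctx.drawImage(poster, 0, 0, width, height);
    const cue = activeCue(cues, timestamp / 1_000_000);
    if (cue) {
      ctx.font = "32px sans-serif";
      ctx.fillStyle = "white";
      ctx.textAlign = "center";
      ctx.fillText(cue.text, width / 2, height - 40);
    }

    const frame = new VideoFrame(canvas, { timestamp, duration: frameDuration });
    encoder.encode(frame, { keyFrame: i % keyFrameInterval === 0 });
    frame.close(); // release the frame's memory as soon as it is queued
  }

  await encoder.flush();
  encoder.close();
  if (encoderError) throw encoderError;
  return chunks; // encoded chunks, not yet a playable file
}
```

Keep in mind that WebCodecs only gives you encoded chunks. The audio would go through `AudioEncoder` in a similar way, and you would still need a separate muxer to write both tracks into a container such as WebM or MP4, since the API itself does not produce container files.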