Let us say I have a poster image, an audio track and a subtitle file (SRT/WebVTT or any other suitable format) on the client side, in the browser, and I would like to export them as a single video.
For example, I am looking to create a video like this on the client side (once I have the requisite image, audio and text data): https://twitter.com/yayalexisgay/status/1440779723219423238
I have the audio and image data in their respective buffers, and I have a subtitle file (obtained from, say, the Google Speech-to-Text API). How do I create a video like the one in the link above?
I understand that this can be achieved on the backend with a tool like FFmpeg, or with its WebAssembly port on the client side. However, FFmpeg is not an option. Alternatively, I can draw all the streams onto a canvas frame by frame and export the canvas stream to a video. However, that would require actually playing through the entire audio track, which could take some time.
Is there no way to use the browser's built-in codecs to combine the individual streams/buffers into a single video?
Thanks!
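I think that WebCodecs is what you're looking for. It's the only built-in browser API that lets you encode video faster than realtime. Note that it only produces encoded chunks, so you still have to mux them into a container (such as WebM or MP4) yourself to get a video file.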
At the time of writing it's still very new and only available in Chrome (since v94). Firefox intends to support it as well.
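To make this a bit more concrete, here is a rough sketch of how the video side could look, assuming the poster has already been decoded into something drawable (for example an ImageBitmap). The names `parseCues`, `cueTextAt` and `encodePosterVideo`, the choice of VP8, the font and the text position are all my own assumptions for illustration, not part of any API. The WebCodecs classes are reached through `globalThis` so the snippet also type-checks where DOM typings are missing.

```typescript
// One subtitle cue, with times in microseconds (the unit WebCodecs uses).
interface Cue { startUs: number; endUs: number; text: string; }

// Parse "HH:MM:SS,mmm" (SRT) or "[HH:]MM:SS.mmm" (WebVTT) into microseconds.
function parseTimestampUs(ts: string): number {
  const m = /^(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})$/.exec(ts.trim());
  if (!m) throw new Error("Bad timestamp: " + ts);
  const h = Number(m[1] ?? 0), min = Number(m[2]), s = Number(m[3]);
  return ((h * 60 + min) * 60 + s) * 1e6 + Number(m[4]) * 1e3;
}

// Very small SRT/WebVTT reader: one cue per blank-line-separated block.
function parseCues(source: string): Cue[] {
  const cues: Cue[] = [];
  for (const block of source.replace(/\r/g, "").split(/\n{2,}/)) {
    const lines = block.split("\n");
    const i = lines.findIndex((l) => l.includes("-->"));
    if (i < 0) continue; // "WEBVTT" header or a block without timing
    const [start, rest] = lines[i].split("-->");
    const end = rest.trim().split(/\s+/)[0]; // drop WebVTT cue settings
    cues.push({
      startUs: parseTimestampUs(start),
      endUs: parseTimestampUs(end),
      text: lines.slice(i + 1).join("\n").trim(),
    });
  }
  return cues;
}

// Text of the cue active at time tUs, or "" if none is active.
function cueTextAt(cues: Cue[], tUs: number): string {
  const cue = cues.find((c) => tUs >= c.startUs && tUs < c.endUs);
  return cue ? cue.text : "";
}

// Hypothetical helper: draws poster + subtitle per frame and encodes it.
// No playback is involved, so it runs as fast as the encoder allows.
async function encodePosterVideo(
  poster: unknown, // assumed drawable, e.g. an ImageBitmap
  cues: Cue[], durationUs: number, fps: number,
  width: number, height: number,
  onChunk: (chunk: unknown, metadata?: unknown) => void,
): Promise<void> {
  const g = globalThis as any;
  if (!g.VideoEncoder) throw new Error("WebCodecs is not available");
  const canvas = new g.OffscreenCanvas(width, height);
  const ctx = canvas.getContext("2d");
  let failure: unknown;
  const encoder = new g.VideoEncoder({
    output: onChunk, // receives EncodedVideoChunk objects
    error: (e: unknown) => { failure = e; },
  });
  encoder.configure({ codec: "vp8", width, height, framerate: fps });
  const frameUs = 1e6 / fps;
  const frameCount = Math.ceil(durationUs / frameUs);
  for (let n = 0; n < frameCount; n++) {
    const t = Math.round(n * frameUs);
    ctx.drawImage(poster, 0, 0, width, height);
    ctx.font = "32px sans-serif"; // assumed styling
    ctx.fillStyle = "white";
    ctx.textAlign = "center";
    // fillText does not wrap on "\n", so draw each line separately.
    cueTextAt(cues, t).split("\n").forEach((line, i, all) =>
      ctx.fillText(line, width / 2, height - 40 - (all.length - 1 - i) * 40));
    const frame = new g.VideoFrame(canvas, {
      timestamp: t, duration: Math.round(frameUs),
    });
    encoder.encode(frame, { keyFrame: n % (fps * 2) === 0 });
    frame.close(); // release the frame's memory as soon as it is queued
  }
  await encoder.flush();
  encoder.close();
  if (failure) throw failure;
}
```

The audio buffer would go through AudioEncoder in much the same way. The chunks handed to `onChunk` are not a playable file on their own: WebCodecs has no muxer, so writing them into WebM or MP4 needs your own code or a separate muxing library. For long tracks you would probably also want to watch `encodeQueueSize` and pause the loop when the encoder falls behind.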