
Architecture for a web app to add overlays to users' YouTube live stream video?


I am trying to build a web app that lets users easily add text (as open captions) and other assets as real-time overlays on their YouTube live stream video.

They will use their camera to record their video, and select from my app which text should be added to the video.

Then, the video will be sent to YouTube Live through its API.

Here are my questions:

First of all, I was wondering if mixing video + subtitles and sending the result to YouTube's RTMP URL can be done entirely from the client side, so it stays simple and lightweight.

Second, should I encode the output being sent to YouTube? Can this be done in the browser too? I'm only seeing a few Node.js frameworks for this, and even those aren't very mature (or is WebCodecs meant for this purpose?). Is a web app a poor choice for this task?

Lastly, if I do need a server to process the video, where should the encoding happen: on the user's machine, on the server, or both? Given YouTube's infrastructure, is my server likely to be the bottleneck, since video streams are huge and my server's capacity is limited?

I am new to video streaming, so please excuse my lack of understanding of the subject. Also, if there are any good resources for my problem, please share them with me.


Solution

  • First of all, I was wondering if mixing video + subtitles and sending the result to YouTube's RTMP URL can be done entirely from the client side, so it stays simple and lightweight.

    You can do the video compositing and audio mixing client-side, but browsers don't speak RTMP. To get the data to an RTMP endpoint, you need to send it to a server of your own, which then relays it to the final RTMP URL.

    They will use their camera to record their video, and select from my app which text should be added to the video.

    Yeah, that's no problem at all. Draw the current video frame and your overlays to a canvas on every frame.
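A sketch of that per-frame compositing. `drawFrame` is a made-up helper; it only needs a 2D-context-like object, and the loop itself is browser-only:

```javascript
// Composite one frame: camera video first, caption burned in on top.
function drawFrame(ctx, source, caption, width, height) {
  ctx.drawImage(source, 0, 0, width, height); // current camera frame
  if (caption) {
    ctx.font = '28px sans-serif';
    ctx.fillStyle = 'white';
    ctx.textAlign = 'center';
    ctx.fillText(caption, width / 2, height - 40); // open caption
  }
}

// Browser-only: redraw on every display frame. getCaption() returns
// whatever text the user has currently selected in the app.
function startCompositing(canvas, videoEl, getCaption) {
  const ctx = canvas.getContext('2d');
  function tick() {
    drawFrame(ctx, videoEl, getCaption(), canvas.width, canvas.height);
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```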

    Second, should I encode the output being sent to Youtube?

    Yes, you must. Check out the MediaRecorder API.
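A sketch of that encoding step. `pickMimeType` is a made-up helper that takes the support predicate as an argument so it can be exercised outside a browser; the bitrate and chunk interval are arbitrary illustrative values:

```javascript
// Prefer H.264 so the server can remux without transcoding; fall
// back to VP8, which forces a server-side transcode.
function pickMimeType(isTypeSupported) {
  const preferred = [
    'video/webm;codecs=h264',
    'video/webm;codecs=vp8',
    'video/webm',
  ];
  return preferred.find(isTypeSupported) || '';
}

// Browser-only: encode the composited canvas plus microphone audio
// and ship each encoded chunk to the relay over a WebSocket.
function startEncoding(canvas, micStream, socket) {
  const stream = canvas.captureStream(30); // capture at 30 fps
  micStream.getAudioTracks().forEach((t) => stream.addTrack(t));
  const recorder = new MediaRecorder(stream, {
    mimeType: pickMimeType((t) => MediaRecorder.isTypeSupported(t)),
    videoBitsPerSecond: 2_500_000,
  });
  recorder.ondataavailable = (e) => {
    if (e.data.size > 0) socket.send(e.data);
  };
  recorder.start(250); // emit an encoded chunk every 250 ms
  return recorder;
}
```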

    Lastly, if I do need a server to process the video, where should the encoding happen (from the user's machine, or in the server, or both?)?

    The video has to be encoded client-side just to get to the server in the first place. The server can then, ideally, simply repackage the stream into an FLV container and send it along. If the browser doesn't support H.264 in its MediaRecorder implementation, you'll end up with an intermediate codec like VP8, and you'll have to transcode server-side.
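That remux-vs-transcode decision can be sketched as follows. The helper name and codec labels are illustrative, and ffmpeg is assumed to be available to whatever spawns these arguments:

```javascript
// Choose ffmpeg output arguments based on the incoming video codec.
function buildOutputArgs(videoCodec, rtmpUrl) {
  if (videoCodec === 'h264') {
    // Cheap path: repackage the existing H.264 into FLV unchanged.
    return ['-c:v', 'copy', '-c:a', 'aac', '-f', 'flv', rtmpUrl];
  }
  // Expensive path: VP8 (or anything else) must be transcoded to
  // H.264 — this is what burns CPU on your server.
  return [
    '-c:v', 'libx264', '-preset', 'veryfast', '-tune', 'zerolatency',
    '-c:a', 'aac', '-f', 'flv', rtmpUrl,
  ];
}
```

This is why the H.264 capability check in the browser matters: the copy path is nearly free, while each transcoding stream ties up a core or more on your server.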

    A few years ago, I wrote a tutorial covering all of these steps: https://github.com/fbsamples/Canvas-Streaming-Example. Note that the tutorial is in the context of Facebook, but it should teach you the concepts.