Programmatically add 100s of image overlays on video clip


I'm looking for a programmatic video editing solution that provides an API for adding image and text overlays at specific times/frames and at specific coordinates on a video (1080p) clip, as well as for resizing to 720p etc.

I tried AviSynth but got blocked after ~400 overlays in total by an "Out of Memory" error - see AviSynth Out of Memory Error (100s of image overlays)

Is there anything else I could try (sample code would be awesome)?


Solution

  • You can always go with a commercial solution, which I recommend, such as Adobe After Effects which has an API you can control using JavaScript (jsx files).

    Commercial video and compositing solutions are typically more robust and have better caching and buffering capabilities than free/open-source alternatives, which means they can potentially add more layers to the composition without running out of memory.

    (Just as a side note: I am pointing you to After Effects here even though it is composition-oriented. Adobe Premiere (or another non-linear editor) would be a more natural choice for simple image and text overlays, but it does not have a scripting interface (AFAIK; there are no resources listed for this at Adobe's site). However, it is also possible to create long sequences with AE, and you can do more with the elements you add to a scene.)

    If you are already familiar with JavaScript then it is (obviously) just a matter of reading up on the API documentation for its objects, methods, properties and so forth (I added a link to the documentation below).

    Adobe also has its own JavaScript editor (it isn't required), which can be found at these locations:

    Mac OS X:

    /Applications/Utilities/Adobe Utilities CS6/ExtendScript Toolkit CS6/
    

    Windows:

    C:\Program Files\Adobe\Adobe Utilities - CS6\ExtendScript Toolkit CS6
    

    The following example taken from this site creates a comp and then adds a text layer to it (go to site for full script):

    // create project if necessary
    
    var proj = app.project;
    if(!proj) proj = app.newProject();
    
    // create new comp named 'my text'
    
    var compW = 160; // comp width
    var compH = 120; // comp height
    var compL = 15;  // comp length (seconds)
    var compRate = 24; // comp frame rate
    var compBG = [48/255,63/255,84/255]; // comp background color (RGB, 0..1)
    
    var myItemCollection = app.project.items;
    // arguments: name, width, height, pixel aspect ratio, duration, frame rate
    var myComp = myItemCollection.addComp('my text',compW,compH,1,compL,compRate);
    
    myComp.bgColor = compBG;
    

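    and then adds one text layer per line of a text file:

    // Assumption: myFile is a text file chosen by the user; the full script
    // on the source site may obtain it differently.
    var myFile = File.openDialog("Select a text file");
    myFile.open("r");

    var text;
    while (!myFile.eof){
        text = myFile.readln();
        if (text == "") text = "\r"; // substitute a carriage return for blank lines
        myComp.layers.addText(text);
    }
    myFile.close();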
    

    You can also control Photoshop/Illustrator with JavaScript/jsx files so here you can make powerful combinations/effects etc. (which reminds me of good old AREXX :-) ).

    There are similar APIs for solutions such as Flame (and Combustion, which is no longer available after Autodesk purchased it), which use Python, but the price range here is relatively high.

    If commercial variants aren't an option then you can look into Blender, which also provides an API for Python.

    Note that Blender is primarily 3D-oriented, although it can be used for video compositing as well.

    An example taken from this page will write text to screen:

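    # Modules from the Blender Game Engine (bge) Python API
    from bge import logic, render
    import bgl
    import blf

    def write():
        """write on screen"""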
        width = render.getWindowWidth()
        height = render.getWindowHeight()
    
        # OpenGL setup
        bgl.glMatrixMode(bgl.GL_PROJECTION)
        bgl.glLoadIdentity()
        bgl.gluOrtho2D(0, width, 0, height)
        bgl.glMatrixMode(bgl.GL_MODELVIEW)
        bgl.glLoadIdentity()
    
        # BLF drawing routine
        # logic.font_id is assumed to have been stored earlier (e.g. from blf.load)
        font_id = logic.font_id
        blf.position(font_id, (width * 0.2), (height * 0.3), 0)
        blf.size(font_id, 50, 72)
        blf.draw(font_id, "Hello World")
    

    And of course you can always script programs such as FFmpeg which is in itself quite powerful and flexible.

    You can script it by adding arguments to it at command line, for example (taken from here):

    Show a text line sliding from right to left in the last row of the video frame. The file ‘LONG_LINE’ is assumed to contain a single line with no newlines.

    drawtext="fontsize=15:fontfile=FreeSerif.ttf:text=LONG_LINE:y=h-line_h:x=-50*t"
    

    You would simply put this into a batch file of some sort and run it. The limitation is of course that you need to do a bit of trial and error to get text and images to appear exactly where you want them.

    I have never tried hundreds of layers and I doubt it has built-in buffering/caching to handle a large number of layers, but it can be worth a try since it is both free and otherwise powerful.
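
    Instead of writing the batch file by hand, the command can be generated. The following is a minimal sketch, not a tested solution for hundreds of layers: it chains FFmpeg's overlay filter once per image, uses the filter's timeline option (enable='between(t,start,end)') to show each image only in its time window, and finishes with a scale to 720p. The function name build_ffmpeg_command, the dictionary keys and the file names are made up for illustration.

```python
import shlex


def build_ffmpeg_command(video, overlays, output, size=(1280, 720)):
    """Return an ffmpeg argument list that burns timed image overlays into a video.

    overlays: list of dicts with the (hypothetical) keys
    image, x, y, start, end; times are in seconds.
    """
    args = ["ffmpeg", "-y", "-i", video]
    for ov in overlays:
        args += ["-i", ov["image"]]  # each image becomes input 1, 2, ...

    chain = []
    prev = "[0:v]"  # start from the video stream of the first input
    for i, ov in enumerate(overlays, start=1):
        label = f"[v{i}]"
        chain.append(
            f"{prev}[{i}:v]overlay=x={ov['x']}:y={ov['y']}"
            f":enable='between(t,{ov['start']},{ov['end']})'{label}"
        )
        prev = label
    # Resize the composited result as the last step
    chain.append(f"{prev}scale={size[0]}:{size[1]}[out]")

    args += ["-filter_complex", ";".join(chain),
             "-map", "[out]", "-map", "0:a?", "-c:a", "copy", output]
    return args


if __name__ == "__main__":
    demo = [
        {"image": "logo.png", "x": 50, "y": 40, "start": 0, "end": 5},
        {"image": "badge.png", "x": 900, "y": 600, "start": 3, "end": 8},
    ]
    # Print the command for inspection; pass the list to subprocess.run to execute it
    print(shlex.join(build_ffmpeg_command("input.mp4", demo, "output.mp4", (1280, 720))))
```

    Building the argument list (rather than one shell string) avoids shell quoting problems with the single quotes in the filter expression. If a single pass with all images turns out to be too heavy, the same function could be called on smaller batches, feeding each output into the next run, at the cost of re-encoding each time.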

    Resources: