How to extract frames in sequence as PNG images from ffmpeg stream?


I'm trying to create a program that captures my screen (a game, to be precise) using ffmpeg and streams the frames to Node.js for live processing. So, if the game runs at 60 fps, I expect ffmpeg to send 60 images per second to stdout. I've written the following code for that:

    import { spawn as spawnChildProcess } from 'child_process';
    import * as fs from 'fs';

    const videoRecordingProcess = spawnChildProcess(
      ffmpegPath,
      [
        '-init_hw_device',
        'd3d11va',
        '-filter_complex',
        'ddagrab=0,hwdownload,format=bgra',
        '-c:v',
        'png',
        '-f',
        'image2pipe',
        '-loglevel',
        'error',
        '-hide_banner',
        'pipe:',
      ],
      {
        stdio: 'pipe',
      },
    );

    videoRecordingProcess.stderr.on('data', (data) => console.error(data.toString()));

    videoRecordingProcess.stdout.on('data', (data) => {
      // ffmpeg encodes PNG (-c:v png), so the files are saved with a .png extension
      fs.promises.writeFile(`/home/goodwin/genshin-repertoire-autoplay/imgs/${Date.now()}.png`, data);
    });

Currently I'm writing those images to disk for debugging, and it almost works, except that the image is cropped. Here's what happens: I get 4 images saved on disk:

  1. Valid image that is 2560x1440, but only the top 1/4 or even 1/5 of the screen is present; the rest of the image is empty (transparent)
  2. Broken image that won't open
  3. Broken image that won't open
  4. Broken image that won't open

This pattern is nearly consistent: sometimes there are 3, sometimes 4 or 5 broken images between valid ones. What did I do wrong and how do I fix it? My guess is that ffmpeg is streaming images in chunks, and each chunk represents the part of the frame that was already processed by progressive scan. I'm not entirely sure whether I should try to process it manually, though. There's gotta be a way to get fully rendered frames in one piece, sequentially.


Solution

  • I came up with my own solution to this issue. Since ffmpeg keeps streaming data continuously and stdout never emits an end event between images, nothing in the stream events tells us where one image ends and the next begins. However, we know that the output format is PNG, and according to the PNG (Portable Network Graphics) Specification, section 12.11, "PNG file signature":

    The first eight bytes of a PNG file always contain the following values:

       (decimal)              137  80  78  71  13  10  26  10
       (hexadecimal)           89  50  4e  47  0d  0a  1a  0a
       (ASCII C notation)    \211   P   N   G  \r  \n \032 \n
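
This signature also accounts for the symptoms in the question: each stdout `data` event delivers an arbitrary slice of the byte stream, so only the slice that happens to begin a frame starts with the signature. A minimal sketch of that check, where `startsWithPngSignature` is a hypothetical helper name used only for illustration:

```typescript
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

// Returns true only when the chunk begins with the 8-byte PNG signature.
function startsWithPngSignature(chunk: Buffer): boolean {
  return (
    chunk.length >= PNG_SIGNATURE.length &&
    chunk.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)
  );
}

// Stand-ins for real stdout chunks: the first chunk of a frame carries the
// signature, continuation chunks do not.
const firstChunk = Buffer.concat([PNG_SIGNATURE, Buffer.from('...image data...')]);
const laterChunk = Buffer.from('...more image data...');

console.log(startsWithPngSignature(firstChunk)); // true
console.log(startsWithPngSignature(laterChunk)); // false
```

A file written from `firstChunk` alone opens but is truncated, while one written from `laterChunk` has no header at all and cannot be opened.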
    

    Therefore we can:

    1. Create a constant with the PNG header byte sequence:
    export const PNG_DECIMAL_FILE_SIGNATURE = new Uint8Array([137, 80, 78, 71, 13, 10, 26, 10]);
    
    2. Allocate an empty buffer via Buffer.alloc(0)
    3. Keep appending incoming data to it until two such sequences are present
    4. Extract the data between those sequences, including the first header byte sequence
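
The steps above can be sketched as a small standalone helper, independent of any framework. This is only an illustration under stated assumptions: `PngStreamSplitter` and its `push` method are hypothetical names, and the sketch assumes the stream is nothing but back-to-back PNG files. Unlike a one-frame-per-chunk approach, it returns every complete image found in the buffer so far.

```typescript
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

// Hypothetical helper: accumulates chunks and returns each complete image.
class PngStreamSplitter {
  private pending: Buffer = Buffer.alloc(0);

  push(chunk: Buffer): Buffer[] {
    this.pending = Buffer.concat([this.pending, chunk]);
    const frames: Buffer[] = [];

    let start = this.pending.indexOf(PNG_SIGNATURE);
    while (start !== -1) {
      // A frame counts as complete once the next frame's signature shows up.
      const next = this.pending.indexOf(PNG_SIGNATURE, start + PNG_SIGNATURE.length);
      if (next === -1) break;
      // Copy the bytes so the frame does not keep the whole buffer alive.
      frames.push(Buffer.from(this.pending.subarray(start, next)));
      start = next;
    }

    // Keep only the unfinished tail, dropping anything before its signature.
    if (start > 0) this.pending = this.pending.subarray(start);
    return frames;
  }
}

// Usage with two fake "images" split at an awkward position.
const imageA = Buffer.concat([PNG_SIGNATURE, Buffer.from('AAAA')]);
const imageB = Buffer.concat([PNG_SIGNATURE, Buffer.from('BB')]);
const splitter = new PngStreamSplitter();
const firstResult = splitter.push(imageA.subarray(0, 5));
const secondResult = splitter.push(Buffer.concat([imageA.subarray(5), imageB]));
console.log(firstResult.length, secondResult.length); // 0 1
```

Note that the last image stays in the buffer until the following signature arrives, so each frame is delivered one frame late.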

    And it works like a charm! All the images are valid and in full.

    Reading the stream and emitting an event to process each chunk:

        const stream = spawnChildProcess(
          ffmpegPath,
          [
            '-init_hw_device',
            'd3d11va',
            '-filter_complex',
            'ddagrab=0,hwdownload,format=bgra',
            '-c:v',
            'png',
            '-f',
            'image2pipe',
            '-loglevel',
            'error',
            '-hide_banner',
            'pipe:',
          ],
          {
            stdio: 'pipe',
          },
        );
    
        stream.stdout.on('data', (data: Buffer) => {
          stream.stdout.pause();
    
          this.eventEmitter.emit('frame-chunk.capture', data, stream.stdout);
        });
    

    Processing the data:

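    import { Readable } from 'stream';
    import { OnEvent } from '@nestjs/event-emitter'; // assuming NestJS's event emitter module

    export class FrameProcessorService {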
      #currentFrame: Buffer = Buffer.alloc(0);
    
      @OnEvent('frame-chunk.capture')
      async processFrameChunk(frameChunk: Buffer, stream: Readable): Promise<void> {
        this.#currentFrame = Buffer.concat([this.#currentFrame, frameChunk]);
    
        const fullImageBufferIndices = this.extractFullImageBufferIndices(this.#currentFrame);
    
        if (fullImageBufferIndices) {
          const [startIndex, endIndex] = fullImageBufferIndices;
          // Complete PNG image; do something with this buffer
          const fullImage = this.#currentFrame.subarray(startIndex, endIndex + 1);
    
          // Rebuild the buffer without the fullImage bytes
          const before = this.#currentFrame.subarray(0, startIndex);
          const after = this.#currentFrame.subarray(endIndex + 1);
    
          this.#currentFrame = Buffer.concat([before, after]);
        }
    
        stream.resume();
      }
    
      extractFullImageBufferIndices(buffer: Buffer): readonly [number, number] | null {
        // Start of the first image in the buffer
        const start = buffer.indexOf(PNG_DECIMAL_FILE_SIGNATURE);
    
        if (start === -1) {
          return null;
        }
    
        // The next signature marks where the following image begins
        const next = buffer.indexOf(PNG_DECIMAL_FILE_SIGNATURE, start + PNG_DECIMAL_FILE_SIGNATURE.length);
    
        if (next === -1) {
          return null;
        }
    
        // A single chunk may legitimately contain more than two signatures;
        // later images stay in the buffer and are extracted on subsequent calls
        return [start, next - 1];
      }
    }
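
One limitation of splitting on signatures is that a frame is only released once the next frame starts arriving. A different technique, not part of the solution above, is to walk the PNG chunk structure: after the signature, a PNG file is a series of chunks (4-byte big-endian data length, 4-byte type, data, 4-byte CRC), and the last chunk is always `IEND`. The sketch below assumes the buffer begins at a signature and does not verify CRCs; `completePngLength` is a hypothetical name.

```typescript
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

// Returns the total byte length of the PNG at the start of `buffer`,
// or null if the image has not fully arrived yet.
function completePngLength(buffer: Buffer): number | null {
  if (buffer.length < PNG_SIGNATURE.length) return null;
  if (!buffer.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)) {
    throw new Error('Buffer does not start with a PNG signature');
  }

  let offset = PNG_SIGNATURE.length;
  // Each chunk: 4-byte length, 4-byte type, `length` data bytes, 4-byte CRC.
  while (offset + 8 <= buffer.length) {
    const dataLength = buffer.readUInt32BE(offset);
    const type = buffer.toString('latin1', offset + 4, offset + 8);
    const chunkEnd = offset + 12 + dataLength;
    if (chunkEnd > buffer.length) return null; // chunk still arriving
    if (type === 'IEND') return chunkEnd; // end of this image
    offset = chunkEnd;
  }
  return null;
}
```

When the function returns a number `n`, `buffer.subarray(0, n)` is one complete image and `buffer.subarray(n)` is the start of the next one, so the frame can be processed immediately instead of waiting for the following signature.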