Tags: ffmpeg, libav, libswresample

Proper use of swr_convert_frame() for audio resampling with ffmpeg


Are there any examples of using swr_convert_frame() to resample audio instead of swr_convert()? My code currently looks like this (using cgo):

    // Resample the decoded input frame into outframe.
    if averr := C.swr_convert_frame(swrctx, outframe, inframe); averr < 0 {
        return av_err("swr_convert_frame", averr)
    }

    encodeFrame(outframe)

    // If the resampler still holds buffered samples, drain them with a nil input.
    if delay := C.swr_get_delay(swrctx, C.int64_t(outframe.sample_rate)); delay > 0 {
        if averr := C.swr_convert_frame(swrctx, outframe, nil); averr < 0 {
            return av_err("swr_convert_frame", averr)
        }

        encodeFrame(outframe)
    }

However, the output frame has more samples than the frame_size the encoder is configured with for libopus. If I shrink the maximum nb_samples on the AVFrame, the encoder accepts it, but I then have to set the pts manually, which results in multiple frames with the same pts, even when following this outline, for example.

I tried setting it with out_pts = swr_next_pts(ctx, in_pts), but this doesn't seem to calculate the pts correctly, and libopus produces some incorrect dts values.

Is there an example of using swr_convert_frame that correctly sets the pts for the encoder? Based on the API provided, it seems like it would also produce incomplete frames.


Solution

  • If anyone stumbles upon this question: I finally figured it out. You need to run your own FIFO; otherwise swr_convert_frame won't produce full frames of the size the encoder expects. I am doing this below in cgo with a bytes.Buffer, but libavutil provides its own audio FIFO (AVAudioFifo).

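            // Delay in number of output samples: what the resampler still holds
            // plus what is already queued in the FIFO.
            delay := int(C.swr_get_delay(c.swrctx, 48000)) + c.resampleFIFO.Len()/int(bytesPerOutputSample)
            // Resample the decoded frame into a scratch buffer of up to 8192 samples.
            n := C.swr_convert(c.swrctx, (**C.uint8_t)(unsafe.Pointer(&c.resampleBuf)), 8192, &c.avframe.data[0], c.avframe.nb_samples)
            if n < 0 {
                return av_err("swr_convert", n)
            }
            // Byte size of n samples of packed stereo S16.
            size := C.av_samples_get_buffer_size(nil, 2, n, C.AV_SAMPLE_FMT_S16, 1)
            if size < 0 {
                return av_err("av_samples_get_buffer_size", size)
            }
            // Append the resampled bytes to the FIFO.
            if n, err := c.resampleFIFO.Write((*[1 << 30]byte)(unsafe.Pointer(c.resampleBuf))[:size]); err != nil {
                return fmt.Errorf("failed to write to resample buffer: %w", err)
            } else if n != int(size) {
                return fmt.Errorf("failed to write to resample buffer: wrote %d bytes, expected %d", n, size)
            }

            // 960 samples is one 20 ms frame at 48 kHz.
            samples := 960
            // Pull out full frames only; any remainder stays queued for the next call.
            for c.resampleFIFO.Len() >= samples*int(bytesPerOutputSample) {
                // Next returns the bytes of one full frame; this snippet does not
                // show what is done with them (e.g. handing them to the encoder).
                c.resampleFIFO.Next(samples * int(bytesPerOutputSample))
            }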
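
The snippet above does not show how the pts is assigned once full frames come out of the FIFO. One common approach, sketched here in plain Go without any FFmpeg calls, is to keep a running count of emitted samples and use it as the pts. This is only an illustration, not the code from the answer: `Frame` and `Chunker` are hypothetical names, the format (48 kHz, stereo, packed S16, 960-sample frames) is taken from the snippet above, and it assumes the encoder time base is 1/48000 and that the stream starts at pts 0 (it ignores any start offset of the input and the initial resampler delay).

```go
package main

import (
	"bytes"
	"fmt"
)

// Assumed output format, matching the answer above.
const (
	channels       = 2
	bytesPerSample = 2 * channels // packed S16: 2 bytes per channel
	frameSamples   = 960          // 20 ms at 48 kHz
	frameBytes     = frameSamples * bytesPerSample
)

// Frame is a hypothetical stand-in for the AVFrame handed to the encoder.
type Frame struct {
	PTS  int64 // in samples, assuming an encoder time base of 1/48000
	Data []byte
}

// Chunker (hypothetical) buffers resampled PCM and emits fixed-size frames.
type Chunker struct {
	fifo    bytes.Buffer
	nextPTS int64 // number of samples emitted so far
}

// Write queues resampled PCM and returns every full frame now available.
func (c *Chunker) Write(pcm []byte) []Frame {
	c.fifo.Write(pcm) // bytes.Buffer.Write never returns an error
	var out []Frame
	for c.fifo.Len() >= frameBytes {
		data := make([]byte, frameBytes)
		copy(data, c.fifo.Next(frameBytes))
		out = append(out, Frame{PTS: c.nextPTS, Data: data})
		c.nextPTS += frameSamples // pts advances by exactly one frame
	}
	return out
}

// Flush pads the final partial frame with silence and returns it, if any.
func (c *Chunker) Flush() []Frame {
	if c.fifo.Len() == 0 {
		return nil
	}
	return c.Write(make([]byte, frameBytes-c.fifo.Len()))
}

func main() {
	c := &Chunker{}
	// Feed three blocks of 1024 samples, as a resampler might return them.
	for i := 0; i < 3; i++ {
		for _, f := range c.Write(make([]byte, 1024*bytesPerSample)) {
			fmt.Printf("frame pts=%d bytes=%d\n", f.PTS, len(f.Data))
		}
	}
	for _, f := range c.Flush() {
		fmt.Printf("final frame pts=%d bytes=%d\n", f.PTS, len(f.Data))
	}
}
```

Because the pts is derived from the number of samples actually emitted, each frame gets a distinct, strictly increasing value regardless of how many samples each resampler call returns. In real cgo code the `bytes.Buffer` could be replaced by libavutil's AVAudioFifo, and the emitted bytes would be copied into an AVFrame with `nb_samples` set to the encoder's `frame_size`.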