Trying to get raw audio data and print it using PyAV


I'm new to the PyAV library and to audio programming in general, so I apologize in advance for any mistakes.

I'm trying to load and print raw audio data from an mp4 file.

I tried reading the cookbook and searching Google, but I got a bit confused.

If I understood correctly, I'm supposed to get the plane from the frame and then decode it, but I couldn't figure out how exactly.

Any information regarding the issue would be greatly appreciated.

import av

container = av.open('/Users/ufk/Downloads/1.mp4')

for packet in container.demux():
    for frame in packet.decode():
        # Only audio frames are of interest; skip video frames.
        if isinstance(frame, av.audio.frame.AudioFrame):
            layout = frame.layout
            channels = layout.channels
            # Assumes a stereo layout; this unpacking fails for mono or surround audio.
            (ch_left, ch_right) = channels
            print(frame,
                  frame.format,
                  frame.layout,
                  frame.rate,
                  frame.samples)
            print(ch_left, ch_right)
            for plane in frame.planes:
                print(plane)
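
A note on the plane question: as far as I understand it, each plane is a buffer of raw sample bytes, so "decoding" it amounts to reinterpreting those bytes with the right sample type. Below is a minimal sketch using only the standard library. The helper name `decode_plane` is hypothetical, the bytes are synthetic stand-ins for what `bytes(plane)` would give (this assumes PyAV planes expose their contents through the buffer protocol), and it assumes native-endian samples in either 32-bit float (`flt`/`fltp`) or signed 16-bit (`s16`/`s16p`) format.

```python
import struct
from array import array


def decode_plane(raw: bytes, typecode: str = "f") -> list:
    """Reinterpret raw plane bytes as a list of samples.

    typecode "f" is assumed to match flt/fltp frames (32-bit float),
    typecode "h" is assumed to match s16/s16p frames (signed 16-bit).
    """
    samples = array(typecode)
    samples.frombytes(raw)  # raises ValueError if the length is not a whole number of samples
    return samples.tolist()


# Synthetic stand-in for bytes(plane): four float32 samples in native byte order.
fake_plane = struct.pack("=4f", 0.0, 0.5, -0.5, 1.0)
decoded = decode_plane(fake_plane)
print(decoded)
```

With a real frame, `frame.format.name` should indicate which sample type applies, so it is worth printing that before choosing the typecode.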

Solution

  • Thanks to Tim Roberts's help in the comments, I started using NumPy arrays: I created an empty array and appended each frame's samples to it. The audio is only a few seconds long, so this won't eat up memory. I also plotted the data to check that I'm reading it correctly, and it looks good.

    The code:

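    import av
    import numpy as np
    import matplotlib.pyplot as plt

    container = av.open('2.mp3')

    # Starts empty; each frame's samples are appended below.
    data = np.empty(shape=0)

    for packet in container.demux():
        for frame in packet.decode():
            # Only audio frames are of interest.
            if isinstance(frame, av.audio.frame.AudioFrame):
                layout = frame.layout
                channels = layout.channels
                # Assumes a stereo layout; this unpacking fails for mono or surround audio.
                (ch_left, ch_right) = channels
                print(frame,
                      frame.format,
                      frame.layout,
                      frame.rate,
                      frame.samples)
                print(ch_left, ch_right)
                # Row 0 only: for planar formats this is the first channel.
                array = frame.to_ndarray()[0]
                # Fine for a short clip; each call copies everything collected so far.
                data = np.concatenate([data, array])

    plt.subplot(2, 1, 1)
    plt.title("Original audio signal")
    plt.plot(data)
    plt.grid()
    plt.tight_layout()
    plt.show()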
    

    And the result:

    [Image: plot of the audio stream]
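
The code above keeps only row 0 of each frame's array. If all channels are needed, something along these lines might work. It is a sketch under assumptions, not PyAV's own API: the helper names `split_channels` and `join_frames` are hypothetical, and it assumes `frame.to_ndarray()` returns shape `(channels, samples)` for planar formats and `(1, samples * channels)` with interleaved samples for packed formats. Synthetic arrays stand in for real frames so the snippet runs without an audio file. It also collects the chunks in a list and concatenates once, which avoids copying the growing array on every frame.

```python
import numpy as np


def split_channels(frame_array: np.ndarray, n_channels: int, planar: bool) -> np.ndarray:
    """Return one frame's samples as an array of shape (n_channels, samples)."""
    if planar:
        # Planar layout (assumed): already one row per channel.
        return frame_array
    # Packed layout (assumed): a single row of interleaved samples, e.g. L R L R ...
    return frame_array.reshape(-1, n_channels).T


def join_frames(frame_arrays, n_channels: int, planar: bool) -> np.ndarray:
    """Concatenate per-frame arrays along the time axis in a single step."""
    chunks = [split_channels(a, n_channels, planar) for a in frame_arrays]
    return np.concatenate(chunks, axis=1)


# Synthetic stand-ins for frame.to_ndarray() results: stereo, 2 samples then 1 sample.
planar_frames = [np.array([[1, 2], [10, 20]]), np.array([[3], [30]])]
packed_frames = [np.array([[1, 10, 2, 20]]), np.array([[3, 30]])]

from_planar = join_frames(planar_frames, n_channels=2, planar=True)
from_packed = join_frames(packed_frames, n_channels=2, planar=False)
print(from_planar)
print(from_packed)
```

With real frames, `frame.format.is_planar` and `len(frame.layout.channels)` would presumably supply the `planar` and `n_channels` arguments.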