
How to split an audio file into chunks that are smaller than a given size in Python?


I need to split an audio file into chunks that are each smaller than 25 MB.

I would prefer not to save the file to disk.

This is the code I have so far, but it does not work as expected: it splits the audio into chunks of around 2 MB.

def audio_splitter(audio_file):
    audio = AudioSegment.from_file(audio_file)

    # Set the chunk size and overlap
    target_chunk_size = 20 * 1024 * 1024  # Target chunk size in bytes (20 MB)
    # Overlap in milliseconds (10 seconds)
    overlap_duration = 10 * 1000
    # Estimate the number of bytes per millisecond in the audio
    bytes_per_ms = len(audio.raw_data) / len(audio)
    # Calculate duration of each chunk in milliseconds
    chunk_duration = int(target_chunk_size / bytes_per_ms)

    chunks = []
    start = 0
    while start < len(audio):
        end = start + chunk_duration
        chunk = audio[start:end]
        chunks.append(chunk)
        start += chunk_duration - overlap_duration

    for i, chunk in enumerate(chunks):
        chunk.export(f"chunk_{i + 1}.mp3", format="mp3")

I think there is a problem with len(audio.raw_data) as it seems not to return the correct byte size.

Is there a better way altogether to approach this problem?


Solution

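  • len(audio.raw_data) is the size of the decoded (uncompressed) samples, not of the MP3 that export() writes, which is why the chunks come out much smaller than the target. You can estimate the encoded size instead by exporting to an io.BytesIO buffer. In my test case I used a 0.5 MB limit and the files came out as 512 KB, 512 KB, 426 KB and 71.1 KB.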

    from pydub import AudioSegment
    import io
    import sys
    
    
    def audio_splitter(audio_file):
        audio = AudioSegment.from_file(audio_file)
    
        # Encode the whole file to MP3 in memory to measure its compressed size
        test_io = io.BytesIO()
        audio.export(test_io, format="mp3")
        # Number of bytes actually written to the buffer
        test_size = test_io.getbuffer().nbytes
    
    
        # Set the chunk size and overlap
        target_chunk_size = 20 * 1024 * 1024  # Target chunk size in bytes (20 MB)
        # Overlap in milliseconds (10 seconds)
        overlap_duration = 10 * 1000
        # Estimate the number of bytes per millisecond in the audio
        bytes_per_ms = test_size/len(audio) # Estimation
        # Calculate duration of each chunk in milliseconds
        chunk_duration = int(target_chunk_size / bytes_per_ms)
    
        chunks = []
        start = 0
        while start < len(audio):
            end = start + chunk_duration
            chunk = audio[start:end]
            chunks.append(chunk)
            start += chunk_duration - overlap_duration
    
        for i, chunk in enumerate(chunks):
            chunk.export(f"chunk_{i + 1}.mp3", format="mp3")
    
    if __name__ == "__main__":
        audio_splitter(*sys.argv[1:])
    
  • bytes_per_ms is an average over the whole file. You could instead measure the encoded size of every chunk, but that costs extra memory and time, so it depends on whether you care more about accuracy or speed. The average is fast and close to accurate for audio whose encoded bytes are spread roughly evenly over its duration.
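
If you need a hard guarantee rather than an estimate, the per-chunk measurement mentioned above could look roughly like the sketch below. It is only a sketch under stated assumptions: split_under_limit, its parameters and the encode callback are names made up for this example, not part of pydub. The callback takes a start and end in milliseconds and returns the encoded bytes, so everything stays in memory and nothing is written to disk.

```python
from typing import Callable, List, Tuple


def split_under_limit(
    total_ms: int,
    encode: Callable[[int, int], bytes],  # hypothetical callback: (start_ms, end_ms) -> encoded bytes
    limit_bytes: int,
    first_guess_ms: int,
    overlap_ms: int = 0,
) -> List[Tuple[int, int, bytes]]:
    """Return (start_ms, end_ms, data) triples whose data never exceeds limit_bytes."""
    if first_guess_ms <= overlap_ms:
        raise ValueError("first_guess_ms must be larger than overlap_ms")

    pieces = []
    start = 0
    while start < total_ms:
        # Start from the estimated duration, clipped to what is left of the audio
        duration = min(first_guess_ms, total_ms - start)
        data = encode(start, start + duration)

        # Shrink the window by 10% at a time until the encoded chunk fits
        while len(data) > limit_bytes:
            if duration <= overlap_ms + 1:
                raise ValueError("limit_bytes is too small for this overlap")
            duration = max(overlap_ms + 1, duration * 9 // 10)
            data = encode(start, start + duration)

        end = start + duration
        pieces.append((start, end, data))
        if end >= total_ms:
            break  # avoids a trailing chunk that contains only the overlap
        start = end - overlap_ms
    return pieces


# Possible wiring with pydub (assumes pydub and ffmpeg are installed):
#
#   audio = AudioSegment.from_file(audio_file)
#
#   def encode(start, end):
#       buf = io.BytesIO()
#       audio[start:end].export(buf, format="mp3")
#       return buf.getvalue()
#
#   pieces = split_under_limit(len(audio), encode, 25 * 1024 * 1024,
#                              first_guess_ms=chunk_duration, overlap_ms=10_000)

if __name__ == "__main__":
    # Stand-in encoder for a quick check: pretends every millisecond costs 2 bytes
    fake_encode = lambda start, end: b"x" * ((end - start) * 2)
    for start, end, data in split_under_limit(1000, fake_encode, 400, 300, 50):
        print(start, end, len(data))
```

The trade-off is the one already described: each shrink step re-encodes the chunk, so a good first guess (for example the chunk_duration computed from the average) keeps the number of extra encodes small.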