Tags: python, numpy, audio, format

How can I convert a .wav to .mp3 in-memory?


I have a numpy array, loaded from a file some.npy, that contains the samples of an audio file originally encoded in the .wav format.

some.npy was created with sig, _ = librosa.load(some_wav_file, sr=22050) followed by np.save('some.npy', sig) (librosa.load returns the samples together with the sample rate).
I want to convert this numpy array so that it looks as if its content had been encoded as .mp3 instead.

Unfortunately, I am restricted to the use of in-memory file objects for two reasons.

  1. I have many .npy files. They are cached in advance and it would be highly inefficient to have that much "real" I/O when actually running the application.
  2. Conflicting access rights of people who are executing the application on a server.

First, I was looking for a way to convert the data in the numpy array directly, but there seems to be no library function. So is there a simple way to achieve this with in-memory file objects?
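For readers unfamiliar with in-memory file objects, here is a minimal sketch of the idea using only the standard library: an io.BytesIO buffer stands in for a file on disk, and the wave module writes to and reads from it. The function names sine_wav_bytes and read_wav_header are hypothetical helpers made up for this illustration, and the sine tone is just placeholder audio.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz=440.0, seconds=0.1, sample_rate=22050):
    """Build a mono 16-bit WAV entirely in memory and return its bytes."""
    n_frames = int(seconds * sample_rate)
    samples = [
        int(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_frames)
    ]
    buffer = io.BytesIO()  # behaves like a file opened in binary mode
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack(f"<{n_frames}h", *samples))
    # wave does not close a file object it did not open, so the buffer is still usable.
    return buffer.getvalue()


def read_wav_header(data):
    """Parse WAV bytes from memory and return (channels, sample width, rate, frames)."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        return (
            reader.getnchannels(),
            reader.getsampwidth(),
            reader.getframerate(),
            reader.getnframes(),
        )
```

Nothing here touches the file system, so the same pattern should sidestep both the I/O cost and the access-rights concerns listed above, provided the library doing the encoding accepts file-like objects.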

NOTE: I found the question "How to convert MP3 to WAV in Python", and its solution could in theory be adapted, but it does not work in-memory.


Solution

  • I finally found a working solution. This is what I wanted.

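    import io
    import numpy as np
    from pydub import AudioSegment  # MP3 export requires ffmpeg (or libav) to be installed

    my_sample_rate = 22050  # must match the sr used in librosa.load
    wav = np.load('some.npy')  # float samples in [-1, 1], shape (n,) or (2, n)
    n_channels = 2 if wav.ndim == 2 and wav.shape[0] == 2 else 1  # stereo or mono
    # pydub expects interleaved integer PCM, so scale to int16 and put channels last
    pcm = (np.clip(wav, -1.0, 1.0) * 32767).astype(np.int16).T
    with io.BytesIO() as inmemoryfile:
        AudioSegment(pcm.tobytes(), frame_rate=my_sample_rate, sample_width=pcm.dtype.itemsize,
                     channels=n_channels).export(inmemoryfile, format='mp3')
        inmemoryfile.seek(0)  # rewind before decoding the MP3 bytes again
        wav = np.array(AudioSegment.from_file_using_temporary_files(inmemoryfile, format='mp3')
                       .get_array_of_samples())  # integer samples, interleaved if stereo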
    

    There is a wrapper package (audiosegment) with which the last line could be rewritten as:

    wav = audiosegment.AudioSegment.to_numpy_array(AudioSegment.from_file_using_temporary_files(inmemoryfile))
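One detail worth spelling out: pydub's AudioSegment constructor interprets the raw bytes as integer PCM of the given sample_width, whereas librosa.load returns float samples in [-1, 1] with shape (n,) or (channels, n). The following is a minimal sketch of that conversion step, assuming 16-bit output is acceptable; float_to_interleaved_pcm16 is a hypothetical helper name, not part of pydub or librosa.

```python
import numpy as np


def float_to_interleaved_pcm16(wav):
    """Convert float samples in [-1, 1] to interleaved 16-bit PCM bytes.

    `wav` has shape (n,) for mono or (channels, n) for multichannel audio.
    Returns the raw bytes and the number of channels.
    """
    wav = np.asarray(wav, dtype=np.float32)
    n_channels = 1 if wav.ndim == 1 else wav.shape[0]
    # Clip first so out-of-range values cannot wrap around when cast to int16.
    pcm = (np.clip(wav, -1.0, 1.0) * 32767).astype(np.int16)
    # Transposing puts channels last, so the C-order bytes come out
    # interleaved (L, R, L, R, ...); for mono the transpose is a no-op.
    return pcm.T.tobytes(), n_channels
```

The returned bytes could then be passed to AudioSegment(data, frame_rate=..., sample_width=2, channels=n_channels). Note that the samples decoded from the MP3 come back as integers, so dividing by 32768 would be needed to return to librosa's float range.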