Tags: python, audio, spectrum, librosa

How to plot the spectrum (frequency vs. amplitude) of an entire audio file using Python?


I have some audio files and want to plot their average spectrum, as the Audacity software does, using Python (the librosa library). Audacity appears to plot average frequency vs. amplitude over the entire audio.


After that, I want to apply a CNN to classify the samples into two classes. I am looking for suggestions.

Thank you.


Solution

  • Usually you use librosa.display.specshow to plot a spectrogram over time, rather than a single spectrum for the whole file. In fact, as input for your CNN you may be better off with a spectrogram over time, such as the magnitude of librosa.stft or a Mel spectrogram, depending on your classification goal.

    E.g., if you want to classify by genre, a Mel spectrogram may be most appropriate. If you want to detect key or chords, a constant-Q spectrogram (CQT) is usually a better fit, etc.

    That said, here's some code that answers your question:

    import librosa
    import numpy as np
    import matplotlib.pyplot as plt
    
    
    file = 'YOUR_FILE'  # placeholder: replace with the path to your audio file
    # load the file
    y, sr = librosa.load(file, sr=44100)
    # short time fourier transform
    # (n_fft and hop length determine frequency/time resolution)
    n_fft = 2048
    S = librosa.stft(y, n_fft=n_fft, hop_length=n_fft//2)
    # convert to db
    # (for your CNN you might want to skip this and rather ensure zero mean and unit variance)
    D = librosa.amplitude_to_db(np.abs(S), ref=np.max)
    # average over file
    D_AVG = np.mean(D, axis=1)
    
    plt.bar(np.arange(D_AVG.shape[0]), D_AVG)
    # label a handful of bins with their centre frequency (bin n corresponds to n * sr / n_fft Hz)
    x_ticks_positions = [n for n in range(0, n_fft // 2, n_fft // 16)]
    x_ticks_labels = [f'{sr / n_fft * n:.0f} Hz' for n in x_ticks_positions]
    plt.xticks(x_ticks_positions, x_ticks_labels)
    plt.xlabel('Frequency')
    plt.ylabel('dB')
    plt.show()
    

    This produces the following plot:

    [Figure: dB for Frequencies]