python audio onset-detection

Find timestamps of notes played in wav file


Let's say we have a wav file with some recorded guitar music. The sound is very clean: no extra sounds, only the guitar itself and possibly metronome ticks.

What would be the best approach to find a timestamp of each note (or a chord) played in Python? I don't need to identify the note itself, only the timestamp when it occurred.

I've never done this kind of thing before, so I'm a bit confused. I was reading on Wikipedia about the short-time Fourier transform and it looks promising, but I couldn't find any relevant examples. I would really appreciate any help or hints on how to start.


Solution

  • The general problem is called onset detection, and there are many methods you can try out. I'll provide a super-naive solution, which probably won't work for your use case as is:

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import argrelmax
    from matplotlib.mlab import specgram

    path = "guitar.wav"                                           # placeholder: path to your own wav file
    sr, x = wavfile.read(path)                                    # read in a mono wav file
    spec, freqs, time = specgram(x, NFFT=4096, Fs=sr, mode='psd') # compute power spectral density spectrogram
    spec2 = np.diff(spec, axis=1)                                 # discrete difference in each frequency bin
    spec2[spec2<0] = 0                                            # half-wave rectification
    diff = np.sum(spec2, axis=0)                                  # sum positive difference in each time bin

    for peak in argrelmax(diff)[0]:                               # find local maxima (no thresholding)
        print("onset between %f and %f." % (time[peak], time[peak+1]))
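
One reason the snippet above is so naive is that `argrelmax` reports every local maximum, however small, so a real recording will yield many spurious onsets. Below is a minimal sketch of the same idea (half-wave rectified spectral difference) with two additions that are my own assumptions rather than part of the original answer: a relative height threshold and a minimum gap between onsets, both applied with `scipy.signal.find_peaks`. The function name `detect_onsets` and its parameters `rel_threshold` and `min_gap_s` are hypothetical, and the default values are guesses that would need tuning for a real guitar recording.

```python
import numpy as np
from scipy.signal import spectrogram, find_peaks


def detect_onsets(x, sr, nperseg=1024, hop=256, rel_threshold=0.1, min_gap_s=0.1):
    """Return approximate onset times in seconds (hypothetical helper).

    rel_threshold and min_gap_s are illustrative defaults, not tuned values.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim > 1:
        x = x.mean(axis=1)                      # mix multi-channel audio down to mono

    # Magnitude spectrogram; times are the centres of the analysis frames.
    _, times, spec = spectrogram(
        x, fs=sr, window="hann", nperseg=nperseg,
        noverlap=nperseg - hop, detrend=False, mode="magnitude",
    )

    flux = np.diff(spec, axis=1)                # change in each frequency bin
    flux[flux < 0] = 0                          # half-wave rectification
    flux = flux.sum(axis=0)                     # one value per frame transition

    if flux.size == 0 or flux.max() <= 0:
        return np.array([])
    flux = flux / flux.max()                    # normalise so the threshold is relative

    # Keep only peaks above the threshold and at least min_gap_s apart.
    min_gap = max(1, int(round(min_gap_s * sr / hop)))
    peaks, _ = find_peaks(flux, height=rel_threshold, distance=min_gap)

    # flux[i] compares frame i with frame i + 1, so report the later frame.
    return times[peaks + 1]
```

With a real file you would call it as `sr, x = wavfile.read(path)` followed by `detect_onsets(x, sr)`. The returned times are frame centres, so they are only accurate to within roughly one analysis window, and metronome ticks will be reported as onsets too.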