audio, signal-processing, fft, spectrum, onset-detection

Note Onset Detection using Spectral Difference


I'm fairly new to onset detection. From the papers I've read, I understand that working only in the time domain can produce a large number of false positives and false negatives, and that it is generally advisable to work in the frequency domain, either on its own or together with the time domain.

What confuses me is how the spectral energy, or the values in the FFT bins, can be used to determine note onsets. Aren't note onsets represented by sharp peaks in amplitude?

Can someone enlighten me on this? Thank you!


Solution

  • You can treat sharp changes in amplitude at a specific frequency as suspected sound onsets. For instance, if a flute switches from playing a G5 to playing a C, there will be a sharp drop in the amplitude of the spectrum at around 784 Hz.

    If you don't know which frequency to examine, the magnitude of an FFT result vector gives you the amplitude at every frequency bin over some window in time (with a frequency resolution that depends on the length of the time window). Pick a frequency, or a set of frequencies, and take the difference between the FFT magnitudes of two different time windows. The result can serve as part of a likelihood estimate that a sound onset, or some other change, occurred somewhere between the two windows. Sliding the windows, or successively narrowing their position in time, can help pin down the time of a suspected note onset or other significant change in the sound.
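
The diff-two-FFTs idea above could be sketched roughly as follows, assuming NumPy is available. The function name `spectral_difference`, the 1024-sample frame, the 512-sample hop, the Hann window, and the choice to sum absolute differences over all bins are illustrative assumptions, not something the answer specifies. The test signal (silence followed by a 784 Hz sine at 0.5 s) is likewise made up for the demonstration.

```python
import numpy as np

def spectral_difference(signal, frame_len=1024, hop=512):
    """Sum of absolute FFT-magnitude changes between adjacent sliding windows.

    Returns one value per pair of consecutive frames; large values suggest
    that the spectrum changed somewhere between those two frames.
    """
    window = np.hanning(frame_len)  # taper to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    mags = np.array([
        np.abs(np.fft.rfft(window * signal[i * hop : i * hop + frame_len]))
        for i in range(n_frames)
    ])
    # Difference between each frame's magnitudes and the previous frame's.
    diff = np.diff(mags, axis=0)
    # Absolute value so that both rises and drops in a bin count as change.
    return np.abs(diff).sum(axis=1)

# Hypothetical test signal: 0.5 s of silence, then a 784 Hz tone (about G5).
sr = 16000
t = np.arange(sr) / sr
signal = np.where(t >= 0.5, np.sin(2 * np.pi * 784 * t), 0.0)

hop = 512
flux = spectral_difference(signal, frame_len=1024, hop=hop)
peak = int(np.argmax(flux))
# flux[k] compares frame k with frame k + 1, so report the start of frame k + 1.
peak_time = (peak + 1) * hop / sr
print(f"largest spectral change near {peak_time:.3f} s")
```

The peak only localizes the change to within about one frame; a shorter hop or a second pass with shorter windows around the peak would narrow it down, at the cost of coarser frequency resolution. To watch a single frequency instead of the whole spectrum, the same `mags` array could be indexed at the bin nearest `f * frame_len / sr`.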