I have captured an input sound signal with my microphone and visualized it in an OpenCV Mat:
I read 20 samples at a time (= 20 pixels in the x-direction of the "sound Mat") and multiply them by the Hann window function. Then I perform the DFT (discrete Fourier transform) in OpenCV (docs here) on this windowed sequence. Here is an example of the magnitude output of the DFT of such a 20-sample signal:
But how can I get a frequency spectrogram? Is the approach described above right? What do I have to do with these DFT outputs to get a spectrogram?
Sorry for posting only links instead of the pictures. As I am new to Stack Overflow, I cannot post pictures directly.
This doesn't work on an image of the waveform. You need the raw samples as a 1D vector whose length is the length of your audio signal (OpenCV has no true 1D Mat, so use a single-row or single-column Mat).
Then you have to run the DFT/FFT on a windowed (e.g. Hann-windowed) section of the sound. Do this for every section, so that you get the frequency content of each section. Placing the outputs next to each other, one column per section, gives the spectrogram.
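To make the steps concrete, here is a minimal sketch in plain C++ using only the standard library. It is not the OpenCV API: the function name `spectrogram` and the parameters `frameSize` and `hop` are made up for illustration, and the DFT is written as a naive O(N^2) loop so the example stays self-contained. In a real program the two inner loops could presumably be replaced by a call to `cv::dft` on each windowed section.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Magnitude spectrogram of a real signal.
// Returns one row per section (frame) and one column per frequency bin.
// frameSize and hop are illustrative parameters, not taken from the question.
std::vector<std::vector<double>> spectrogram(const std::vector<double>& signal,
                                             std::size_t frameSize,
                                             std::size_t hop)
{
    assert(frameSize > 1 && hop > 0);
    const double pi = std::acos(-1.0);
    // For real input only bins 0..frameSize/2 carry unique information.
    const std::size_t bins = frameSize / 2 + 1;

    // Precompute the Hann window once.
    std::vector<double> window(frameSize);
    for (std::size_t n = 0; n < frameSize; ++n)
        window[n] = 0.5 * (1.0 - std::cos(2.0 * pi * n / (frameSize - 1)));

    std::vector<std::vector<double>> result;
    // Slide over the signal; sections that would run past the end are dropped.
    for (std::size_t start = 0; start + frameSize <= signal.size(); start += hop) {
        std::vector<double> mags(bins);
        for (std::size_t k = 0; k < bins; ++k) {
            double re = 0.0, im = 0.0;
            // Naive DFT of the windowed section for frequency bin k.
            for (std::size_t n = 0; n < frameSize; ++n) {
                const double x = signal[start + n] * window[n];
                const double angle = 2.0 * pi * k * n / frameSize;
                re += x * std::cos(angle);
                im -= x * std::sin(angle);
            }
            mags[k] = std::sqrt(re * re + im * im);
        }
        result.push_back(mags);
    }
    return result;
}
```

Bin `k` of a section corresponds to the frequency `k * sampleRate / frameSize`, so the section length sets the frequency resolution. With only 20 samples per section and, say, a 44.1 kHz sampling rate (an assumed value), each bin would be about 2205 Hz wide, which is why longer sections such as 512 or 1024 samples are usually a better starting point. To display the result, plot the rows as columns of an image, often after taking the logarithm of the magnitudes.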