I was just getting started with a code to pre-process some audio data in order to lately feed a neural network with it. Before explaining more deeply my actual problem, mention that I took the reference for how to do the project from this site. Also used some code taken from this post and read for more info in the signal.spectogram doc and this post.
For now with all of the sources mentioned before, I managed to get the wav audio file as a numpy array and plot both its amplitude and spectrogram. Theese represent a recording of me saying the word "command" in Spanish.
The strange fact here is that I search on the internet and found that human voice spectrum moves between 80 and 8k Hz, so just to get sure I compared this output with the one Audacity spectrogram returned. As you can see, this seems to be more coherent with the info found, as the frequency range is the one supposed to be for humans.
So that takes me to final question: Am I doing something wrong in the process of reading the audio or generating the spectrogram or maybe am I having plot issues?
By the way I'm new to both python and signal processing so thx in advance for your patience.
Here is the code I'm actually using:
def espectrograma(wav): sample_rate, samples = wavfile.read(wav) frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate, nperseg=320, noverlap=16, scaling='density') #dBS = 10 * np.log10(spectrogram) # convert to dB plt.subplot(2,1,1) plt.plot(samples[0:3100]) plt.subplot(2,1,2) plt.pcolormesh(times, frequencies, spectrogram) plt.imshow(spectrogram,aspect='auto',origin='lower',cmap='rainbow') plt.ylim(0,30) plt.ylabel('Frecuencia [kHz]') plt.xlabel('Fragmento[20ms]') plt.colorbar() plt.show()
The computation of the spectrogram seems fine to me. If you plot the spectrogram in log scale you should observe something more similar to the audition plots you referenced. So uncomment your line
#dBS = 10 * np.log10(spectrogram) # convert to dB
and then use the variable dBS for the plotting instead of spectrogram in
plt.pcolormesh(times, frequencies, spectrogram)
plt.imshow(spectrogram,aspect='auto',origin='lower',cmap='rainbow')