x is a numpy.float32 array with values from -200 to 0. These are dB (decibel) values.
When I do (as recommended here):
Image.fromarray(x, mode='F')
I get a greyscale or sometimes nearly black image.
How can I map floats in [-200, 0] to a 24-bit RGB byte array (using a colormap) that the Python module PIL can read with Image.fromarray(x, mode='RGB')?
The required .wav audio file is here, for which we want to plot the spectrogram.
Here is some code to test:
import numpy as np
import scipy.io.wavfile as wavfile
from PIL import Image

def stft(x, fftsize=1024, overlap=4):
    hop = fftsize // overlap  # integer hop size, required by range()
    w = np.hanning(fftsize + 1)[:-1]  # periodic Hann window (scipy.hanning is no longer available)
    return np.array([np.fft.rfft(w * x[i:i + fftsize]) for i in range(0, len(x) - fftsize, hop)])

def dB(ratio):
    return 20 * np.log10(ratio + 1e-10)  # small offset avoids log10(0)

def magnitudedB(frame, fftsize=1024):
    w = np.hanning(fftsize + 1)[:-1]
    ref = np.sum(w) / 2  # magnitude of a full-scale sine after windowing
    return dB(np.abs(frame) / ref)

sr, x = wavfile.read('test.wav')  # assumes a mono 16-bit file
x = np.float32(x) / 2**15  # scale samples to [-1, 1)
s = magnitudedB(stft(x)).astype(np.float32).transpose()[::-1, ]  # frequency on the vertical axis, low at the bottom
print("Max %.1f dB, Min %.1f dB" % (np.max(s), np.min(s)))
im = Image.fromarray(s + 200, mode='F')
im.show()
Notes:
The colormap is greyscale. How can I get another colormap, like this one?
My only requirement is that the output image can be read into a Tkinter frame / canvas (this works well with PIL's im = Image.fromarray(...) followed by ImageTk.PhotoImage(image=im)) or a wxPython frame / canvas.
Based on the answer here, you can use matplotlib colormaps to transform the numpy array before converting to an image.
#im = Image.fromarray(s+200, mode='F')
from matplotlib import cm
s = (s + 200) / 200.0  # scale [-200, 0] dB to the 0-1 range the colormap expects
im = Image.fromarray(cm.jet(s, bytes=True))  # cm.jet returns an RGBA uint8 array here
im.show()
You should probably derive the scaling from your data's actual min/max values instead of hard-coding 200.
Sample output:
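The scaling advice above can be sketched as a small helper. This is only an illustration, not part of the original answer: the name db_to_rgb_image and its vmin / vmax / cmap parameters are hypothetical. It normalizes by the data's own min/max (unless limits are given) and drops the alpha channel, so the result is a 24-bit RGB image as asked in the question.

```python
import numpy as np
from matplotlib import cm
from PIL import Image

def db_to_rgb_image(s, vmin=None, vmax=None, cmap=cm.jet):
    """Map a 2-D float array of dB values to a 24-bit RGB PIL image.

    Hypothetical helper: vmin/vmax default to the data's own min/max.
    """
    s = np.asarray(s, dtype=np.float32)
    vmin = float(np.min(s)) if vmin is None else vmin
    vmax = float(np.max(s)) if vmax is None else vmax
    span = max(vmax - vmin, 1e-12)  # guard against constant input
    norm = np.clip((s - vmin) / span, 0.0, 1.0)  # values now in [0, 1]
    rgba = cmap(norm, bytes=True)  # uint8 array of shape (H, W, 4)
    rgb = np.ascontiguousarray(rgba[..., :3])  # drop alpha: 3 bytes per pixel
    return Image.fromarray(rgb)  # mode is inferred as 'RGB'

# Usage with synthetic dB values in [-200, 0]
demo = np.linspace(-200, 0, 12, dtype=np.float32).reshape(3, 4)
im = db_to_rgb_image(demo)
```

Passing fixed limits, for example vmin=-120 and vmax=0, keeps the colors comparable across different files. The returned image should work with ImageTk.PhotoImage(image=im) in the same way as the original greyscale one.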