Plot spectrograms / spectral data in Altair
I am attempting to plot a df that has frequency bins as the column labels and time frames as the index; each cell value is the magnitude of that frequency at that time. How can I encode the y-axis with the column labels and have Altair recognize them as an encoding channel? Perhaps this is a trivial task and I'm not conceptualizing a solution, or it is a task that Altair is not ideally suited to; in either case, any recommendations are appreciated.
The issue is that my frequency bins are the columns of the df (so the column labels would be the y-axis), the index is the time axis (x), and I would use the color channel to encode the value of each cell. How can I pass all the column labels as an encoding for the y-axis? Or is there a method/strategy for reorienting the df?
I have a matplotlib version (as shown in the image) but Altair offers better interactivity and a couple of additional features that make it a more desirable option. I have a lot of data to process and occasionally passing a visualization of specific objects is necessary.
As per Joel's request here's a small sample of the dataset
Thanks
import pandas as pd

col_freq = [172.265625, 344.53125, 516.796875, 689.0625]
row_1 = [1610974057651.0325, 1261973870532.6091,
         234137860730.91223, 42549716015.37]
row_2 = [4741489056189.282, 3278778293422.225,
         160494114891.44345, 57040784835.97968]
row_3 = [198776867252.5261, 661886049124.3528,
         188309227047.4264, 124549622810.97015]
data = [row_1, row_2, row_3]
df = pd.DataFrame(data, columns=col_freq)
df
Here's the analysis code using scipy.signal.spectrogram:
import numpy as np
import pandas as pd
import altair as alt
from scipy import signal
from scipy.io import wavfile
from matplotlib import pyplot as plt

# this code produces the full df; for x use any audio .wav file
sample_rate, audio_file = wavfile.read('path/my_audio.wav')
size = 2048
window_func = signal.windows.hann(size)

def wave_to_spect(x, label):
    freq, time, Sxx = signal.spectrogram(x, sample_rate,
                                         window=window_func,
                                         nperseg=len(window_func))
    # rows of Sxx are frequency bins; transpose so that rows are time frames
    col_name = [str(f) for f in freq]
    df = pd.DataFrame(Sxx, index=col_name)
    df = df.T
    # visualize in matplotlib
    plt.pcolormesh(time, freq, np.log(Sxx), shading='gouraud')
    plt.title('Spectrogram ' + label)
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [sec]')
    plt.show()
    return df

df = wave_to_spect(audio_file, "audio_car")
With the sample data posted, here is how it can be done:
First melt your DataFrame (convert from wide to long format):
df['time'] = df.index
df = pd.melt(df, id_vars=['time'])
df.head()
df.columns = ['time', 'freq', 'value']
df.shape
# (12, 3)
df.head()
#    time        freq         value
# 0     0  172.265625  1.610974e+12
# 1     1  172.265625  4.741489e+12
# 2     2  172.265625  1.987769e+11
# 3     0   344.53125  1.261974e+12
# 4     1   344.53125  3.278778e+12
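As a variation on the melt step, here is a minimal sketch that does the reshape in one chained call and names the columns directly through `var_name` and `value_name`. The names `wide`, `long_df` and `log_value` are illustrative and not from the original post. The `log_value` column is an optional extra, added on the assumption that a log scale is wanted because the matplotlib version in the question plots `np.log(Sxx)`.

```python
import numpy as np
import pandas as pd

# Sample data from the question: columns are frequency bins, rows are time frames.
col_freq = [172.265625, 344.53125, 516.796875, 689.0625]
data = [
    [1610974057651.0325, 1261973870532.6091, 234137860730.91223, 42549716015.37],
    [4741489056189.282, 3278778293422.225, 160494114891.44345, 57040784835.97968],
    [198776867252.5261, 661886049124.3528, 188309227047.4264, 124549622810.97015],
]
wide = pd.DataFrame(data, columns=col_freq)

# Wide -> long: one row per (time, freq) pair.
long_df = (
    wide.rename_axis('time')   # name the index so reset_index yields a 'time' column
        .reset_index()
        .melt(id_vars='time', var_name='freq', value_name='value')
)

# Make sure the frequency labels are numeric (useful if they were stored as strings).
long_df['freq'] = long_df['freq'].astype(float)

# Optional: log-scaled magnitude (assumption: mirrors np.log(Sxx) in the question).
long_df['log_value'] = np.log10(long_df['value'])
```

The resulting `long_df` has one row per cell of the original table, which is the layout Altair expects for its encoding channels.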
Now visualize with altair using mark_rect() (use the color channel to encode the values):
alt.Chart(df).mark_rect().encode(
    x='time:O',
    y='freq:O',
    color='value:Q'
)
to obtain the following figure: