Visualize spectral data in Altair


Plot spectrograms/spectral data in Altair.

I am trying to plot a df whose column labels are frequency bins and whose index is time (frames); each cell holds the magnitude of that frequency at that time. How can I encode the y-axis with the column labels so that Altair recognizes them as an encoding channel? Perhaps this is a trivial task and I'm just not seeing the solution, or perhaps Altair is not well suited to it; either way, any recommendations are appreciated.

The issue is that my frequency bins are the columns of the df (so the column labels would be the y-axis), the index is the time axis (x), and I would use the color channel to encode the value of each cell. How can I pass all the column labels as an encoding for the y-axis? Or is there a method/strategy for reorienting the df?

[Image: spectrogram]

I have a matplotlib version (as shown in the image), but Altair offers better interactivity and a couple of additional features that make it a more desirable option. I have a lot of data to process, and occasionally I need to pass along a visualization of specific objects.

As per Joel's request, here's a small sample of the dataset:

Thanks

import pandas as pd

# Frequency bins (Hz) used as column labels
col_freq = [172.265625, 344.53125, 516.796875, 689.0625]

# Each row is one time frame; each value is the magnitude in that frequency bin
row_1 = [1610974057651.0325, 1261973870532.6091,
         234137860730.91223, 42549716015.37]
row_2 = [4741489056189.282, 3278778293422.225,
         160494114891.44345, 57040784835.97968]
row_3 = [198776867252.5261, 661886049124.3528,
         188309227047.4264, 124549622810.97015]

data = [row_1, row_2, row_3]

df = pd.DataFrame(data, columns=col_freq)
df

Here's the analysis code using scipy.signal.spectrogram:

import numpy as np
import pandas as pd
import altair as alt
from scipy import signal
from scipy.io import wavfile
from matplotlib import pyplot as plt

# This code produces the full df; for x, use any .wav audio file.
sample_rate, audio_file = wavfile.read('path/my_audio.wav')
size = 2048
window_func = signal.windows.hann(size)

def wave_to_spect(x, label):
    freq, time, Sxx = signal.spectrogram(x, sample_rate, 
       window=window_func, nperseg=len(window_func))
    col_name = [str(x) for x in freq]    
    df = pd.DataFrame(Sxx, index=col_name)
    df = df.T
    # visualize in matplotlib
    plt.pcolormesh(time, freq, np.log(Sxx), shading='gouraud')
    plt.title('Spectrogram: ' + label)
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [sec]')    
    plt.show()
    return df  # wide format: rows are time frames, columns are frequency bins

df = wave_to_spect(audio_file, "audio_car")

Solution

  • With the sample data posted, here is how it can be done:

    1. First melt your DataFrame (convert from wide to long format):

      df['time'] = df.index
      df = pd.melt(df, id_vars=['time'])
      df.columns = ['time', 'freq', 'value']
      df.shape
      # (12, 3)
      df.head()
      #    time    freq    value
      #0   0   172.265625  1.610974e+12
      #1   1   172.265625  4.741489e+12
      #2   2   172.265625  1.987769e+11
      #3   0   344.53125   1.261974e+12
      #4   1   344.53125   3.278778e+12
      
    2. Now visualize with Altair using mark_rect(), using the color channel to encode the values:

      alt.Chart(df).mark_rect().encode(
          x='time:O',
          y='freq:O',
          color='value:Q'
      )
      

    to obtain the following figure:

    [Image: resulting heatmap]
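For the full spectrogram rather than the four-column sample, the same wide-to-long reshaping can be applied directly to the output of scipy.signal.spectrogram. What follows is a minimal sketch and not part of the original answer. The helper names spectrogram_to_long and heatmap, the synthetic 440 Hz tone standing in for a .wav file, and the nperseg=256 window size are all assumptions chosen for illustration. The logarithmic color scale is only a guess at mirroring the np.log call in the question's matplotlib code.

```python
import numpy as np
import pandas as pd
from scipy import signal


def spectrogram_to_long(x, sample_rate, nperseg=2048):
    """Hypothetical helper: long-format DataFrame with columns time, freq, value."""
    window = signal.windows.hann(nperseg)
    freq, time, Sxx = signal.spectrogram(
        x, sample_rate, window=window, nperseg=nperseg)
    # Sxx has shape (len(freq), len(time)); transpose so rows are time frames
    wide = pd.DataFrame(Sxx.T, index=pd.Index(time, name="time"), columns=freq)
    # Melt: one row per (time, freq) pair, as in step 1 of the answer
    long_df = wide.reset_index().melt(
        id_vars="time", var_name="freq", value_name="value")
    long_df["freq"] = long_df["freq"].astype(float)
    return long_df


def heatmap(long_df):
    """Hypothetical helper: the answer's mark_rect chart with a log color scale."""
    import altair as alt  # imported here so the reshaping above runs without Altair
    return alt.Chart(long_df).mark_rect().encode(
        x="time:O",
        y=alt.Y("freq:O", sort="descending"),  # low frequencies at the bottom
        # A log scale assumes strictly positive values
        color=alt.Color("value:Q", scale=alt.Scale(type="log")),
    )


# Synthetic stand-in for audio: one second of a 440 Hz tone (an assumption)
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
x = np.sin(2 * np.pi * 440 * t)

long_df = spectrogram_to_long(x, sample_rate, nperseg=256)
print(long_df.head())
```

Calling heatmap(long_df) in a notebook should then render the chart. Note that Altair by default refuses datasets with more than 5,000 rows (it raises MaxRowsError). A real recording analyzed with size = 2048 will usually exceed that, so downsampling the frames or calling alt.data_transformers.disable_max_rows() may be needed.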