python dataframe data-visualization seaborn ridgeline-plot

Ridgeline/Joyplot across a moving range


(Using Python 3.0) I want to calculate and plot PDFs of the given data for successive ranges of y, in increments of 0.25, for easy visualization.

Thanks to the SO community I can produce the plot for a single range, but I cannot quite get the algorithm right to iterate across the whole range of values.

Data: https://www.dropbox.com/s/y78pynq9onyw9iu/Data.csv?dl=0

What I have so far is normalized toy data that looks like a shotgun blast, with one target band of width 0.25 isolated between two black lines:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Data = pd.read_csv("Data.csv")

g = sns.jointplot(x="x", y="y", data=Data)

bottom_lim = 0
top_lim = 0.25
temp = Data.loc[(Data.y>=bottom_lim)&(Data.y<top_lim)]
g.ax_joint.axhline(top_lim, c='k', lw=2)
g.ax_joint.axhline(bottom_lim, c='k', lw=2)

# we have to create a secondary y-axis on the joint plot, otherwise the KDE
# might be very small compared to the scale of the original y-axis
ax_joint_2 = g.ax_joint.twinx()
sns.kdeplot(temp.x, shade=True, color='red', ax=ax_joint_2, legend=False)
ax_joint_2.spines['right'].set_visible(False)
ax_joint_2.spines['top'].set_visible(False)
ax_joint_2.yaxis.set_visible(False)


And now what I want to do is make a ridgeline/joyplot of this data across each 0.25 band of data.

I tried a few techniques from the various Seaborn examples out there, but nothing really accounts for the band or range of values as the y-axis. I'm struggling to translate my written algorithm into working code as a result.


Solution

  • I don't know if this is exactly what you are looking for, but hopefully this gets you in the ballpark. I also know very little about Python, so here is some R:

    library(tidyverse)
    library(ggridges)
    data = read_csv("https://www.dropbox.com/s/y78pynq9onyw9iu/Data.csv?dl=1")
    
    data2 = data %>%
      mutate(breaks = cut(x, breaks = seq(-1,7,.5), labels = FALSE))
    
    data2 %>%
      ggplot(aes(x=x,y=breaks)) +
      geom_density_ridges() +
      facet_grid(~breaks, scales = "free")
    
    data2 %>%
      ggplot(aes(x=x,y=y)) +
      geom_point() +
      geom_density() +
      facet_grid(~breaks, scales = "free")
    

    And please forgive the poorly formatted axes.

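For readers who want to stay in Python, here is a minimal sketch of the banding idea from the question: assign every row to a 0.25-wide band of y, then draw one KDE of x per band, offset to that band's lower edge. It is an illustration under stated assumptions, not a definitive implementation. The helper names `add_bands` and `ridgeline` and the parameters `height` and `min_points` are hypothetical, and the density comes from `scipy.stats.gaussian_kde` rather than `sns.kdeplot`. Each curve is rescaled to a common peak height, so the ridges show shape only and their areas are not comparable between bands.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde


def add_bands(df, width=0.25, col="y"):
    """Return a copy of df with a 'band' column: the lower edge of each row's band."""
    out = df.copy()
    # floor(y / width) * width maps y onto the half-open band [edge, edge + width),
    # matching the (Data.y >= bottom_lim) & (Data.y < top_lim) filter in the question
    out["band"] = np.floor(out[col] / width) * width
    return out


def ridgeline(df, width=0.25, x="x", y="y", height=3.0, min_points=5):
    """Draw one KDE of `x` per `width`-wide band of `y`; return fig, ax, drawn edges."""
    banded = add_bands(df, width, y)
    grid = np.linspace(df[x].min(), df[x].max(), 200)
    fig, ax = plt.subplots(figsize=(6, 8))
    drawn = []
    # groupby sorts band edges ascending; reverse so lower ridges are drawn last
    # and therefore sit in front of the ridges above them
    for edge, grp in reversed(list(banded.groupby("band"))):
        if len(grp) < min_points:
            continue  # assumption: skip bands with too few points for a KDE
        dens = gaussian_kde(grp[x])(grid)
        # rescale to a common peak so sparse bands stay visible (shape only)
        ridge = edge + height * width * dens / dens.max()
        ax.fill_between(grid, edge, ridge, color="red", alpha=0.5)
        ax.plot(grid, ridge, color="k", lw=0.8)
        drawn.append(edge)
    ax.set_xlabel(x)
    ax.set_ylabel(y + " (lower edge of each band)")
    return fig, ax, drawn
```

With the question's file this would be called as `ridgeline(pd.read_csv("Data.csv"))` followed by `plt.show()`. A `height` above 1 lets neighbouring ridges overlap, which gives the usual joyplot look; set it to 1 or less to keep every curve inside its own band.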