
Using units in seaborn FacetGrid with lineplot


Here is a Minimal, Complete and Verifiable toy dataset for my problem:

import numpy as np
import pandas as pd

# 2 genes x 2 conditions x 2 wells (replicates) x 4 cycles = 32 rows
genes = pd.Series(["Gene1"] * 16 + ["Gene2"] * 16)
conditions = pd.Series(np.tile(np.array(["Condition1"] * 8 + ["Condition2"] * 8), 2))
wellID = pd.Series(np.array(["W1"] * 4 + ["W2"] * 4 + ["W3"] * 4 + ["W4"] * 4 + ["W5"] * 4 + ["W6"] * 4 + ["W7"] * 4 + ["W8"] * 4))
# One sorted (increasing) random curve of 4 points per well
fluo = pd.Series(np.array([np.sort(np.random.logistic(size=4)) for _ in range(8)]).flatten())
cycles = pd.Series(np.tile(np.array([0, 1, 2, 3]), 8))
df = pd.concat([genes, conditions, wellID, cycles, fluo], axis=1)
df.columns = ["Gene", "Condition", "WellID", "Cycle", "Fluo"]

It contains 2 genes under 2 conditions, each with 2 replicates. Each replicate has its own unique WellID with 4 cycles, and one measured Fluo point per cycle.
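The layout described above can be checked with a couple of groupby calls. This is only a sketch: it rebuilds the same columns with np.repeat and np.tile instead of the list arithmetic in the question, and the seeded generator (rng, seed 0) is my own assumption to make the random Fluo values reproducible.

```python
import numpy as np
import pandas as pd

# Rebuild the toy layout from the question. Fluo values are random,
# so only the structure of the table is checked here.
rng = np.random.default_rng(0)  # arbitrary seed, not part of the question
df = pd.DataFrame({
    "Gene": np.repeat(["Gene1", "Gene2"], 16),
    "Condition": np.tile(np.repeat(["Condition1", "Condition2"], 8), 2),
    "WellID": np.repeat([f"W{i}" for i in range(1, 9)], 4),
    "Cycle": np.tile(np.arange(4), 8),
    "Fluo": np.sort(rng.logistic(size=(8, 4)), axis=1).ravel(),
})

# Number of distinct wells (replicates) per Gene/Condition pair.
wells_per_group = df.groupby(["Gene", "Condition"])["WellID"].nunique()
# Number of measured cycles per well.
cycles_per_well = df.groupby("WellID")["Cycle"].nunique()

print(wells_per_group)
print(cycles_per_well)
```

Every Gene/Condition pair should report the same number of wells, and every well the same number of cycles.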

I can create the line plot I want for a single gene with this command:

import seaborn as sns
sns.lineplot(x="Cycle", y="Fluo", hue="Condition", units="WellID", estimator=None, data=df.loc[df.Gene == "Gene1"])

I had to use both units and estimator so that I can see the 2 replicates as separate curves (and not one aggregated curve per Gene/Condition).

Finally, I wanted to use FacetGrid to show this plot for both genes, so I did:

g = sns.FacetGrid(df, col="Gene", hue="Condition", col_wrap=5)
g.map(sns.lineplot, "Cycle", "Fluo");

But if I add the keywords "units" and "estimator", I get the error "ValueError: Could not interpret input 'WellID'".

I could only display the plots with the 2 replicates aggregated.


Solution

  • Pass the WellID column name as an extra positional argument to g.map, which forwards that column's data to the lineplot function

    import matplotlib.pyplot as plt

    g = sns.FacetGrid(df, col="Gene", hue="Condition", col_wrap=5)
    # "WellID" is passed as a third positional column, after x and y
    g.map(sns.lineplot, "Cycle", "Fluo", "WellID")
    plt.show()
    
    
