Tags: python, pandas, matplotlib, seaborn, bokeh

Single variable category scatter plot pandas


Is it possible to plot a single variable as a scatter plot? I can plot it as a line by computing the CCDFs and adding markers, but I want to know whether any alternative is available.

Input:

Input 1

tweetcricscore 51 high active

Input 2

tweetcricscore 46 event based
tweetcricscore 12 event based
tweetcricscore 46 event based

Input 3

tweetcricscore 1 viewers 
tweetcricscore 178 viewers

Input 4

tweetcricscore 46 situational
tweetcricscore 23 situational
tweetcricscore 1 situational
tweetcricscore 8 situational
tweetcricscore 56 situational

I can write scatter plot code with bokeh and pandas when I have both x and y values. But what about the case of a single value?

All the inputs are merged into one input, which should be grouped by col[3]; the values to plot are in col[2].

The code below is for a data set with two variables:
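The merge-and-group step described above can be sketched with pandas. This is only an illustration: the column names `name`, `val` and `category` are hypothetical, and treating the trailing text of each row ("high active", "event based", and so on) as a single category label is an assumption about the input format.

```python
import pandas as pd

# Rows copied from Inputs 1-4 in the question.
# Assumption: the trailing words of each row form one category label.
rows = [
    ("tweetcricscore", 51, "high active"),
    ("tweetcricscore", 46, "event based"),
    ("tweetcricscore", 12, "event based"),
    ("tweetcricscore", 46, "event based"),
    ("tweetcricscore", 1, "viewers"),
    ("tweetcricscore", 178, "viewers"),
    ("tweetcricscore", 46, "situational"),
    ("tweetcricscore", 23, "situational"),
    ("tweetcricscore", 1, "situational"),
    ("tweetcricscore", 8, "situational"),
    ("tweetcricscore", 56, "situational"),
]

# Hypothetical column names, chosen for this sketch only.
merged = pd.DataFrame(rows, columns=["name", "val", "category"])

# Collect the single numeric variable per category.
values_by_category = merged.groupby("category")["val"].apply(list).to_dict()
print(values_by_category)
```

Each category then carries a plain list of values, which is the shape a one-variable categorical plot needs.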

import pandas as pd
# bokeh.charts is the legacy high-level charts API
# (later moved out of Bokeh into the separate bkcharts package)
from bokeh.charts import Scatter, output_file, show

# The CSV has no header row, so name the columns explicitly
df = pd.read_csv('input.csv', header=None)
df.columns = ['col1', 'col2', 'col3', 'col4']

# Plot col2 against col3; colour and marker both encode col4
scatter = Scatter(df, x='col2', y='col3', color='col4', marker='col4', title='plot', legend=True)

# Write the plot to an HTML file and open it in the browser
output_file('output.html', title='output')
show(scatter)

Sample output: (image not included)


Solution

  • UPDATE:

    Look at the Bokeh and Seaborn galleries; they may help you decide what kind of plot fits your needs.

    You could try a violin plot like this:

    import seaborn as sns
    sns.violinplot(x="category", y="val", data=df)
    


    or a heat map:

    import numpy as np
    import pandas as pd
    from bokeh.charts import HeatMap, output_file, show
    
    # Synthetic data: 1000 random values, each assigned a random category
    cats = ['active', 'based', 'viewers', 'situational']
    df = pd.DataFrame({'val': np.random.randint(1,100, 1000), 'category': np.random.choice(cats, 1000)})
    
    # Build the heat map and write it to an HTML file
    hm = HeatMap(df)
    output_file('d:/temp/heatmap.html')
    show(hm)
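If what you want is literally a scatter of one variable per category, a strip plot is another option. Below is a minimal sketch using plain matplotlib rather than bokeh.charts. It reuses the synthetic `val`/`category` layout from the answer; the jitter width, the fixed random seed and the output file name `strip_plot.png` are assumptions made for illustration. Seaborn's `stripplot` and `swarmplot` draw a similar plot and accept the same `x`, `y` and `data` arguments as the `violinplot` call.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data in the same shape as the answer: one value column, one category column.
cats = ['active', 'based', 'viewers', 'situational']
rng = np.random.default_rng(0)  # fixed seed (an assumption) so the plot is repeatable
df = pd.DataFrame({
    'val': rng.integers(1, 100, 200),
    'category': rng.choice(cats, 200),
})

# Map each category to an integer x position, then add a small horizontal
# jitter so points with similar values do not sit exactly on top of each other.
positions = {cat: i for i, cat in enumerate(cats)}
x = df['category'].map(positions) + rng.uniform(-0.15, 0.15, len(df))

fig, ax = plt.subplots()
ax.scatter(x, df['val'], alpha=0.6)
ax.set_xticks(range(len(cats)))
ax.set_xticklabels(cats)
ax.set_xlabel('category')
ax.set_ylabel('val')
fig.savefig('strip_plot.png')  # hypothetical output file name
```

The category only decides the horizontal position, so the single variable `val` is the only quantity actually being plotted.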