python pandas numpy matplotlib plotly

How to plot numpy arrays in a pandas DataFrame


I have the DataFrame:

df = 

sample_type         observed_data
     A          [0.2, 0.5, 0.17, 0.1]
     A          [0.9, 0.3, 0.24, 0.5]
     A          [0.9, 0.5, 0.6, 0.39]
     B          [0.01, 0.07, 0.15, 0.26]
     B          [0.08, 0.14, 0.32, 0.58]
     B          [0.01, 0.16, 0.42, 0.41]

where the data type in the observed_data column is np.array. What's the easiest and most efficient way to plot all of the numpy arrays overlaid on the same plot using matplotlib and/or plotly, showing A and B in separate colors or line types (e.g. dashed, dotted, etc.)?


Solution

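  • You can loop over the observed_data column and plot each array on the same axes, choosing the color, marker and line style from the row's sample_type: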

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'sample_type' : ['A', 'A', 'A', 'B', 'B', 'B'], 
                       'observed_data' : [[0.2, 0.5, 0.17, 0.1], [0.9, 0.3, 0.24, 0.5], [0.9, 0.5, 0.6, 0.39], 
                                          [0.01, 0.07, 0.15, 0.26], [0.08, 0.14, 0.32, 0.58], [0.01, 0.16, 0.42, 0.41]]})
    
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for ind, cell in df['observed_data'].items():
        if len(cell) > 0:  # skip empty arrays
            # evenly spaced x positions on [0, 1], one per observation
            x = np.linspace(0, 1, len(cell))
            if df.loc[ind, 'sample_type'] == 'A':
                plt.plot(x, cell, color='blue', marker='o', linestyle='-.')
            else:
                plt.plot(x, cell, color='red', marker='*', linestyle=':')
    plt.show()
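
If every array has the same length, as in the question, a variation on the loop above is to group by sample_type and draw each group with a single plot call, which also makes it easy to add a legend. This is only a sketch: the `styles` dict and the `plot_by_sample_type` helper are names made up for illustration, and the x-axis is assumed to be evenly spaced on [0, 1] as in the answer's code.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'sample_type': ['A', 'A', 'A', 'B', 'B', 'B'],
    'observed_data': [np.array([0.2, 0.5, 0.17, 0.1]),
                      np.array([0.9, 0.3, 0.24, 0.5]),
                      np.array([0.9, 0.5, 0.6, 0.39]),
                      np.array([0.01, 0.07, 0.15, 0.26]),
                      np.array([0.08, 0.14, 0.32, 0.58]),
                      np.array([0.01, 0.16, 0.42, 0.41])],
})

# Hypothetical style table: one entry per sample_type value
styles = {
    'A': dict(color='blue', marker='o', linestyle='-.'),
    'B': dict(color='red', marker='*', linestyle=':'),
}

def plot_by_sample_type(df, styles):
    """Overlay all arrays on one Axes, styled per sample_type (illustrative helper)."""
    fig, ax = plt.subplots()
    for sample_type, group in df.groupby('sample_type'):
        # Stack the equal-length arrays into shape (n_rows, n_points)
        data = np.vstack(group['observed_data'].tolist())
        x = np.linspace(0, 1, data.shape[1])
        # plot(x, Y) draws one line per column of Y, hence the transpose
        lines = ax.plot(x, data.T, **styles[sample_type])
        # Label only the first line so each group appears once in the legend
        lines[0].set_label(sample_type)
    ax.legend()
    return ax

ax = plot_by_sample_type(df, styles)
```

Call `plt.show()` afterwards to display the figure. If the arrays can differ in length, `np.vstack` will raise an error, and the row-by-row loop is the safer choice.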
    
