Search code examples
exceldataframematplotlibplotcolormap

Plotting multiple datasets in single graph


I have many datasets taken from multiple excel files that I would like to plot on the same graph each with a different color. I have created 4 spreadsheets with random data for testing. The first column defines the measurement, the code should select one of this containing 5 rows of data (X, Y), and add them to a dataframe. The results should be 1 dataset for every file to be plot all together on the same graph and having each plot of a different color.

Spreadsheets

I have been using modified pieces of codes taken on here from people which were trying to do the same thing. The problem is that I cannot color each plot differently because the program counts them as one, because due to the pd.concat() it merge these into 1 line. Do you know how I could overcome this?

Other questions asking to plot multiple datasets in single graph are almost all about a small number of dataset, while in my case I have like 50, thus cannot create a subplot for each one of them, unless there is a way to do this automatically

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
from os import path
import sys
import openpyxl 

# create a list of all excel files in the directory
xlsx_files=glob.glob(r'C:\Users\exx762\Desktop\*.xlsx')
files=[]
n=len(xlsx_files)
index=0

# select chunk of data needed from each file and add to dataframe
for file in xlsx_files:
    index+=1
    files.append(pd.read_excel(file))
    df_files=pd.concat(files)
    ph_loops=df_files[df_files['Measurement']==2]
    x = ph_loops['X']
    y = ph_loops['Y']

# plot elements in the dataframe
    ax=plt.subplot()
    colors=plt.cm.jet(np.linspace(0, 1, n))
    ax.set_prop_cycle('color', list(colors))
    ax.plot(x, y, marker='.', c=colors[index-1], linewidth=0.5, markersize=2)
    print(colors[index-1])
    ax.tick_params(axis='y', color='k')
    ax.set_xlabel('X', fontsize=12, weight='bold')
    ax.set_ylabel('Y', fontsize=12, weight='bold')
    ax.set_title(file+'\n')
    ax.tick_params(width=2)
    plt.plot()
plt.show()

> Actual result


Solution

  • You can add an id field (I used name below) to the dataframes as you concatenate them, then you can plot in a loop. Example:

    # Create example dataframes
    dfs = []
    for i in range(1, 4):
        df = pd.DataFrame(np.random.randn(10, 2), columns=['x', 'y'])
        df.insert(0, 'name', i)
        dfs.append(df)
    
    result = pd.concat(dfs, ignore_index=True)
    
    # Plot
    fig, ax = plt.subplots()
    for name, group in result.groupby('name'):
        group.plot(x='x', y='y', ax=ax, label=name)
    
    plt.show()
    

    enter image description here