I have many datasets taken from multiple excel files that I would like to plot on the same graph each with a different color. I have created 4 spreadsheets with random data for testing. The first column defines the measurement, the code should select one of this containing 5 rows of data (X, Y), and add them to a dataframe. The results should be 1 dataset for every file to be plot all together on the same graph and having each plot of a different color.
I have been using modified pieces of codes taken on here from people which were trying to do the same thing. The problem is that I cannot color each plot differently because the program counts them as one, because due to the pd.concat()
it merge these into 1 line. Do you know how I could overcome this?
Other questions asking to plot multiple datasets in single graph are almost all about a small number of dataset, while in my case I have like 50, thus cannot create a subplot for each one of them, unless there is a way to do this automatically
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
from os import path
import sys
import openpyxl
# create a list of all excel files in the directory
xlsx_files=glob.glob(r'C:\Users\exx762\Desktop\*.xlsx')
files=[]
n=len(xlsx_files)
index=0
# select chunk of data needed from each file and add to dataframe
for file in xlsx_files:
index+=1
files.append(pd.read_excel(file))
df_files=pd.concat(files)
ph_loops=df_files[df_files['Measurement']==2]
x = ph_loops['X']
y = ph_loops['Y']
# plot elements in the dataframe
ax=plt.subplot()
colors=plt.cm.jet(np.linspace(0, 1, n))
ax.set_prop_cycle('color', list(colors))
ax.plot(x, y, marker='.', c=colors[index-1], linewidth=0.5, markersize=2)
print(colors[index-1])
ax.tick_params(axis='y', color='k')
ax.set_xlabel('X', fontsize=12, weight='bold')
ax.set_ylabel('Y', fontsize=12, weight='bold')
ax.set_title(file+'\n')
ax.tick_params(width=2)
plt.plot()
plt.show()
You can add an id field (I used name
below) to the dataframes as you concatenate them, then you can plot in a loop. Example:
# Create example dataframes
dfs = []
for i in range(1, 4):
df = pd.DataFrame(np.random.randn(10, 2), columns=['x', 'y'])
df.insert(0, 'name', i)
dfs.append(df)
result = pd.concat(dfs, ignore_index=True)
# Plot
fig, ax = plt.subplots()
for name, group in result.groupby('name'):
group.plot(x='x', y='y', ax=ax, label=name)
plt.show()