Reading multiple CSV files to plot multiple curves on the same graph with Python

I run several numerical computations, and the results of each computation are stored in a .csv file, let's say data1.csv, data2.csv, data3.csv, etc., each composed of 4 columns. I would like to read columns 2 and 4 of several csv files and plot column 4 as a function of column 2 for each file on the same graph, to compare the numerical computations.

I currently succeed in plotting 1 curve, but I have not managed to automate the procedure for n .csv files.

Here is my code:

import csv
import matplotlib.pyplot as plt

x = []
y = []
path='/ref_path/'
calcul_id='ref_computation/'
file='data1.csv'
file_in=path+calcul_id+file
with open(file_in,'r') as csvfile:
    plots = csv.reader(csvfile, delimiter = ',')
    for row in plots:
        x.append(float(row[1]))
        y.append(float(row[3]))

plt.plot(x,y)
plt.show()

Can you help me? Thanks a lot!

Solution

Essentially, what you want to do is iterate over the folder containing all your csv files. You can use the glob module, which is part of the Python standard library.

Your code will look something like this:

import csv
import glob
import matplotlib.pyplot as plt

directory_countaining_csv_files = '...'

number_of_files = len(glob.glob(f'{directory_countaining_csv_files}/data*.csv'))

for filepath in glob.iglob(f'{directory_countaining_csv_files}/data*.csv'):
    # Start from empty lists so each file gets its own curve
    x = []
    y = []
    with open(filepath, 'r') as csvfile:
        plots = csv.reader(csvfile, delimiter=',')
        for row in plots:
            x.append(float(row[1]))  # column 2
            y.append(float(row[3]))  # column 4

    # One curve per file, labelled with the file path
    plt.plot(x, y, label=filepath)

# Get the handles and labels of the plotted curves
handles, labels = plt.gca().get_legend_handles_labels()

# Specify the order of items in the legend
# (as written this keeps the plotting order; rearrange the indices to change it)
order = list(range(number_of_files))

plt.legend([handles[idx] for idx in order], [labels[idx] for idx in order])

plt.show()

The pattern f'{directory_countaining_csv_files}/data*.csv' makes glob.iglob yield every file in that directory whose name starts with "data" and ends with ".csv". I advise you to take a look at the Python documentation: https://docs.python.org/fr/3.6/library/glob.html
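To see what the data*.csv pattern does and does not match, here is a small sketch you can run on its own. The file names are made up for the demo and are created in a temporary directory, so nothing here touches your real results.

```python
import glob
import os
import tempfile

# Hypothetical file names, created empty in a throwaway directory for the demo.
names = ("data1.csv", "data2.csv", "database.csv", "results.csv", "data3.txt")

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in names:
        open(os.path.join(tmp_dir, name), "w").close()

    pattern = os.path.join(tmp_dir, "data*.csv")
    # glob makes no promise about ordering, so sort before displaying.
    matched = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(matched)  # ['data1.csv', 'data2.csv', 'database.csv']
```

Note that database.csv is matched too, because * stands for any run of characters. If other csv files in the folder start with "data", a tighter pattern such as data[0-9]*.csv may be safer.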

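I also added a way to control the order of the legend entries in the final plot; I found the idea in this example: https://www.statology.org/matplotlib-legend-order/ . As written, order simply keeps the order in which the files were plotted, so rearrange its indices to get the order you want.

This implementation can be awkward. Two other ways to do it would be: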

Sort files inside your folder by hand.
Use glob.glob() instead of glob.iglob().

glob.glob() returns a list of the matching csv file paths in your directory. You can sort this list and iterate over it; the rest of the code stays the same.

list_csv = glob.glob(f'{directory_countaining_csv_files}/data*.csv')
list_csv.sort()

for filepath in list_csv:
    x = []
    y = []
    # ... same code as before ...
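One thing to watch for with list.sort(): strings are sorted character by character, so data10.csv ends up before data2.csv. If you have ten or more files, a sort key that extracts the number from the file name may help. Below is a minimal sketch; numeric_key is a helper name I made up, the paths are hypothetical, and it assumes each file name contains a single run of digits.

```python
import os
import re


def numeric_key(path):
    """Return the first integer found in the file name, or -1 if there is none."""
    match = re.search(r"\d+", os.path.basename(path))
    return int(match.group()) if match else -1


# Hypothetical paths standing in for the result of glob.glob(...)
list_csv = ["results/data10.csv", "results/data2.csv", "results/data1.csv"]

lexicographic = sorted(list_csv)
numeric = sorted(list_csv, key=numeric_key)

print(lexicographic)  # ['results/data1.csv', 'results/data10.csv', 'results/data2.csv']
print(numeric)        # ['results/data1.csv', 'results/data2.csv', 'results/data10.csv']
```

In the loop above you would then call list_csv.sort(key=numeric_key) instead of list_csv.sort().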