python, csv, parsing, glob, gaussian

Iterating through a directory of csv files and storing the columns in an array


I am trying to iterate through a directory of CSV files. Each .csv file contains two columns (x values and y values). I want to loop through the files, store the x and y values in arrays, and plot an x-y graph for all of the files. I have attached the code I am working on, but it does not produce the output I want. A sample of one CSV file:

#x0                  y0 ###################
#-7.66E-06,          17763###################
#-7.60E-06,          2853#####################
#-7.53E-06..etc,     3694...etc####################

I have tried this piece of code, but it does not give me the expected result:

import cv
import glob
path=r"E:\Users\...\...\qudi"
files=glob.glob(path,'*.csv')
data_frame=pd.DataFrame()
xData=[]
yData=[]

for file in files:
    #reading the content of the csv file
    df=pd.read_csv(file,index_col=None)
    content.append(df)
# converting content to data frame
data_frame=pd.concat(content)
print(data_frame)
#     with open(path,"r") as f_in:
#         reader=csv.reader(f_in)
#         next(reader)
#         for line in reader:
#             try:
#                 print(line)
#                 float_1,float_2=float(line[0]),float(line[1])
#                 xData.append(float_1)
#                 yData.append(float_2)
#             except ValueError:
#                     continue 

Any suggestions would go a long way.


Solution

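  • You can build the main DataFrame by adding each file's DataFrame to it as soon as the file is read, so there is no need for a separate content list and a final pd.concat over it. Create the empty main DataFrame first: your current code appends to content without ever creating it. Note that the constructor is pd.DataFrame (capitalized), and that DataFrame.append was removed in pandas 2.0, so on current versions the per-file append is written with pd.concat.

    df_main = pd.DataFrame()  # empty frame to accumulate into
    for file in files:
        df = pd.read_csv(file, index_col=None)
        # on pandas < 2.0 this was: df_main = df_main.append(df)
        df_main = pd.concat([df_main, df], ignore_index=True)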
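
    The question also asks for the x and y values of each file as arrays and for a plot, which a single concatenated DataFrame does not give directly. Below is a minimal sketch of one way to do that with the standard-library csv module, in the spirit of the commented-out reader loop in the question. It is not the only approach. The function names load_xy, load_directory and plot_directory are made up for illustration. The sketch assumes each file has a header line followed by "x, y" rows, and that the '#' marks in the posted sample may or may not be present in the real files. It also builds the glob pattern with os.path.join, because glob.glob takes a single pattern string rather than a directory and a wildcard as two arguments.

```python
import csv
import glob
import os


def load_xy(csv_path):
    """Return (xs, ys) as lists of floats read from a two-column CSV file."""
    xs, ys = [], []
    with open(csv_path, newline="") as f_in:
        for row in csv.reader(f_in):
            try:
                # Assumption: strip(" #") tolerates the padding and '#' marks
                # that appear in the sample shown in the question.
                x, y = float(row[0].strip(" #")), float(row[1].strip(" #"))
            except (ValueError, IndexError):
                continue  # header, blank, or malformed line
            xs.append(x)
            ys.append(y)
    return xs, ys


def load_directory(directory):
    """Map each CSV file name in `directory` to its (xs, ys) pair."""
    # glob.glob takes one pattern string, so join the directory and wildcard.
    pattern = os.path.join(directory, "*.csv")
    return {os.path.basename(p): load_xy(p) for p in sorted(glob.glob(pattern))}


def plot_directory(directory):
    """Draw one x-y curve per file on shared axes (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so loading works without it

    for name, (xs, ys) in load_directory(directory).items():
        plt.plot(xs, ys, label=name)
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend()
    plt.show()
```

    Calling plot_directory(path) with the directory from the question would then show every file as its own labeled curve. Keeping the data in a dictionary keyed by file name, instead of two flat xData and yData lists, keeps the points of different files from being joined into one line.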