python arrays pandas concatenation glob

Parsing starting points of concatenated files


I have 3 CSV files in 3 different folders that I need to merge, and then average the values on each individual line (there are about 4000 lines in each file).

I have managed to find the files with glob and load them into pandas DataFrames. However, when I concatenate them, the result is not laid out the way I want.

import glob

import pandas as pd

path = '/home/alispahic/1.CB1_project/12.Production_Runs/'
all_files = glob.glob(path + '*/3.IVa*/rmsf.csv')

li = []

for filename in all_files:
    data = pd.read_csv(filename, index_col=None, header=0)
    data['Atom'] = data['Atom'].astype(int)
    data['(nm)'] = data['(nm)'].astype(float)

    # Keep only the '(nm)' column from each file
    li.append(data['(nm)'])

# axis=0 stacks the Series end to end, giving one long column
frame = pd.concat(li, axis=0, ignore_index=True)

What I want is output where the values from these files are not merged into one column, but instead form 3 columns of 4000 rows each, so that I can access the values that way.


Solution

  • With axis=0 the Series are stacked end to end into one column. You need to concatenate along the column axis (axis=1) to get 3 columns:

    frame = pd.concat(li, axis=1, ignore_index=True)
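Since the goal in the question is to average the values on each line, here is a minimal sketch of how that could look once the columns sit side by side. The three Series and their values are made up to stand in for the '(nm)' column of each rmsf.csv, and the name row_mean is only illustrative:

```python
import pandas as pd

# Hypothetical stand-ins for the '(nm)' column read from each file.
run1 = pd.Series([0.10, 0.20, 0.30], name="(nm)")
run2 = pd.Series([0.20, 0.30, 0.40], name="(nm)")
run3 = pd.Series([0.30, 0.40, 0.50], name="(nm)")
li = [run1, run2, run3]

# axis=1 places each Series in its own column; ignore_index=True
# relabels the columns 0, 1, 2 instead of repeating '(nm)'.
frame = pd.concat(li, axis=1, ignore_index=True)

# Average across the columns, giving one mean per row (per atom).
row_mean = frame.mean(axis=1)

print(frame)
print(row_mean)
```

Note that concatenating along axis=1 aligns rows by index, so this assumes every file lists the atoms in the same order. If that is not guaranteed, setting 'Atom' as the index before concatenating would be the safer route.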