I am trying to extract specific columns from multiple CSV files and merge them into one. Each file contains 265 columns, and selecting specific columns by their index number is very difficult. Is there an efficient way to accomplish this task?
I have around 120 csv files.
Here's a solution in pandas to extract two columns by name from all *.csv files in the current directory.
The code:
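import pandas as pd
from glob import glob

# Columns to extract from every file, selected by header name
seek_cols = ["FccFaultB1", "FccFaultB2"]
infiles = glob("*.csv")

frames = []
for infile in infiles:
    # usecols parses only the wanted columns; [seek_cols] fixes their order
    frames.append(pd.read_csv(infile, usecols=seek_cols)[seek_cols])

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df = pd.concat(frames)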
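Since the question asks for a single merged file, here is a minimal sketch that wraps the same idea in a function and writes the result to disk. The function name `merge_columns` and the output path are placeholders of mine, not part of the answer above. It assumes every input file has all the requested columns, and it skips the output file in case it matches the glob pattern on a later run.

```python
from glob import glob
from pathlib import Path

import pandas as pd


def merge_columns(pattern, columns, out_path):
    """Read `columns` from every CSV matching `pattern` and write one merged CSV.

    Hypothetical helper for illustration; assumes each file has all `columns`.
    """
    out_path = Path(out_path)
    # Skip the output file itself in case an earlier run left it in the same folder
    infiles = sorted(
        f for f in glob(pattern) if Path(f).resolve() != out_path.resolve()
    )
    if not infiles:
        raise FileNotFoundError(f"no input files match {pattern!r}")
    # usecols parses only the requested columns; [columns] fixes their order
    frames = [pd.read_csv(f, usecols=columns)[columns] for f in infiles]
    # ignore_index=True renumbers the rows 0..n-1 across all files
    merged = pd.concat(frames, ignore_index=True)
    merged.to_csv(out_path, index=False)
    return merged
```

A call might look like `merge_columns("*.csv", ["FccFaultB1", "FccFaultB2"], "merged.csv")`. Concatenating once at the end, rather than growing a DataFrame inside the loop, avoids re-copying the accumulated rows for each of the roughly 120 files.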
In my case, test1.csv:
FccFaultB0,FccFaultB1,FccFaultB2,FccFaultB3
0,0,0,0
and test2.csv:
FccFaultB0,FccFaultB1,FccFaultB2,FccFaultB3
1,1,1,1
Resulting in df:
   FccFaultB1  FccFaultB2
0           1           1
0           0           0
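Both rows above carry the index label 0 because each file's own row numbering is kept, and nothing records which file a row came from. If that matters, one possible variation is sketched below. The helper name `tag_and_stack` and the `source` column are my own additions, and the frames are built inline only so the snippet runs without any files on disk.

```python
import pandas as pd


def tag_and_stack(frames_by_file):
    """Stack per-file frames, adding a 'source' column and a fresh 0..n-1 index.

    Hypothetical helper; `frames_by_file` maps a file name to its DataFrame.
    """
    tagged = [frame.assign(source=name) for name, frame in frames_by_file.items()]
    return pd.concat(tagged, ignore_index=True)


# Stand-ins for the already-filtered contents of test1.csv and test2.csv
frames_by_file = {
    "test1.csv": pd.DataFrame({"FccFaultB1": [0], "FccFaultB2": [0]}),
    "test2.csv": pd.DataFrame({"FccFaultB1": [1], "FccFaultB2": [1]}),
}
stacked = tag_and_stack(frames_by_file)
```

In real use, the dictionary would be filled from `pd.read_csv` inside the loop over the globbed file names.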