I have 15 datasets (data frames), named data_1 to data_15. All of them are supposed to have the same column names. I would like to check that all columns have the same name and position before concatenating them. When I concatenated them, I ended up with an extra column because one column name in one dataset was misspelled. I used the following code on each dataset, but I would like to improve my skills and save time.
print(list(data_1))
The code I use to concatenate all datasets is the following:
pd.concat([data_1, data_2, ..., data_15])
Put all the dataframes in a list, then use all() to test if the column names are all the same.
columns = list(data_1.columns.values)
df_list = [data_2, ..., data_15]
if all(list(df.columns.values) == columns for df in df_list):
    # all column names and positions match, so concatenate all the dataframes
    combined = pd.concat([data_1] + df_list)
else:
    print("Columns don't match")
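The check above only says whether something is wrong, not where. If it helps, here is a minimal sketch that also reports which dataframe differs and which column names are involved. The helper name find_column_mismatches, the dict of labels, and the toy data (including the misspelled "prcie" column) are all made up for illustration; swap in your own data_1 to data_15.

```python
import pandas as pd


def find_column_mismatches(frames):
    """Compare every frame's columns to the first frame's (name and position).

    `frames` is a dict mapping a label (hypothetical, e.g. "data_1") to a
    DataFrame. Returns {label: (missing, unexpected)} for frames that differ.
    """
    labels = list(frames)
    reference = list(frames[labels[0]].columns)
    mismatches = {}
    for label in labels[1:]:
        cols = list(frames[label].columns)
        if cols != reference:  # list comparison checks names and order
            missing = [c for c in reference if c not in cols]
            unexpected = [c for c in cols if c not in reference]
            mismatches[label] = (missing, unexpected)
    return mismatches


# Toy stand-ins for data_1 ... data_15; "prcie" is a deliberate typo.
frames = {
    "data_1": pd.DataFrame({"id": [1], "price": [9.5]}),
    "data_2": pd.DataFrame({"id": [2], "price": [7.0]}),
    "data_3": pd.DataFrame({"id": [3], "prcie": [8.25]}),
}

mismatches = find_column_mismatches(frames)
if mismatches:
    for label, (missing, unexpected) in mismatches.items():
        print(f"{label}: missing {missing}, unexpected {unexpected}")
else:
    combined = pd.concat(list(frames.values()), ignore_index=True)
```

A frame whose columns have the right names but in a different order is still flagged, just with both lists empty, so a purely positional mismatch does not slip through.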