I am reading multiple CSV files (20 files) and combining them into one data frame. I checked the column names by eye and they look the same in every file, but for some reason I get the error below.
Error in match.names(clabs, names(xi)) : names do not match previous names
This is the code that I wrote:
fnames <- list.files("C:/Users/code", pattern = '^La', full.names = TRUE) # get all matching files with their full paths, so read.csv works from any working directory; update the path as required
csv <- lapply(fnames,read.csv) # reading all the files
source_DF <- do.call(rbind, lapply(csv, '[', 1:8)) # This is the line where it throws error
Please note that I use 1:8 because R sometimes reads an uneven number of columns. For example, all my CSV files have only 8 columns, but when they are read, some end up with 12 columns and some with as many as 50. To avoid that, I select 1:8. Any other approach to reading only the first 8 columns is also welcome.
How can I find out which of the CSV files has this naming issue, and which columns are causing it?
Any help resolving this error would be appreciated.
One way to check is to subset the first 8 columns from each data frame, get the names common to all of the data frames, and then use setdiff to find any mismatched column names:
list_df <- lapply(csv, '[', 1:8)
cols <- Reduce(intersect, lapply(list_df, names))
lapply(list_df, function(x) setdiff(names(x), cols))
If all your column names are the same, you should get character(0) as the output for each data frame. If there is a mismatch, setdiff will display the names of the columns that are not shared by every data frame.
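One caveat with the intersect approach: a name that is misspelled in a single file drops out of the common set, so the correctly spelled name is reported for every other file too. A variation that may make the odd file stand out is to compare each data frame against one reference, here the first data frame. This is only a sketch on toy data; the file names and column names are made up for illustration, and it assumes the first file has the names you expect.

```r
# Toy stand-ins for the data frames read from disk (hypothetical file names).
list_df <- list(
  La_01.csv = data.frame(id = 1:2, value = c(0.5, 0.7)),
  La_02.csv = data.frame(id = 3:4, Value = c(0.1, 0.9)),  # note the capital V
  La_03.csv = data.frame(id = 5:6, value = c(0.3, 0.2))
)

# Assumption: the first data frame carries the expected column names.
reference <- names(list_df[[1]])

# For each file, the names that do not appear in the reference.
mismatch <- lapply(list_df, function(x) setdiff(names(x), reference))

# Keep only the files with at least one unexpected name.
offenders <- Filter(length, mismatch)
print(offenders)
```

With the objects from the question, naming the list first, for example with names(list_df) <- basename(fnames), lets the result be read per file.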
Another quick check: is length(cols) equal to 8? If it is less than 8, at least one column name is not shared by all of the files.
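Once the mismatched names are known, one possible way to get past the rbind error is to give every data frame the same names before binding. The sketch below uses toy data and assumes that the columns are in the same order in every file and differ only in spelling; if the order differs, renaming by position would silently mix up columns, so check that first.

```r
# Toy data: same columns in the same order, but one name is spelled differently.
list_df <- list(
  data.frame(id = 1:2, value = c(0.5, 0.7)),
  data.frame(id = 3:4, Value = c(0.1, 0.9))
)

# Assumption: the first data frame has the names every file should use.
reference <- names(list_df[[1]])

# Rename by position, then bind the rows.
renamed <- lapply(list_df, setNames, nm = reference)
source_DF <- do.call(rbind, renamed)
print(dim(source_DF))
```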