I have an xlsx file that contains let's say 4 columns:
Thingie1
Thingie1
Thingie2
Thingie2
(I know it's stupid... it's because I am combining data from 2 different databases, and the 1
and 2
denote a difference between survey 1 and survey 2 and chose to dump out into a xlsx file.
When I import this into R using the xlsx package, it changes the column titles to:
Thingie1
Thingie3
Thingie2
Thingie4
Obviously my real-world example is much more complex and it then becomes very difficult to identify what is what. I would like to have something like:
Thingie1
Thingie11
Thingie2
Thingie21
Now, I can resolve this by opening the file in excel and changing the columns there... so I can solve my immediate problem, but I hate breaking my workflow... would there be some way to import this data in a better way into R itself?
I think you should read the first line (containing column names) alone of your file with header=FALSE
for proper naming of columns. Then use make.unique
function upon it. Later read the whole file with header=TRUE
and set the column names using colnames
function.
x <- c("Thingie1", "Thingie1", "Thingie2", "Thingie2")
x
#[1] "Thingie1" "Thingie1" "Thingie2" "Thingie2"
make.unique(x)
#[1] "Thingie1" "Thingie1.1" "Thingie2" "Thingie2.1"