Search code examples
rmergedataframereshape

Merge Multiple Data Frames by Row Names


I'm trying to merge multiple data frames by row names.

I know how to do it with two:

x = data.frame(a = c(1,2,3), row.names = letters[1:3])
y = data.frame(b = c(1,2,3), row.names = letters[1:3])
merge(x,y, by = "row.names")

But when I try using the reshape package's merge_all() I'm getting an error.

z = data.frame(c = c(1,2,3), row.names = letters[1:3])
l = list(x,y,z)
merge_all(l, by = "row.names")

Error in -ncol(df) : invalid argument to unary operator

What's the best way to do this?


Solution

  • Merging by row.names does weird things - it creates a column called Row.names, which makes subsequent merges hard.

    To avoid that issue you can instead create a column with the row names (which is generally a better idea anyway - row names are very limited and hard to manipulate). One way of doing that with the data as given in OP (not the most optimal way, for more optimal and easier ways of dealing with rectangular data I recommend getting to know data.table instead):

    Reduce(merge, lapply(l, function(x) data.frame(x, rn = row.names(x))))