I am looking for a solution to a problem similar to this one: Match rows of a matrix to the rows of another, regardless of the column order?
But the difference is that I am using strings and not numerals:
df1 <- data.frame (name_1 = c("A", "B", "C"),
name_2 = c("C", "D", "B"))
df2 <- data.frame (name_1 = c("C", "D", "C"),
name_2 = c("A", "B", "D"))
but, in my real data, it is a full string consisting of numbers and letters:
df1 <- data.frame (names_1 = c("AB1C", "BR2G"),
names_2 = c("CFG", "DOR5F"))
df2 <- data.frame (names_1 = c("CFG", "DOR5F"),
names_2 = c("AB1C", "BR2G"))
such that:
>function(df1,df2, by = c("names_1", "names_2")) # I was trying merge
would either return a data.table:
Genes_1 Genes_2
1 C A
2 D B
or would return indexes from one that match to the matching rows in another:
1 2
I think I could probably just create a transposed duplicate of one of the 2 tables and run merge two separate times:
df3 <- df2
names(df3) <- c("name_2", "name_1")
merge(df1, df2, by = c("name_1", "name_2") )
#no result
merge(df1, df3, by = c("name_1", "name_2") )
name_1 name_2
1 A C
2 B D
Seems you are interested in merging without considering the order. If there are more than 2 variables, consider sorting rowwise. If only two variables as in your case, consider rbinding the second dataframe to a reversed version of itself before merging:
merge(df1, rbind(df2, setNames(rev(df2),names(df2))))
name_1 name_2
1 A C
2 B D