Given a variable x that can take the values A, B, C, D, and three columns holding values of x:
df1 <- rbind(c("A","B","C"), c("A","D","C"), c("B","A","C"), c("A","C","B"),
             c("B","C","A"), c("D","A","B"), c("A","B","D"), c("A","D","C"),
             c("A",NA,NA), c("D","A",NA), c("A","D",NA))
How do I make a column indicating the combination of values in the three preceding columns, such that permutations (ABC, ACB, BAC) are treated as the same combination ABC, and (AD, DA) are treated as the same combination AD?
Pasting the three columns with
apply(df1, 1, function(x) paste(x[!is.na(x)], collapse = ", ")) -> df1$x4
and then counting with
df1 %>% group_by(x4) %>% summarize(c = n())
would count AD and DA as different combinations instead of the same one.
My desired result would be to get
a <- cbind(c("ABC", 4), c("ACD", 2), c("ABD", 2), c("A", 1), c("AD", 2))
Someone already solved my question. Thanks
You can apply paste() after sorting each row vector. sort() drops NA values by default, so rows with missing entries need no special handling.
df1 <- cbind(df1, apply(df1, 1, function(x) paste(sort(x), collapse = "")))
df1
#      [,1] [,2] [,3] [,4]
# [1,] "A"  "B"  "C"  "ABC"
# [2,] "A"  "D"  "C"  "ACD"
# [3,] "B"  "A"  "C"  "ABC"
# [4,] "A"  "C"  "B"  "ABC"
# [5,] "B"  "C"  "A"  "ABC"
# [6,] "D"  "A"  "B"  "ABD"
# [7,] "A"  "B"  "D"  "ABD"
# [8,] "A"  "D"  "C"  "ACD"
# [9,] "A"  NA   NA   "A"
#[10,] "D"  "A"  NA   "AD"
#[11,] "A"  "D"  NA   "AD"
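The question writes to df1$x4, which suggests a data frame rather than a matrix. Under that assumption, the same sort-then-paste idea could be sketched as follows. The object name dat is hypothetical, and x4 is the column name used in the question.

```r
# Hypothetical data frame holding the same rows as the question's matrix
dat <- as.data.frame(
  rbind(c("A","B","C"), c("A","D","C"), c("B","A","C"), c("A","C","B"),
        c("B","C","A"), c("D","A","B"), c("A","B","D"), c("A","D","C"),
        c("A",NA,NA), c("D","A",NA), c("A","D",NA)),
  stringsAsFactors = FALSE
)

# Sort the values of each row, then paste them into one key.
# sort() removes NA by default, so "D","A",NA becomes "AD".
dat$x4 <- apply(dat[, 1:3], 1, function(x) paste(sort(x), collapse = ""))
```

Restricting apply() to dat[, 1:3] keeps the key from being built from itself if the line is run a second time.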
You can now simply call table() on the new column; there is no need to load an external package or build more complex pipes.
table(df1[, 4])
#  A ABC ABD ACD  AD
#  1   4   2   2   2
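If the counts are needed as an object rather than printed output, one possible sketch is below. It starts from the sorted keys shown in the fourth column above; the names combo, counts and a are only illustrative. Note that table() orders the combinations alphabetically, so the columns come out as A, ABC, ABD, ACD, AD rather than in the order listed in the question.

```r
# Sorted combination keys, as in the fourth column of the output above
combo <- c("ABC", "ACD", "ABC", "ABC", "ABC", "ABD", "ABD", "ACD", "A", "AD", "AD")

tab <- table(combo)

# One row per combination with its frequency (columns: combo, Freq)
counts <- as.data.frame(tab, stringsAsFactors = FALSE)

# Two-row character matrix: combination on top, count below,
# similar in layout to the cbind() result asked for in the question
a <- rbind(names(tab), as.vector(tab))
```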