Hi everyone. I searched the forum for an answer to this "easy" question, but without success; a similar question may already exist. I have the following data frame:
ID <- rep(1:5, 4)
LAB <- c("A", "B", "C", "A")   # shorter than ID, so cbind() recycles it
datain <- data.frame(cbind(ID, LAB))
I would like to know whether there is a function in R that returns, for each ID, the distinct LAB values without duplicates. Something like:
ID <- rep(1:5, 4)
LAB <- c("A B C")
dataout <- data.frame(cbind(ID, LAB))
dataout
ID LAB
1 1 A B C
2 2 A B C
3 3 A B C
4 4 A B C
5 5 A B C
6 1 A B C
7 2 A B C
8 3 A B C
9 4 A B C
10 5 A B C
11 1 A B C
12 2 A B C
13 3 A B C
14 4 A B C
15 5 A B C
16 1 A B C
17 2 A B C
18 3 A B C
19 4 A B C
20 5 A B C
My apologies for not having specified the output earlier!
As always, any help is greatly appreciated!
I think you are looking for split:
with(datain, split(LAB, ID))
# $`1`
# [1] A B C A
# Levels: A B C
#
# $`2`
# [1] B C A A
# Levels: A B C
#
# $`3`
# [1] C A A B
# Levels: A B C
#
# $`4`
# [1] A A B C
# Levels: A B C
#
# $`5`
# [1] A B C A
# Levels: A B C
Since each ID might have a different number of LABs, the output is a list.
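To see why a list is the natural return type, here is a small sketch on made-up data in which the IDs really do have different numbers of LABs. The data frame toy and the name by_id are hypothetical, introduced only for this illustration.

```r
# Hypothetical toy data: ID 1 has two LABs, ID 2 has three.
toy <- data.frame(
  ID  = c(1, 1, 2, 2, 2),
  LAB = c("A", "B", "A", "C", "D"),
  stringsAsFactors = FALSE
)

# split() returns one list element per ID, each with its own length.
by_id <- with(toy, split(LAB, ID))

# Number of LAB values stored for each ID.
lengths(by_id)
```

A rectangular structure such as a matrix would need padding here, which is why split() sticks to a list.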
Edit: Since it now appears you only wanted unique values, do:
with(unique(datain), split(LAB, ID))
and if you would rather get character vectors than factors, do:
with(unique(datain), split(as.character(LAB), ID))
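If what you want is exactly the dataout shown in the question (one collapsed string repeated on every row), a minimal sketch in base R is to paste the sorted unique values per ID with ave(). The helper name collapse_unique is made up for this example, and stringsAsFactors = FALSE is set explicitly as an assumption so that LAB is a character vector whatever the R version.

```r
# Rebuild the example data from the question, keeping LAB as character.
ID  <- rep(1:5, 4)
LAB <- rep(c("A", "B", "C", "A"), length.out = length(ID))
datain <- data.frame(ID, LAB, stringsAsFactors = FALSE)

# Hypothetical helper: sorted unique values pasted into a single string.
collapse_unique <- function(x) paste(sort(unique(x)), collapse = " ")

# ave() applies the helper within each ID and repeats the result
# on every row belonging to that ID.
dataout <- transform(datain, LAB = ave(LAB, ID, FUN = collapse_unique))

head(dataout, 3)
#   ID   LAB
# 1  1 A B C
# 2  2 A B C
# 3  3 A B C
```

If one row per ID is enough, sapply(with(unique(datain), split(LAB, ID)), paste, collapse = " ") should give a named character vector instead, although the values then appear in order of first occurrence rather than sorted.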