I have the following dataframe in R
DF2<-data.frame("ID"=c("A", "A", "A", "B", "B", "B", "B", 'B'),
'Freq'=c(1,2,3,1,2,3,4,5), "Val"=c(1,2,4, 2,3,4,5,8))
The data frame looks like this:
ID Freq Val
1 A 1 1
2 A 2 2
3 A 3 4
4 B 1 2
5 B 2 3
6 B 3 4
7 B 4 5
8 B 5 8
I want to melt and recast the dataframe to yield the following dataframe
A_Freq A_Value B_Freq B_Value
1 1 1 1 2
2 2 2 2 3
3 3 4 3 4
4 NA NA 4 5
5 NA NA 5 8
I have tried the following code
DF3<-melt(DF2, by=ID)
DF3$ID<-paste0(DF3$ID, DF3$variable)
DF3$variable<-NULL
DF4<-dcast(DF3, value~ID)
This yields the following dataframe
value AFreq AVal BFreq BVal
1 1 1 1 1 NA
2 2 2 2 2 2
3 3 3 NA 3 3
4 4 NA 4 4 4
5 5 NA NA 5 5
6 8 NA NA NA 8
How can I obtain the desired result shown earlier? I have tried other variations of dcast but have not been able to get it. Any help would be appreciated.
One option would be
library(tidyverse)
DF2 %>%
gather(key, val, -ID) %>%
unite(IDkey, ID, key) %>%
group_by(IDkey) %>%
mutate(rn = row_number()) %>%
spread(IDkey, val) %>%
select(-rn)
# A tibble: 5 x 4
# A_Freq A_Val B_Freq B_Val
# <dbl> <dbl> <dbl> <dbl>
#1 1 1 1 2
#2 2 2 2 3
#3 3 4 3 4
#4 NA NA 4 5
#5 NA NA 5 8
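As a hedged aside, the same steps can be sketched with pivot_longer/pivot_wider, which newer tidyr releases recommend over gather/spread. This assumes tidyr 1.0.0 or later is installed; the intermediate names key, val, IDkey and rn are just the illustrative names used above, and res is a hypothetical name for the result.

```r
library(dplyr)
library(tidyr)

DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

res <- DF2 %>%
  # wide -> long: one row per (ID, column name, value)
  pivot_longer(-ID, names_to = "key", values_to = "val") %>%
  # build the future column names, e.g. "A_Freq"
  unite(IDkey, ID, key) %>%
  # number the rows within each future column
  group_by(IDkey) %>%
  mutate(rn = row_number()) %>%
  ungroup() %>%
  # long -> wide: one column per IDkey, one row per rn
  pivot_wider(names_from = IDkey, values_from = val) %>%
  select(-rn)
```

Groups that are shorter than the longest one (here A) are padded with NA, since pivot_wider fills missing combinations with NA by default.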
Or using melt/dcast. We melt, specifying the id.var as "ID" (as a string), to convert from 'wide' to 'long' format. Then, using dcast, we reshape from 'long' to 'wide' with the expression rowid(ID, variable) ~ paste(ID, variable, sep="_"). The rhs of ~ pastes the column values together, while rowid gets the sequence id for the ID, variable columns.
library(data.table)
dcast(melt(setDT(DF2), id.var = "ID"), rowid(ID, variable) ~
paste(ID, variable, sep="_"))[, ID := NULL][]
# A_Freq A_Val B_Freq B_Val
#1: 1 1 1 2
#2: 2 2 2 3
#3: 3 4 3 4
#4: NA NA 4 5
#5: NA NA 5 8
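To make the intermediate steps visible, the one-liner can be unrolled roughly as follows. This is only a sketch: long, wide, rn and col are hypothetical names introduced here, and as.data.table is used instead of setDT so that DF2 itself is left unmodified.

```r
library(data.table)

DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

# wide -> long: columns ID, variable ("Freq"/"Val"), value
long <- melt(as.data.table(DF2), id.vars = "ID")

# sequence number within each (ID, variable) group: 1,2,3 for A and 1..5 for B
long[, rn := rowid(ID, variable)]

# name of the target column, e.g. "A_Freq"
long[, col := paste(ID, variable, sep = "_")]

# long -> wide: one row per rn, one column per col
wide <- dcast(long, rn ~ col, value.var = "value")
wide[, rn := NULL]
```

Casting on rn rather than on value is what keeps the result at five rows: each row of the output corresponds to a position within a group, not to a distinct value.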
In the OP's code, the expression is value ~ ID, so it creates a column 'value' with one row for each unique element of 'value' and, at the same time, automatically picks up the value.var as 'value', resulting in more rows than expected.
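A quick base R check shows where the six rows come from; vals and row_keys are hypothetical names used only for this sketch.

```r
DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

# after melting, the 'value' column holds every Freq and Val entry
vals <- c(DF2$Freq, DF2$Val)

# casting with value on the lhs creates one row per distinct value
row_keys <- sort(unique(vals))
row_keys          # 1 2 3 4 5 8
length(row_keys)  # 6
```

Those six distinct values are the six rows of the unwanted output, whereas the desired result has only as many rows as the largest group (five, for ID "B").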