After transposing data (casting from long to wide), I'd like to re-assign the attributes that get dropped. The same approach could also apply to copying attributes from one data frame to another, or to restoring them after a mutate or similar step that drops them.
library(reshape2)
df <- data.frame(id = c(1,2,3,4,5),
time = c(11, 22,33,44,55),
c = c(1,2,3,5,5),
d = c(4,2,5,4,NA))
attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$d,"label")<- "count of something"
str(df)
'data.frame': 5 obs. of 4 variables:
$ id : num 1 2 3 4 5
..- attr(*, "label")= chr "label"
$ time: num 11 22 33 44 55
..- attr(*, "label")= chr "label2"
$ c : num 1 2 3 5 5
..- attr(*, "label")= chr "something here"
$ d : num 4 2 5 4 NA
..- attr(*, "label")= chr "count of something"
Cast to wide
dfwide<- recast(df,id~variable +time,
id.var = c("id","time"))
This gives the usual warning about lost attributes:
Warning message:
attributes are not identical across measure variables; they will be dropped
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
$ c_22: num NA 2 NA NA NA
$ c_33: num NA NA 3 NA NA
$ c_44: num NA NA NA 5 NA
$ c_55: num NA NA NA NA 5
$ d_11: num 4 NA NA NA NA
$ d_22: num NA 2 NA NA NA
$ d_33: num NA NA 5 NA NA
$ d_44: num NA NA NA 4 NA
$ d_55: num NA NA NA NA NA
Using mostattributes one can copy attributes between data frames, but with many column names I can't figure out how to map them efficiently other than one column at a time.
mostattributes(dfwide$c_11)<-attributes(df$c)
mostattributes(dfwide$c_22)<-attributes(df$c)
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ c_33: num NA NA 3 NA NA
I tried to automate it but failed (all c columns should get c's label and all d columns should get d's label):
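# extract the column names (enframe is from tibble; %>%, slice and pull are from dplyr)
library(tibble)
library(dplyr)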
dlist<-enframe(names(df))%>%
slice(-1,-2)%>%
pull(., value)
dlist
dlistw<-enframe(names(dfwide))%>%
slice(-1)%>%
pull(., value)
dlistw
# copy the attributes of df[[var2]] onto dfwide[[var1]]; <<- modifies the global dfwide
mostatt<- function(var1, var2) {
mostattributes(dfwide[[var1]])<<-attributes(df[[var2]])
}
mapply(mostatt,dlistw,dlist)
str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "count of something"
$ c_33: num NA NA 3 NA NA
..- attr(*, "label")= chr "something here"
$ c_44: num NA NA NA 5 NA
..- attr(*, "label")= chr "count of something"
$ c_55: num NA NA NA NA 5
..- attr(*, "label")= chr "something here"
$ d_11: num 4 NA NA NA NA
..- attr(*, "label")= chr "count of something"
$ d_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ d_33: num NA NA 5 NA NA
..- attr(*, "label")= chr "count of something"
$ d_44: num NA NA NA 4 NA
..- attr(*, "label")= chr "something here"
$ d_55: num NA NA NA NA NA
..- attr(*, "label")= chr "count of something"
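The alternating labels in the output above come from mapply recycling the shorter vector: dlist holds only c and d, so it is paired with the ten names in dlistw as c, d, c, d, and so on. Below is a minimal sketch of one way around this, assuming every wide column is named <variable>_<time>: derive the source column from each wide name, then loop over the pairs. The small df and dfwide are hand-built stand-ins for the question's data (trimmed to two time points) so that the block runs without reshape2.

```r
# Hand-built stand-ins for the question's data (hypothetical, two time points only)
df <- data.frame(id = c(1, 2), time = c(11, 22), c = c(1, 2), d = c(4, 2))
attr(df$c, "label") <- "something here"
attr(df$d, "label") <- "count of something"
dfwide <- data.frame(id = c(1, 2),
                     c_11 = c(1, NA), c_22 = c(NA, 2),
                     d_11 = c(4, NA), d_22 = c(NA, 2))

dlistw <- names(dfwide)[-1]          # "c_11" "c_22" "d_11" "d_22"
# Drop the trailing "_<time>" part to get the source column of each wide column
dlist <- sub("_[^_]*$", "", dlistw)  # "c" "c" "d" "d"

# Both vectors now have the same length, so nothing is recycled
for (k in seq_along(dlistw)) {
  mostattributes(dfwide[[dlistw[k]]]) <- attributes(df[[dlist[k]]])
}
str(dfwide)
```

Because the pairing is computed from the names themselves, it does not depend on the order of the columns in dfwide.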
I think tidyselect's starts_with might be worth a try, but I'm not sure how to incorporate it. Any suggestions would be appreciated. Thank you!
This is an option:
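for (i in setdiff(colnames(df), "id")) {
  # every wide column whose name contains i (substring match)
  for (x in colnames(dfwide)[grepl(i, colnames(dfwide))]) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}
# restore id's own attributes, which were overwritten when i was "d"
mostattributes(dfwide$id) <- attributes(df$id)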
Because grepl matches substrings and d is contained in id, the loop also overwrites the attributes of id, so I need to rewrite id at the end.
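If renaming columns is not an option, anchoring the pattern may avoid the collision. The following is only a sketch, assuming the wide names are either identical to a source name or of the form <variable>_<time>, and that the source names contain no regex metacharacters. The data are hand-built stand-ins for df and dfwide so the block runs without reshape2.

```r
# Hand-built stand-ins for the question's data (hypothetical, two time points only)
df <- data.frame(id = c(1, 2), time = c(11, 22), c = c(1, 2), d = c(4, 2))
attr(df$id, "label") <- "label"
attr(df$time, "label") <- "label2"
attr(df$c, "label") <- "something here"
attr(df$d, "label") <- "count of something"
dfwide <- data.frame(id = c(1, 2),
                     c_11 = c(1, NA), c_22 = c(NA, 2),
                     d_11 = c(4, NA), d_22 = c(NA, 2))

for (i in colnames(df)) {
  # "^" pins the match to the start of the name, and "(_|$)" requires either
  # the separator or the end of the name next, so "d" can no longer hit "id"
  pattern <- paste0("^", i, "(_|$)")
  for (x in colnames(dfwide)[grepl(pattern, colnames(dfwide))]) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}
str(dfwide)
```

With the anchored pattern, id is matched only by itself, so no separate fix-up line is needed, and time simply matches nothing.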
If you rename d to e, it is even simpler:
df <- data.frame(id = c(1,2,3,4,5),
time = c(11, 22,33,44,55),
c = c(1,2,3,5,5),
e = c(4,2,5,4,NA))
attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$e,"label")<- "count of something"
str(df)
dfwide<- recast(df,id~variable +time,
id.var = c("id","time"))
for (i in colnames(df)) {
  for (x in colnames(dfwide)[grepl(i, colnames(dfwide))]) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}
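To check the result without reading through the str output, the labels can be collected into a named vector. This is a small sketch: get_labels is a hypothetical helper name, not part of any package, and the three-column data frame is a hand-built stand-in for dfwide after the loop has run.

```r
# Hand-built stand-in for dfwide after the labels were copied (hypothetical)
dfwide <- data.frame(id = c(1, 2), c_11 = c(1, NA), e_11 = c(4, NA))
attr(dfwide$c_11, "label") <- "something here"
attr(dfwide$e_11, "label") <- "count of something"

# Return the "label" attribute of every column, or NA where there is none
get_labels <- function(data) {
  vapply(data, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))
}

labels <- get_labels(dfwide)
print(labels)
```

Columns that come back as NA are the ones the copy step missed.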