After transposing data I'd like to re-assign the attributes that were dropped. This would also apply to copying attributes from one data frame to another, or to restoring them after mutates and similar operations that drop them.
df <- data.frame(id = c(1,2,3,4,5),
time = c(11, 22,33,44,55),
c = c(1,2,3,5,5),
d = c(4,2,5,4,NA))
attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$d,"label")<- "count of something"
'data.frame': 5 obs. of 4 variables:
$ id : num 1 2 3 4 5
..- attr(*, "label")= chr "label"
$ time: num 11 22 33 44 55
..- attr(*, "label")= chr "label2"
$ c : num 1 2 3 5 5
..- attr(*, "label")= chr "something here"
$ d : num 4 2 5 4 NA
..- attr(*, "label")= chr "count of something"
Cast to wide
library(reshape2)  # provides recast()
dfwide <- recast(df, id ~ variable + time,
                 id.var = c("id", "time"))
Usual attribute lost message:
Warning message:
attributes are not identical across measure variables; they will be dropped
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
$ c_22: num NA 2 NA NA NA
$ c_33: num NA NA 3 NA NA
$ c_44: num NA NA NA 5 NA
$ c_55: num NA NA NA NA 5
$ d_11: num 4 NA NA NA NA
$ d_22: num NA 2 NA NA NA
$ d_33: num NA NA 5 NA NA
$ d_44: num NA NA NA 4 NA
$ d_55: num NA NA NA NA NA
Using mostattributes one can copy attributes between data frames, but to iterate over many column names I can't figure out how to map this efficiently, other than doing it one by one.
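A minimal sketch of what the one-by-one approach presumably looks like. To keep it self-contained it uses small hand-built stand-ins for df and dfwide (base R only, no reshape2), so the data here is an assumption for illustration, not the original output:

```r
# Hand-built stand-ins for df and dfwide (illustrative only)
df <- data.frame(id = 1:2, c = c(1, 2))
attr(df$c, "label") <- "something here"

dfwide <- data.frame(id = 1:2, c_11 = c(1, NA), c_22 = c(NA, 2))

# Copy the attributes of df$c onto each wide column, one column at a time
mostattributes(dfwide$c_11) <- attributes(df$c)
mostattributes(dfwide$c_22) <- attributes(df$c)

str(dfwide)
```

This works, but every wide column needs its own line, which is the part that does not scale.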
> str(dfwide)
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ c_33: num NA NA 3 NA NA
I tried to automate it but failed (all the c columns should share one label and all the d columns another):
#extract arguments
pull(., value)
pull(., value)
mostatt<- function(var1, var2) {
'data.frame': 5 obs. of 11 variables:
$ id : num 1 2 3 4 5
$ c_11: num 1 NA NA NA NA
..- attr(*, "label")= chr "something here"
$ c_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "count of something"
$ c_33: num NA NA 3 NA NA
..- attr(*, "label")= chr "something here"
$ c_44: num NA NA NA 5 NA
..- attr(*, "label")= chr "count of something"
$ c_55: num NA NA NA NA 5
..- attr(*, "label")= chr "something here"
$ d_11: num 4 NA NA NA NA
..- attr(*, "label")= chr "count of something"
$ d_22: num NA 2 NA NA NA
..- attr(*, "label")= chr "something here"
$ d_33: num NA NA 5 NA NA
..- attr(*, "label")= chr "count of something"
$ d_44: num NA NA NA 4 NA
..- attr(*, "label")= chr "something here"
$ d_55: num NA NA NA NA NA
..- attr(*, "label")= chr "count of something"
I think using tidyselect might be worth a try, but I'm not sure how to incorporate it. Any suggestions would be appreciated. Thank you!
This is an option:
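for (i in setdiff(colnames(df), "id")) {
  # every wide column whose name contains the original column name
  for (x in colnames(dfwide)[grepl(i, colnames(dfwide))])
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
}
# "d" also matches "id", so restore id's own attributes last
mostattributes(dfwide$id) <- attributes(df$id)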
Because d is contained in id, I need to rewrite id at the end.
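A possible way to avoid the extra step is to anchor the pattern so that a source name only matches itself or itself followed by an underscore. This is a sketch under assumptions: the wide columns are named `<variable>_<time>`, the variable names contain no regex metacharacters, and the data frames below are hand-built stand-ins rather than the recast output:

```r
# Hand-built stand-ins for df and dfwide (illustrative only)
df <- data.frame(id = 1:2, time = c(11, 22), c = c(1, 2), d = c(4, 2))
attr(df$id, "label")   <- "label"
attr(df$time, "label") <- "label2"
attr(df$c, "label")    <- "something here"
attr(df$d, "label")    <- "count of something"

dfwide <- data.frame(id = 1:2,
                     c_11 = c(1, NA), c_22 = c(NA, 2),
                     d_11 = c(4, NA), d_22 = c(NA, 2))

for (i in colnames(df)) {
  # "^d(_|$)" matches "d_11" and "d_22" but not "id"
  hits <- grep(paste0("^", i, "(_|$)"), colnames(dfwide), value = TRUE)
  for (x in hits) {
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
  }
}

str(dfwide)
```

With the anchored pattern, id keeps its own label without being rewritten afterwards, and time simply matches no column in the wide data.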
If you rename d to e, it is even simpler:
df <- data.frame(id = c(1,2,3,4,5),
time = c(11, 22,33,44,55),
c = c(1,2,3,5,5),
e = c(4,2,5,4,NA))
attr(df$id,"label")<- "label"
attr(df$time,"label")<- "label2"
attr(df$c,"label")<- "something here"
attr(df$e,"label")<- "count of something"
dfwide<- recast(df,id~variable +time,
id.var = c("id","time"))
for (i in colnames(df)) {
  for (x in colnames(dfwide)[grepl(i, colnames(dfwide))])
    mostattributes(dfwide[[x]]) <- attributes(df[[i]])
}