R Loop To New Data Frame Summary Weighted


I have a data frame, which I reshape from wide to tall, as such:

set.seed(1) # fix the seed so the random scores are reproducible
data = data.frame("id"=c(1,2,3,4,5,6,7,8,9,10),
                  "group"=c(1,1,2,1,2,2,2,2,1,2),
                  "type"=c(1,1,2,3,2,2,3,3,3,1),
                  "score1"=sample(1:4,10,replace=TRUE),
                  "score2"=sample(1:4,10,replace=TRUE),
                  "score3"=sample(1:4,10,replace=TRUE),
                  "score4"=sample(1:4,10,replace=TRUE),
                  "score5"=sample(1:4,10,replace=TRUE),
                  "weight1"=c(173,109,136,189,186,146,173,102,178,174),
                  "weight2"=c(147,187,125,126,120,165,142,129,144,197),
                  "weight3"=c(103,192,102,159,128,179,195,193,135,145),
                  "weight4"=c(114,182,199,101,111,116,198,123,119,181),
                  "weight5"=c(159,125,104,171,166,154,197,124,180,154))

library(reshape2) # note: reshape() below is base R (stats), not from reshape2
library(plyr)

data1 <- reshape(data, direction = "long",
                 varying = list(c(paste0("score",1:5)),c(paste0("weight",1:5))),
                 v.names = c("score","weight"),
                 idvar = "id", timevar = "count", times = c(1:5))
data1 <- data1[order(data1$id), ]

And what I want to create is a new data frame like so:

want = expand.grid("score"=1:4,
                   "group"=1:2,
                   "type"=1:3) # every score/group/type combination (24 rows)
want$weightedCOUNT = NA # how to calculate this? count(data1, score, wt = weight)

I am not sure how to calculate 'weightedCOUNT'. It should apply the weights to the score variable, so that the column holds a weighted count aggregated by score, group, and type.
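
To make the target concrete, here is a minimal sketch on made-up toy values (not the data above). It assumes a weighted count simply means the sum of 'weight' within each score/group/type cell; the names 'toy' and 'toy_counts' are only for illustration.

```r
# Toy long-format data (made-up values, not the data from the question)
toy <- data.frame(group  = c(1, 1, 1, 2),
                  type   = c(1, 1, 1, 2),
                  score  = c(3, 3, 4, 3),
                  weight = c(100, 150, 120, 200))

# Assumption: weighted count = sum of weights within each score/group/type cell
toy_counts <- aggregate(weight ~ score + group + type, data = toy, FUN = sum)

# Rename the summed column to match the desired output
names(toy_counts)[names(toy_counts) == "weight"] <- "weightedCOUNT"
toy_counts
```

Here the two rows with score 3 in group 1, type 1 collapse into a single row whose 'weightedCOUNT' is 100 + 150.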


Solution

  • An option would be to melt (from data.table, which can take multiple measure patterns) and then, grouped by 'group' and 'type', get the weighted count of 'score'

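    library(data.table)
    library(dplyr)
    # setDT() converts 'data' to a data.table by reference; melt() stacks
    # score1..5 and weight1..5 into paired 'score' and 'weight' columns
    melt(setDT(data), measure = patterns("^score", "^weight"),
         value.name = c("score", "weight")) %>%
      group_by(group, type) %>%
      # count(wt = weight) sums the weights per group/type/score into column 'n'
      count(score, wt = weight)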
    

    If we need a complete set of combinations, with missing ones filled with 0:

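    library(tidyr)
    melt(setDT(data), measure = patterns("^score", "^weight"),
         value.name = c("score", "weight")) %>%
      group_by(group, type) %>%
      count(score, wt = weight) %>%   # creates the 'n' column filled below
      ungroup %>%
      # add rows for unobserved group/type/score combinations with n = 0
      complete(group, type, score, fill = list(n = 0))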
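
As a cross-check that does not need any packages, the same table can be sketched in base R with xtabs(). This is only an alternative under stated assumptions, not the approach used above: the seed and therefore the score values are arbitrary, the data frame is called 'dat' here to keep it separate from 'data', and 'score' is turned into a factor with levels 1 to 4 so that unobserved scores still appear with a count of 0.

```r
set.seed(42)  # arbitrary seed, only so the illustrative scores are reproducible

# Same layout as the question's data; the score values depend on the seed
dat <- data.frame(
  id      = 1:10,
  group   = c(1, 1, 2, 1, 2, 2, 2, 2, 1, 2),
  type    = c(1, 1, 2, 3, 2, 2, 3, 3, 3, 1),
  score1  = sample(1:4, 10, replace = TRUE),
  score2  = sample(1:4, 10, replace = TRUE),
  score3  = sample(1:4, 10, replace = TRUE),
  score4  = sample(1:4, 10, replace = TRUE),
  score5  = sample(1:4, 10, replace = TRUE),
  weight1 = c(173, 109, 136, 189, 186, 146, 173, 102, 178, 174),
  weight2 = c(147, 187, 125, 126, 120, 165, 142, 129, 144, 197),
  weight3 = c(103, 192, 102, 159, 128, 179, 195, 193, 135, 145),
  weight4 = c(114, 182, 199, 101, 111, 116, 198, 123, 119, 181),
  weight5 = c(159, 125, 104, 171, 166, 154, 197, 124, 180, 154)
)

# Wide to long with base reshape(): one row per id and score/weight pair
long <- reshape(dat, direction = "long",
                varying = list(paste0("score", 1:5), paste0("weight", 1:5)),
                v.names = c("score", "weight"),
                idvar = "id", timevar = "count", times = 1:5)

# Keep all four score levels so empty cells are reported as 0
long$score <- factor(long$score, levels = 1:4)

# xtabs() sums 'weight' in every score x group x type cell, including empty ones
tab  <- xtabs(weight ~ score + group + type, data = long)
want <- as.data.frame(tab, responseName = "weightedCOUNT")
head(want)
```

The result has one row per combination (4 scores x 2 groups x 3 types), and the 'weightedCOUNT' column sums to the total of all weights, which is a quick sanity check against the dplyr output.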