r statistics plyr categorical-data continuous

Group means by a categorical variable with the plyr package


My categorical variable, risk, has three groups: ADV, HHM and POV.

I want to get the mean for each of these three groups on four continuous variables, read.5, read.6, read.7 and read.8, which are individuals' reading scores in grades 5 to 8.

These are columns 2:5 of my dataset, which comes from an old textbook example. I used the code below, which apparently is not correct even though it should be according to the textbook:

myrisk <- ddply(.data = MPLS[ ,2:5], .variables = .(MPLS$risk),
                .fun = mean, na.rm = TRUE)

Earlier on, I got an error message from this piece of code:

mymeans <- mean(MPLS[ ,2:5], na.rm = TRUE)

When I googled it, I found that R had changed, so I had to find another way to work out the means.

My questions are:

  1. Has the ddply function from the plyr package, which I am currently trying to use, been superseded in the same way as the old behaviour of the mean function?

  2. How do I get the mean of each of the four columns for each group of the categorical variable, whether with the same function or with something different?

Thank you


Solution

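  • Hi, you can use dplyr. It is more up to date than plyr, which is now retired. The example below builds some sample data with a risk column and four numeric columns, then takes the mean of every column within each risk group: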

    df <- data.frame(risk   = rep(c("ADV", "HHM", "POV"), 10),
                     read.5 = rnorm(30, 30),
                     read.4 = rnorm(30, 30),
                     read.3 = rnorm(30, 30),
                     read.2 = rnorm(30, 30))
    head(df)
    #  risk   read.5   read.4   read.3   read.2
    #1  ADV 30.78281 30.00721 29.80906 29.25936
    #2  HHM 29.76175 29.63864 29.39256 29.40070
    #3  POV 29.00964 30.48258 29.20662 28.77509
    #4  ADV 29.60631 30.35032 32.00376 30.70374
    #5  HHM 31.38653 30.28896 29.48756 30.32430
    #6  POV 30.33102 30.40897 29.55796 30.10585
    
    library(dplyr)
    
    df %>% group_by(risk) %>% summarise_all(mean)
    
    # A tibble: 3 x 5
    #  risk  read.5 read.4 read.3 read.2
    #  <fct>  <dbl>  <dbl>  <dbl>  <dbl>
    #1 ADV     30.3   30.2   30.2   30.4
    #2 HHM     29.7   30.5   29.8   29.9
    #3 POV     29.3   30.2   29.9   30.2
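
On the two questions themselves: ddply has not been removed. The likely problem is that it hands each group to mean() as a whole data frame, and mean() no longer accepts data frames, which is the same change behind the earlier error. Below is a minimal sketch of how the question's own setup might be handled, including the na.rm = TRUE from the original call. The MPLS data frame here is a made-up stand-in: the column names come from the question, while the subid column, the column order, and all the values are assumptions for illustration only. across() needs dplyr 1.0.0 or later.

```r
# Hypothetical stand-in for the MPLS data. Column names follow the question;
# the subid column and every value below are invented for this sketch.
MPLS <- data.frame(
  subid  = 1:6,
  read.5 = c(10, 20, 30, 40, 50, 60),
  read.6 = c(12, NA, 32, 42, 52, 62),
  read.7 = c(14, 24, 34, 44, NA, 64),
  read.8 = c(16, 26, 36, 46, 56, 66),
  risk   = rep(c("ADV", "HHM", "POV"), each = 2)
)

# Overall means of columns 2:5. mean() no longer works on a data frame,
# but colMeans() does, and it accepts na.rm.
overall <- colMeans(MPLS[, 2:5], na.rm = TRUE)

# Group means in base R: aggregate() applies FUN to each column within
# each risk group and passes na.rm = TRUE through to mean().
by_risk_base <- aggregate(MPLS[, 2:5],
                          by = list(risk = MPLS$risk),
                          FUN = mean, na.rm = TRUE)

# The same group means with dplyr, dropping missing values per column.
library(dplyr)

by_risk <- MPLS %>%
  group_by(risk) %>%
  summarise(across(read.5:read.8, ~ mean(.x, na.rm = TRUE)))

print(overall)
print(by_risk_base)
print(by_risk)
```

Both approaches give one row per risk group and one column per reading score. Selecting read.5:read.8 by name rather than by position (2:5) keeps the dplyr version working if the column order changes.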