
ggplot differences between groups


I have a data frame with groups in different trials, and I want to make a bar graph in ggplot of just the deltas between trials. I'm having a hard time getting ggplot to compute the differences from a single data frame. Also, some of the treatments aren't represented in the second trial, so I want to count those as 0 (i.e. delta would be trial 1 - 0).

 set.seed(1)

 df <- data.frame((matrix(nrow=175,ncol=4)))
 colnames(df) <- c("group","trial","count","hour")
 df$group <- rep(c("A","B","C","D","A","B","D"),each=25)
 df$trial <- rep(c(rep(1,times=100),rep(2,times=75)))
 df$count <- runif(175,0,50)
 df$hour <- rep(1:25,times=7)


 df2 <- aggregate(df[,3:4],list(df$group,df$trial),mean)
 colnames(df2)[1:2] <- c("group","trial") 

That's where I've gotten to. I have plotted with individual bars for (group*trial), but I can't figure out how to subtract them. I want a plot of x=group and y= delta(trial).

I tried this:

 ggplot(df2 %>% group_by(group) %>% delta=diff(count),
   aes(x=group,y=delta)) + geom_bar()

from a similar posting I came across, but no luck.


Solution

  • this should do the trick:

    library(dplyr); library(ggplot2)
    ggplot(df2 %>% group_by(group) %>% summarise(delta=ifelse(n()>1,diff(count),0)),  # n() = rows in the current group
           aes(x=group,y=delta)) + geom_col()  # same as geom_bar(stat="identity")
    

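    The problem is that diff() returns a vector of length 0, not the value 0, when it is given only one input value, so groups with a single trial need a fallback. I also recommend geom_col() instead of geom_bar(): geom_bar() counts rows by default and needs stat="identity" to plot precomputed values. Another thing to keep in mind is that the result of diff() depends on the row order of your data frame. For that reason I would recommend selecting the trials explicitly: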

    ggplot(df2 %>% group_by(group) %>% summarise(delta_trial_1_trial_2=
                                               ifelse(length(trial)>1,
                                                      count[trial==2]-count[trial==1],0)),
       aes(x=group,y=delta_trial_1_trial_2)) + geom_col()
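
The snippet above computes trial 2 minus trial 1 and plots 0 for a group that is missing from a trial, whereas the question asks for trial 1 - 0 in that case. A minimal sketch of that variant follows, assuming one row per group and trial as in df2. The data frame `df2_demo` and its values are made up for illustration; they are not the output of the aggregate() call in the question. It relies on sum() of an empty selection being 0, so a missing trial contributes 0 without any ifelse().

```r
library(dplyr)
library(ggplot2)

# diff() on a single value returns a zero-length vector, not 0
length(diff(5))

# Hypothetical stand-in for df2 (made-up means); group C has no trial 2
df2_demo <- data.frame(
  group = c("A", "B", "C", "D", "A", "B", "D"),
  trial = c(1, 1, 1, 1, 2, 2, 2),
  count = c(20, 30, 25, 10, 15, 35, 10)
)

# With one row per group and trial, sum() just picks that value,
# and an empty selection (missing trial) sums to 0
deltas <- df2_demo %>%
  group_by(group) %>%
  summarise(delta = sum(count[trial == 1]) - sum(count[trial == 2]))

# delta is 5, -5, 25, 0 for groups A, B, C, D
ggplot(deltas, aes(x = group, y = delta)) + geom_col()
```

To use it on the real data, replace `df2_demo` with `df2`. If a group could have several rows per trial, the sums would add them up, so aggregate to one row per group and trial first.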