I have a dataset with donations made to different politicians where each row is a specific donation.
donor.sector <- c("sector A", "sector B", "sector X", "sector A", "sector B")
total <- c(100, 100, 150, 125, 500)
year <- c(2006, 2006, 2007, 2007, 2007)
state <- c("CA", "CA", "CA", "NY", "WA")
target_specific <- c("politician A", "politician A", "politician A", "politician B", "politician C")
dat <- data.frame(donor.sector, total, year, target_specific, state)
I'm trying to get the yearly mean of donations for each politician, and I'm able to do so with the following:
library(dplyr)
new.df <- dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total))
My issue is that, because of the grouping, the result only has three variables: mean, year and target_specific. Is there a way to do this and create a new data frame that keeps the politician-level variables, such as state?
Many thanks!
There are two ways to do that. The first is to include the additional variables in group_by:
library(dplyr)
dat %>%
  group_by(target_specific, year, state) %>%
  summarise(mean = mean(total))
#   target_specific  year state  mean
#   <chr>           <dbl> <chr> <dbl>
# 1 politician A     2006 CA      100
# 2 politician A     2007 CA      150
# 3 politician B     2007 NY      125
# 4 politician C     2007 WA      500
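Adding state to group_by leaves the groups unchanged only if each politician is recorded with a single state. Here is a minimal sketch for checking that assumption on the example data; `state_check` and `n_states` are names made up for illustration.

```r
library(dplyr)

# Example data from the question, rebuilt with quoted strings
dat <- data.frame(
  donor.sector = c("sector A", "sector B", "sector X", "sector A", "sector B"),
  total = c(100, 100, 150, 125, 500),
  year = c(2006, 2006, 2007, 2007, 2007),
  state = c("CA", "CA", "CA", "NY", "WA"),
  target_specific = c("politician A", "politician A", "politician A",
                      "politician B", "politician C")
)

# Count the distinct states recorded for each politician
state_check <- dat %>%
  group_by(target_specific) %>%
  summarise(n_states = n_distinct(state))

# Any politician listed here would be split into extra rows by the
# three-variable group_by
filter(state_check, n_states > 1)
```

If the filter returns no rows, grouping by state as well gives the same groups as grouping by politician and year alone.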
Or, keeping the same group_by structure, you can include the first value of each additional variable:
dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total), state = first(state))
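If there are many politician-level variables, a third option (not part of the two above) may be to build a lookup table and join it back after summarising. This is only a sketch: it assumes each politician has exactly one state, and `politician_info` and `yearly_means` are hypothetical names.

```r
library(dplyr)

# Example data from the question, rebuilt with quoted strings
dat <- data.frame(
  donor.sector = c("sector A", "sector B", "sector X", "sector A", "sector B"),
  total = c(100, 100, 150, 125, 500),
  year = c(2006, 2006, 2007, 2007, 2007),
  state = c("CA", "CA", "CA", "NY", "WA"),
  target_specific = c("politician A", "politician A", "politician A",
                      "politician B", "politician C")
)

# One row per politician holding the politician-level variables
# (assumes a single state per politician)
politician_info <- dat %>%
  distinct(target_specific, state)

# Yearly means per politician, then attach the politician-level columns
yearly_means <- dat %>%
  group_by(target_specific, year) %>%
  summarise(mean = mean(total)) %>%
  ungroup() %>%
  left_join(politician_info, by = "target_specific")

yearly_means
```

More politician-level columns can be carried along by listing them in distinct(). The join duplicates rows if a politician appears with more than one combination of those values.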