I am trying to compute the interpolated median by group for a number of variables. My dataframe looks like this:
# A tibble: 6 x 8
  id            eu_image eu_insurance eurobonds free_movement_welfare eu_cn_solidarity country_code country_party_mass
  <chr>            <dbl>        <dbl>     <dbl>                 <dbl>            <dbl>    <dbl+lbl> <chr>
1 CAWI200000100        4            4         4                     3                3            2 germany_7
2 CAWI300000784        2            2         1                     1                1            3 italy_9
3 CAWI100000787        3            3         2                     2                3            1 france_13
4 CAWI500000081        3            2         2                     1                3            5 spain_2
5 CATI500000067        4            3         2                     2                6            5 spain_3
6 CAWI100000398        2            4         4                     2                5            1 france_2
When I run the following code to compute the interpolated median by the grouping variable country_party_mass:
party_median <- newdata %>%
  group_by(country_party_mass) %>%
  dplyr::summarise_at(c("eu_image",
                        "eu_cn_solidarity",
                        "eurobonds",
                        "free_movement_welfare",
                        "eu_insurance"),
                      funs(interp.median(., na.rm = TRUE))) %>%
  as.data.frame()
I get the following error:
Error in summarise_impl(.data, dots) : Column `eu_cn_solidarity` must be length 1 (a summary value), not 0
I have checked previous questions on similar issues, but I could not find a viable solution.
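Building on A. Suliman's comment: the error message suggests that at least one group has only NA values for eu_cn_solidarity, so interp.median returns a zero-length result for that group. You can add an ifelse check that returns NA when all entries in a group are NA: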
party_median <- newdata %>%
  group_by(country_party_mass) %>%
  dplyr::summarise_at(vars(c("eu_image",
                             "eu_cn_solidarity",
                             "eurobonds",
                             "free_movement_welfare",
                             "eu_insurance")),
                      ~ifelse(all(is.na(.)), NA_real_, interp.median(., na.rm = TRUE)))
Note that the funs function is now soft-deprecated (as of dplyr 0.8.0.1), so I use the "~" notation instead. I also use the vars function to select the variables.
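On newer dplyr versions (1.0.0 or later), where summarise_at has been superseded by across, the same guard could be written roughly as follows. This is only a sketch: it assumes interp.median comes from the psych package, the helper name safe_interp_median is made up for illustration, and the toy data frame is invented so that one group has no observed value.

```r
library(dplyr)  # across() and .groups need dplyr >= 1.0.0
library(psych)  # assumed source of interp.median()

# Hypothetical helper: NA for an all-NA group, otherwise the interpolated median.
safe_interp_median <- function(x) {
  if (all(is.na(x))) NA_real_ else interp.median(x, na.rm = TRUE)
}

# Invented toy data: group "b" has only NA values for eu_image.
toy <- data.frame(
  country_party_mass = c("a", "a", "a", "b", "b"),
  eu_image           = c(1, 2, 2, NA, NA),
  eurobonds          = c(3, 3, 4, 1, 2)
)

party_median <- toy %>%
  group_by(country_party_mass) %>%
  summarise(across(c(eu_image, eurobonds), safe_interp_median),
            .groups = "drop")

print(party_median)
```

A plain if/else is used here instead of ifelse because the condition is a single TRUE/FALSE per group; either form should give one value per group and column.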