
Why is my sum() summing to zero, having no NAs and only numerical data?


Once again with a very simple question.

I am trying to add all emissions together, basically summing 5 variables per row.

However, it keeps summing to zero, even when I have no NAs and only numeric values.

This is the data I am working with:

df_structure <-
  structure(
    list(
      `Particeles_PM10_[kg]_WTW_whole transport chain` = c(
        0.000440486,
        0.010753239,
        0.0005393157,
        0.0107265319,
        0.200272577,
        0.169998242
      ),
      `SO2_[kg]_WTW_whole transport chain` = c(
        0.0034873728,
        0.1861534833,
        0.01613152798,
        0.185923214,
        3.715316736,
        3.155906431
      ),
      `NOX_[kg]_WTW_whole transport chain` = c(
        0.024214311,
        0.618727269,
        0.053631226,
        0.617528662,
        12.271221,
        10.3988076
      ),
      `NMHC_[kg]_WTW_whole transport chain` = c(
        0.0043159575,
        0.0385331658,
        0.0033238124,
        0.038634107,
        0.7067915367,
        0.59608807
      )
    ),
    row.names = c(NA,-6L),
    class = c("tbl_df", "tbl", "data.frame")
  )

And here's my code:

df_structure %>%
  rowwise() %>% 
  mutate(sum_emissions = sum(as.numeric("Particeles_PM10_[kg]_WTW_whole transport chain",
                         "SO2_[kg]_WTW_whole transport chain",
                         "NOX_[kg]_WTW_whole transport chain",
                         "NMHC_[kg]_WTW_whole transport chain"), na.rm = TRUE)) 
summary(df_structure$sum_emissions)

What am I doing wrong? I can open my data.frame and every column has 5 rows of filled-in data, yet the sum keeps being 0...

Thanks in advance!


Solution

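  • You need to pass the columns as a vector using c() and refer to them with backticks (``) rather than double quotes. In double quotes each name is just a character string, so as.numeric() returns NA (with a coercion warning), and sum(..., na.rm = TRUE) over nothing but NA is 0. As your columns are already numeric, you don't need as.numeric() at all. Also note that the result has to be assigned back (df_structure <- df_structure %>% ...) before df_structure$sum_emissions exists.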

    df_structure %>%
      rowwise() %>% 
      mutate(sum_emissions = sum(c(`Particeles_PM10_[kg]_WTW_whole transport chain`,
                                     `SO2_[kg]_WTW_whole transport chain`,
                                     `NOX_[kg]_WTW_whole transport chain`,
                                     `NMHC_[kg]_WTW_whole transport chain`), na.rm = TRUE)) %>%
      ungroup()
    

    A simpler way might be to use c_across:

    df_structure %>%
      rowwise() %>% 
      mutate(sum_emissions = sum(c_across(1:4), na.rm = TRUE)) %>%
      ungroup()
    

    A base solution is to use rowSums directly (or through mutate):

    df_structure$sum_emissions <- rowSums(df_structure)
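
To see why the original attempt returned 0, here is a minimal sketch on a toy data frame. The data frame `toy` and its column names `a b` and `c d` are made up for illustration and only stand in for the long emission column names; the `rowSums(across(...))` form is one possible vectorised alternative, not the only one.

```r
library(dplyr)

# Hypothetical toy data with non-syntactic column names, like the question's
toy <- data.frame(`a b` = c(1, 2), `c d` = c(10, 20), check.names = FALSE)

# A quoted name is only a string: as.numeric() cannot parse it and returns NA
# (normally with a coercion warning); its extra arguments are ignored.
quoted <- suppressWarnings(as.numeric("a b", "c d"))
is.na(quoted)

# With na.rm = TRUE the NA is dropped, nothing is left to add, so the sum is 0.
zero_sum <- sum(quoted, na.rm = TRUE)
zero_sum

# Backticks (or a selection helper) refer to the columns themselves.
# Assign the result, otherwise the new column is never stored.
toy <- toy %>%
  mutate(total = rowSums(across(everything()), na.rm = TRUE))
toy$total
```

Because `rowSums()` is vectorised over rows, this version does not need `rowwise()` or `ungroup()`.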