I have a dataframe with one column of dates and several columns of numeric values. I want to group this dataframe by month and sum the numeric values in the other columns for each month.
Here is my dataframe example:
capture.date Test1 Test2 Test3
2016-03-18 0 1 1
2016-03-18 1 1 1
2016-03-20 2 1 1
2016-04-12 1 0 1
I already tried some code:
df %>%
group_by(capture.date) %>%
summarise_each(funs(sum))
and:
aggregate(df[2:4], by=df["capture.date"], sum)
but both of these options return dataframes which summarize by daily date instead of month. How can I make it summarize by month instead of by day?
Desired output:
capture.date Test1 Test2 Test3
2016-03 3 3 3
2016-04 1 0 1
You can format the dates as %Y-%m inside group_by() and use summarise_if() or summarise_at() to select which variables get summed.
(First confirm that capture.date is of class Date.)
df %>%
  group_by(Date = strftime(capture.date, "%Y-%m")) %>%
  summarise_if(is.numeric, sum)
# # A tibble: 2 x 4
# Date Test1 Test2 Test3
# <chr> <int> <int> <int>
# 1 2016-03 3 3 3
# 2 2016-04 1 0 1
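If capture.date was read in as text, the class check mentioned above can be done roughly as follows. This is a minimal sketch that rebuilds the example data from the question by hand; it assumes the dates are strings in YYYY-MM-DD form, which as.Date() parses without an explicit format.

```r
# Rebuild the example data from the question (dates stored as text on purpose)
df <- data.frame(
  capture.date = c("2016-03-18", "2016-03-18", "2016-03-20", "2016-04-12"),
  Test1 = c(0L, 1L, 2L, 1L),
  Test2 = c(1L, 1L, 1L, 0L),
  Test3 = c(1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Check the class; convert only if the column is not already a Date
if (!inherits(df$capture.date, "Date")) {
  df$capture.date <- as.Date(df$capture.date)
}

class(df$capture.date)
```

After the conversion, the group_by() call above can be run on df unchanged.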
Scoped verbs (_if, _at, _all) have been superseded by the use of pick() or across() in an existing verb.
df %>%
  group_by(Date = strftime(capture.date, "%Y-%m")) %>%
  summarise(across(where(is.numeric), sum))
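The same month-formatting idea also seems to work with the aggregate() attempt from the question, without dplyr. The following is a sketch under the assumption that capture.date is already of class Date; the variable names month and monthly are introduced here purely for illustration.

```r
# Example data from the question, with capture.date as a Date column
df <- data.frame(
  capture.date = as.Date(c("2016-03-18", "2016-03-18", "2016-03-20", "2016-04-12")),
  Test1 = c(0L, 1L, 2L, 1L),
  Test2 = c(1L, 1L, 1L, 0L),
  Test3 = c(1L, 1L, 1L, 1L)
)

# format() on a Date returns text such as "2016-03", one value per row
month <- format(df$capture.date, "%Y-%m")

# Group on the month label instead of the full date and sum columns 2 to 4
monthly <- aggregate(df[2:4], by = list(capture.date = month), FUN = sum)
monthly
#   capture.date Test1 Test2 Test3
# 1      2016-03     3     3     3
# 2      2016-04     1     0     1
```

Grouping on a derived label rather than the raw date column is the only change relative to the original aggregate() call.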