Tags: r, reshape, lubridate, melt

Getting mean value of set of variables after grouping by date using R


Here is my dataset: https://app.box.com/s/x5eux7mhdc0geyck4o47ttmpynah0wqk

I'd like to create a data frame that contains the mean of each sentiment variable, grouped into 2-month periods.

I tried the following code:

sentiment_dataset$created_at <- ymd_hms(sentiment_dataset$created_at)

sentiment_time <- sentiment_dataset %>%
  group_by(created_at = cut(created_at, breaks = "2 months")) %>%
  summarise(negative = mean(negative),
            positive = mean(positive)) %>%
  melt

It gave the following error:

Using created_at as id variables
Error in match.names(clabs, names(xi)) : names do not match previous names


Solution

  • I'm not sure whether the grouping variable can be created inside the group_by call here. Creating it with mutate beforehand, and reshaping with tidyr's gather instead of melt, works though.

    library(dplyr)
    library(tidyr)
    
    sentiment_time <- sentiment_dataset %>%
      # bin the timestamps into 2-month intervals before grouping
      mutate(created_at = cut(created_at, breaks = "2 months")) %>%
      group_by(created_at) %>%
      summarize(negative = mean(negative),
                positive = mean(positive)) %>%
      # reshape to long format: one row per interval and sentiment
      gather('sentiment', 'mean_value', negative, positive)
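
To try the same approach without the linked dataset, here is a minimal sketch on made-up data. The toy tibble, its values, and the column name period are assumptions for illustration only, not taken from the real data. It also swaps gather for pivot_longer, which tidyr (1.0.0 or later) documents as the newer replacement for gather.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical stand-in for sentiment_dataset (values are invented)
toy <- tibble(
  created_at = ymd_hms(c("2016-01-05 10:00:00", "2016-02-20 12:30:00",
                         "2016-03-03 08:15:00", "2016-04-28 23:59:00")),
  negative = c(1, 3, 2, 4),
  positive = c(4, 2, 0, 2)
)

toy_means <- toy %>%
  # cut() on date-times accepts an interval spec such as "2 months"
  mutate(period = cut(created_at, breaks = "2 months")) %>%
  group_by(period) %>%
  summarise(negative = mean(negative),
            positive = mean(positive)) %>%
  # long format: one row per period and sentiment
  pivot_longer(c(negative, positive),
               names_to = "sentiment", values_to = "mean_value")

print(toy_means)
```

The first two timestamps fall into one 2-month bin and the last two into the next, so the result has one row per bin and sentiment.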