Search code examples
rdplyrdata-analysis

Add/Merge/Melt only specific columns and give out one unique row


I am trying to transform a dataset that has multiple product sales on a date. At the end I want to keep only unique columns with the sum of the product sales per day.

My MRE:

df <- data.frame(created = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03", "2020-01-03"), "%Y-%m-%d", tz = "GMT"),
                soldUnits = c(1, 1, 1, 1, 1, 1),
Weekday = c("Mo","Mo","Tu","Tu","Th","Th"),
Sunshinehours = c(7.8,7.8,6.0,6.0,8.0,8.0))

Which looks like this:

       Date   soldUnits Weekday Sunshinehours
     2020-01-01   1       Mo      7.8
     2020-01-01   1       Mo      7.8
     2020-01-02   1       Tu      6.0
     2020-01-02   1       Tu      6.0
     2020-01-03   1       We      8.0
     2020-01-03   1       We      8.0

And should look like this after transforming:

       Date   soldUnits Weekday Sunshinehours
     2020-01-01   2      Mo      7.8
     2020-01-02   2      Tu      6.0
     2020-01-03   2      We      8.0
     

I tried aggregate() and group_by but without success because my data was dropped.

Is there anyone who has an idea, how i can transform and clean up my dataset according to the specifications i mentioned?


Solution

  • This can work:

    library(tidyverse)
    
    df %>%
      group_by(created) %>% 
      count(Weekday, Sunshinehours, wt = soldUnits,name = "soldUnits")
    #> # A tibble: 3 × 4
    #> # Groups:   created [3]
    #>   created    Weekday Sunshinehours soldUnits
    #>   <date>     <chr>           <dbl>     <dbl>
    #> 1 2020-01-01 Mo                7.8         2
    #> 2 2020-01-02 Tu                6           2
    #> 3 2020-01-03 Th                8           2
    

    Created on 2021-12-04 by the reprex package (v2.0.1)