How to subset on a condition across multiple columns in R?


All,

My dataset looks like the one shown below. I am trying to answer the following question.

Question:

Based on the drawing paper data ONLY, do the stores sell more units (units.sold column) of one paper subtype (paper.type column) than of the others?

To answer this, I used the tapply function, which gives me the totals for both kinds of paper. I am not sure how to go on from there to get only the drawing paper data. Any help is appreciated!

My code

tapply(df$units.sold,list(df$paper,df$paper.type,df$store),sum)

Dataset

            date year     rep     store      paper paper.type unit.price units.sold total.sale
9991  12/30/2015 2015     Ran    Dublin watercolor      sheet       0.77          5       3.85
9992  12/30/2015 2015     Ran    Dublin    drawing       pads      10.26          1      10.26
9993  12/30/2015 2015  Arijit  Syracuse watercolor        pad      12.15          2      24.30
9994  12/30/2015 2015  Thomas Davenport    drawing       roll      20.99          1      20.99
9995  12/31/2015 2015   Ruisi    Dublin watercolor      sheet       0.77          7       5.39
9996  12/31/2015 2015   Mohit Davenport    drawing       roll      20.99          1      20.99
9997  12/31/2015 2015    Aman  Portland    drawing       pads      10.26          1      10.26
9998  12/31/2015 2015 Barakat  Portland watercolor      block      19.34          1      19.34
9999  12/31/2015 2015  Yunzhu  Syracuse    drawing    journal      24.94          1      24.94
10000 12/31/2015 2015    Aman  Portland watercolor      block      19.34          1      19.34

Note: I am new to R. Please provide an explanation along with your code.


Solution

  • You could start by aggregating the units.sold column by store and paper.type:

    aggregate(units.sold~store+paper.type, df[df$paper == "drawing", ], sum)
    
    #      store paper.type units.sold
    #1  Syracuse    journal          1
    #2    Dublin       pads          1
    #3  Portland       pads          1
    #4 Davenport       roll          2
    

    Here the data is first filtered to the rows where paper is "drawing", and units.sold is then summed for each combination of store and paper.type. Comparing the units.sold values in this output shows which subtype each store sells more of.
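
    Since the question started from tapply, the same filter-then-summarise idea can also be sketched with it. This is only an illustration: it rebuilds the ten sample rows from the question (just the four columns needed), and the names drawing, by_store and by_type are made up here, not part of the original code.

```r
# Sample rows copied from the question; only the columns needed here.
df <- data.frame(
  store      = c("Dublin", "Dublin", "Syracuse", "Davenport", "Dublin",
                 "Davenport", "Portland", "Portland", "Syracuse", "Portland"),
  paper      = c("watercolor", "drawing", "watercolor", "drawing", "watercolor",
                 "drawing", "drawing", "watercolor", "drawing", "watercolor"),
  paper.type = c("sheet", "pads", "pad", "roll", "sheet",
                 "roll", "pads", "block", "journal", "block"),
  units.sold = c(5, 1, 2, 1, 7, 1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Keep only the drawing-paper rows before summarising
# ('drawing' is a hypothetical name chosen for this sketch).
drawing <- df[df$paper == "drawing", ]

# Store-by-subtype table of units sold.
# Store/subtype combinations with no sales show up as NA.
by_store <- tapply(drawing$units.sold,
                   list(drawing$store, drawing$paper.type),
                   sum)
print(by_store)

# Total units per subtype across all stores, largest first.
by_type <- sort(tapply(drawing$units.sold, drawing$paper.type, sum),
                decreasing = TRUE)
print(by_type)
```

    Filtering first keeps the watercolor subtypes out of the table entirely, whereas passing df$paper as an extra grouping variable to tapply only adds one more dimension to the result. With just these ten rows the totals are too small to say much; on the full dataset by_type would answer the question directly.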