
R: create observations from frequency counts


I have frequency counts based on three variables, y, Col1, and Col2, as shown below:

     Col1    Col2      y       n
     Good    Poor      0       0
     Good    Poor      1       0
     Good    Rich      1       13
     Good    Rich      0       8
     Bad     Poor      0       8
     Bad     Poor      1       0
     Bad     Rich      1       15
     Bad     Rich      0       5

How do I expand this table so that the dataset has the number of rows given in column n for each combination of Col1, Col2, and y?

For example, the dataset should have 13 rows with Col1=Good, Col2=Rich, y=1; 8 rows with Col1=Good, Col2=Rich, y=0; and so on.


Solution

  • You could use uncount:

    tidyr::uncount(df, n)
    
       Col1 Col2 y
    1  Good Rich 1
    2  Good Rich 1
    3  Good Rich 1
    4  Good Rich 1
    5  Good Rich 1
    6  Good Rich 1
    7  Good Rich 1
    8  Good Rich 1
    9  Good Rich 1
    :   :    :   :
    :   :    :   :
    

    The question is: why do you need this? You can still analyze the data in its current form, with the counts left as a column. If each row had millions of counts, expanding the data would be unwise.
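
To make both points concrete, here is a minimal sketch in base R. It assumes the question's table is a plain data.frame named df (the construction below is retyped from the table in the question). The names expanded, mean_from_counts, mean_from_expanded, tab_from_counts, and tab_from_expanded are made up for illustration. The rep()-based indexing is a base R alternative to uncount, not what uncount does internally.

```r
# Rebuild the question's table (assumed to be a plain data.frame named df).
df <- data.frame(
  Col1 = c("Good", "Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad"),
  Col2 = c("Poor", "Poor", "Rich", "Rich", "Poor", "Poor", "Rich", "Rich"),
  y    = c(0, 1, 1, 0, 0, 1, 1, 0),
  n    = c(0, 0, 13, 8, 8, 0, 15, 5)
)

# Base R expansion: repeat each row index n times, then drop the count column.
# Rows with n = 0 are repeated zero times, so they disappear.
expanded <- df[rep(seq_len(nrow(df)), times = df$n), c("Col1", "Col2", "y")]
rownames(expanded) <- NULL

nrow(expanded)  # same as sum(df$n)

# The same summaries are available without expanding, using n as a weight.
mean_from_counts   <- weighted.mean(df$y, w = df$n)
mean_from_expanded <- mean(expanded$y)

# Cross-tabulation straight from the counts versus from the expanded rows.
tab_from_counts   <- xtabs(n ~ Col2 + y, data = df)
tab_from_expanded <- table(expanded$Col2, expanded$y)
```

Both routes give the same mean of y and the same Col2-by-y table, so for summaries like these the expansion step is optional. Many modelling functions in R also accept counts through a weights argument, which avoids materializing the rows at all.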