
Obtaining descriptive statistics of observations with years of complete data in R


I have the following panel dataset:

id year Value
1  1     50
2  1     55
2  2     40
3  1     48
3  2     54
3  3     24
4  2     24
4  3     57
4  4     30

I would like to summarize how many years of data are available per individual, that is, to count individuals by their number of observed years. For example, in the data above, one individual has only one year of information, one individual has two years, and two individuals have three years.


Solution

  • In base R, using table and its faster cousin tabulate:

    table(tabulate(dat$id))
    
    1 2 3 
    1 1 2 
    

    or, using table alone (which also works when the ids are not consecutive positive integers):

    table(table(dat$id))
    

    Convert to a data.frame:

    data.frame(table(tabulate(dat$id)))
      Var1 Freq
    1    1    1
    2    2    1
    3    3    2
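
For reference, the approach above can be reproduced end to end with a minimal sketch. It assumes the data frame is named dat (as in the answer), that each id/year pair appears at most once, and that years without information are simply absent as rows. The names years_per_id, counts and gappy_ids are illustrative only and not part of the original post.

```r
# Rebuild the example panel from the question (`dat` follows the answer's naming).
dat <- data.frame(
  id    = c(1, 2, 2, 3, 3, 3, 4, 4, 4),
  year  = c(1, 1, 2, 1, 2, 3, 2, 3, 4),
  Value = c(50, 55, 40, 48, 54, 24, 24, 57, 30)
)

# Inner table(): number of rows (years) per id.
# Outer table(): number of ids for each number of years.
years_per_id <- table(dat$id)
counts <- table(years_per_id)
print(counts)

# Caveat: tabulate() bins over 1..max(id), so gaps in the ids are
# counted as ids with zero years; table(table()) does not have this issue.
gappy_ids <- c(1, 5, 5)              # hypothetical ids with a gap (2, 3, 4 missing)
print(table(tabulate(gappy_ids)))    # includes a "0" category for the missing ids
print(table(table(gappy_ids)))       # counts only ids that actually occur
```

If the ids are not guaranteed to be consecutive positive integers (or are character strings), table(table(dat$id)) is the safer of the two variants; tabulate is mainly worthwhile for speed on large integer ids.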