Obtaining summary statistics from unique column combinations in R


I have a file input.txt that I read in as a data frame, and I need to compute the summary statistics shown in "output" below. Each ID in the file is unique; Year and Status are not.

>input
ID  Year    Status
1   2002    OK
2   2002    OK
3   2003    NO
4   2003    OK
5   2007    OK
6   2007    NO

I've tried to use:

table(melt(input, id=c("ID")))

This does not give me what I want. Below is the desired output: for each year, the number of individuals with status OK and with status NO.

>output 
Year   OK   NO
2002    2   0
2003    1   1
2007    1   1

Can someone please help?


Solution

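  • You may try this (here input is the data frame from the question):

    # Base R: cross-tabulate Year against Status
    with(input, table(Year, Status))
    #       Status
    # Year   NO OK
    #   2002  0  2
    #   2003  1  1
    #   2007  1  1
    
    # or, with reshape2, count the IDs in each Year/Status cell
    library(reshape2)
    dcast(input, Year ~ Status, value.var = "ID", fun.aggregate = length)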
    
    #   Year NO OK
    # 1 2002  0  2
    # 2 2003  1  1
    # 3 2007  1  1
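
    Both approaches list the Status columns alphabetically (NO before OK), while the desired output has OK first. The following is a minimal base R sketch of one way to control the column order. It assumes the data frame is named input and contains exactly the six rows shown in the question; the name counts is introduced here purely for illustration.

```r
# Rebuild the example data from the question (values copied from the post).
input <- data.frame(
  ID     = 1:6,
  Year   = c(2002, 2002, 2003, 2003, 2007, 2007),
  Status = c("OK", "OK", "NO", "OK", "OK", "NO"),
  stringsAsFactors = FALSE
)

# Fix the column order by making Status a factor with explicit levels.
input$Status <- factor(input$Status, levels = c("OK", "NO"))

# Cross-tabulate Year against Status; cells with no rows are counted as 0.
counts <- table(input$Year, input$Status)

# Convert the table into a plain data frame with one row per year.
output <- data.frame(
  Year = as.integer(rownames(counts)),
  OK   = as.vector(counts[, "OK"]),
  NO   = as.vector(counts[, "NO"])
)
print(output)
```

    Setting the factor levels explicitly also keeps a status column in the result even when that status never occurs in the data, which plain character columns would not do.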