
Sorting data in a dataframe in R


After data munging and using spread, I arrived at a table with one row per complaint type and one column per Borough.

I would like to identify the top 4 issues in each Borough. Sorting the table does not help, since there are 4 Borough columns and each needs its own ordering. Any thoughts on how to do this?


Solution

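  • You can index the complaint type column with order(column, decreasing=TRUE)[1:4], which returns the row positions of the four largest values in that column. Applying this to each Borough column with lapply gives a list of the top four complaint types per Borough. It is then easy to convert that to whatever form is needed; here a data frame makes sense: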

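    # For each Borough column (everything except the first column), take the
    # complaint types at the positions of its four largest counts.
    lst <- lapply(df[-1], function(col) df[,'Complaint.Type'][order(col, decreasing=TRUE)[1:4]])
    as.data.frame(lst)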
    #     BRONX BROOKLYN MANHATTAN   QUEENS
    #1 Facility Facility     Adopt Facility
    #2    Abuse    Abuse  Advocate    Adopt
    #3     Park      Air      Park     Park
    #4 Advocate    Adopt     Abuse Advocate
    

    Data

    df <- data.frame(Complaint.Type=c('Adopt', 'Advocate', 'Air', 'Abuse', 'Facility','Park'),
                     BRONX=c(0,5, 1, 33, 81, 7),
                     BROOKLYN=c(2,0,100,148,177, 1),
                     MANHATTAN=c(129,49,2,9,1,15),
                     QUEENS=c(50,3,0,3,2469,6))
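
If the counts are wanted next to the complaint types, the same order() idea can be sketched in long format. This is only one possible variation on the answer above, in base R; the helper name top_n_by_borough and the output columns Borough, Rank and Count are made up for illustration and do not come from any package.

```r
# Same sample data as above, repeated so the sketch runs on its own.
df <- data.frame(Complaint.Type = c('Adopt', 'Advocate', 'Air', 'Abuse', 'Facility', 'Park'),
                 BRONX = c(0, 5, 1, 33, 81, 7),
                 BROOKLYN = c(2, 0, 100, 148, 177, 1),
                 MANHATTAN = c(129, 49, 2, 9, 1, 15),
                 QUEENS = c(50, 3, 0, 3, 2469, 6))

# Hypothetical helper: assumes the first column holds the complaint type
# and every remaining column holds the counts for one borough.
top_n_by_borough <- function(data, n = 4) {
  boroughs <- names(data)[-1]
  pieces <- lapply(boroughs, function(b) {
    # Row positions of the n largest counts; head() avoids NA padding
    # when the table has fewer than n rows.
    idx <- head(order(data[[b]], decreasing = TRUE), n)
    data.frame(Borough = b,
               Rank = seq_along(idx),
               Complaint.Type = as.character(data[[1]][idx]),
               Count = data[[b]][idx])
  })
  # Stack the per-borough pieces into one long data frame.
  do.call(rbind, pieces)
}

top4 <- top_n_by_borough(df)
print(top4)
```

In the sample data, Advocate and Abuse both have a count of 3 in QUEENS, so fourth place there is a tie. order() leaves tied rows in their original order, which is why Advocate is the one that appears.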