r, ranking

Replacing each value in a vector with its rank number for a data.frame


In this hypothetical scenario, I have performed 5 different analyses on 13 chemicals, resulting in a score assigned to each chemical within each analysis. I have created a table as follows:

             Analysis1    Analysis2    Analysis3    Analysis4    Analysis5
    Chem_1   3.524797844  4.477695034  4.524797844  4.524797844  4.096698498
    Chem_2   2.827511555  3.827511555  3.248136118  3.827511555  3.234398548
    Chem_3   2.682144761  3.474646298  3.017780505  3.682144761  3.236152242
    Chem_4   2.134137304  2.596921333  2.95181339   2.649076603  2.472875191
    Chem_5   2.367736454  3.027814219  2.743137896  3.271122346  2.796607809
    Chem_6   2.293110565  2.917318708  2.724156207  3.293110565  2.530967343
    Chem_7   2.475709113  3.105794018  2.708222528  3.475709113  3.088819908
    Chem_8   2.013451822  2.259454085  2.683273938  2.723554966  2.400976121
    Chem_9   2.345123123  3.050074893  2.682845391  3.291851228  2.700844104
    Chem_10  2.327658894  2.848729452  2.580415233  3.327658894  2.881490893
    Chem_11  2.411243882  2.98131398   2.554456095  3.411243882  3.109205453
    Chem_12  2.340778276  2.576860244  2.549707035  3.340778276  3.236545826
    Chem_13  2.394698249  2.90682524   2.542599327  3.394698249  3.12936843

I would like to create columns corresponding to each analysis which contain the rank position of each chemical, with the highest score ranked first. For instance, under Analysis1, Chem_1 would have value "1", Chem_2 would have value "2", Chem_3 would have value "3", Chem_7 would have value "4", Chem_11 would have value "5", and so on.


Solution

  • We can use dense_rank from dplyr

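    library(dplyr)
    # dense_rank(desc(.x)) gives rank 1 to the largest score in each column.
    # across() replaces the deprecated mutate_each()/funs() idiom (dplyr >= 1.0.0).
    df %>%
         mutate(across(everything(), ~ dense_rank(desc(.x))))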
    

    In base R, we can do

    df[] <- lapply(-df, rank, ties.method="min")
    

    In data.table, we can use

    library(data.table)
    setDT(df)[, lapply(-.SD, frank, ties.method="dense")]
    

    To avoid the copies made by negating with -, as @Arun mentioned in the comments, we can use frankv with order=-1L

    setDT(df)[, lapply(.SD, frankv, order=-1L, ties.method="dense")]
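As a quick check of the base R approach against the example in the question, here is a minimal sketch that uses only the Analysis1 column (values copied from the table). The `tied` vector is made-up data, not part of the question; it is there only to show that ties.method="min" and dense ranking give different results once scores tie. Analysis1 has no ties, so both methods agree on it.

```r
# Analysis1 scores copied from the table in the question.
analysis1 <- c(
  Chem_1 = 3.524797844, Chem_2 = 2.827511555, Chem_3 = 2.682144761,
  Chem_4 = 2.134137304, Chem_5 = 2.367736454, Chem_6 = 2.293110565,
  Chem_7 = 2.475709113, Chem_8 = 2.013451822, Chem_9 = 2.345123123,
  Chem_10 = 2.327658894, Chem_11 = 2.411243882, Chem_12 = 2.340778276,
  Chem_13 = 2.394698249
)

# One-column data.frame; the vector names become the row names.
df <- data.frame(Analysis1 = analysis1)

# Negate so the largest score gets rank 1, then rank each column in place.
df[] <- lapply(-df, rank, ties.method = "min")

# Ranks for the chemicals mentioned in the question.
df[c("Chem_1", "Chem_2", "Chem_3", "Chem_7", "Chem_11"), "Analysis1"]

# Hypothetical scores with a tie, to show where the two methods diverge.
tied <- c(3, 3, 2)
min_ranks   <- rank(-tied, ties.method = "min")    # leaves a gap after the tie
dense_ranks <- match(-tied, sort(unique(-tied)))   # dense ranking, no gap
```

With tied scores, "min" skips the rank after a tie while dense ranking does not, so it is worth picking one method and using it consistently across the dplyr, base R, and data.table variants.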