Create factor based on multiple column values


I am trying to create a new column (a factor) that holds the column name of the largest value in each row of a data frame. Think of the data as the proportion of each soil type for each polygon (row) in a dataset. I want a new column that holds only the name of the soil type with the highest proportion. Example:

soil <- data.frame(soil1=c(0.75,0.25,0.25),soil2=c(0.25,0.75,0.75))

Now I want output that looks like this:

soil$out <- c('soil1','soil2', 'soil2')

Solution

  • You can use apply for this:

    soil$out <- names(soil)[apply(soil, 1, which.max)]
    

    apply(soil, 1, which.max) determines, for each row, which column holds the (first) maximum value; that index is then passed to names(soil) to look up the corresponding column name.
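
The line above yields a character column, while the question asks for a factor. A minimal sketch of one way to get there, assuming the proportion columns are exactly soil1 and soil2 (the helper name soil_cols and the extra column out2 are only illustrative and not part of the original answer). Restricting the lookup to the numeric columns also keeps the code safe to re-run once out already exists in the data frame:

```r
# Example data from the question
soil <- data.frame(soil1 = c(0.75, 0.25, 0.25),
                   soil2 = c(0.25, 0.75, 0.75))

# Assumption: these are the only columns holding soil proportions
soil_cols <- c("soil1", "soil2")

# Row-wise index of the first maximum, mapped to the column name,
# then stored as a factor with one level per soil column
idx <- apply(soil[soil_cols], 1, which.max)
soil$out <- factor(soil_cols[idx], levels = soil_cols)

# Alternative sketch: max.col() performs the same row-wise lookup on a matrix;
# ties.method = "first" matches which.max(), which also returns the first maximum
idx2 <- max.col(as.matrix(soil[soil_cols]), ties.method = "first")
soil$out2 <- factor(soil_cols[idx2], levels = soil_cols)

# soil$out and soil$out2 both hold: soil1 soil2 soil2
```

Note that when a row has tied maxima, both variants pick the leftmost column; if ties matter for your data, they would need separate handling.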