Search code examples
rdataframerows

Extract all row.names in a data.frame that match a value in another data.frame


I have a data.frame with 150 column names. For each column, I want to extract the maximum and minimum values (the rows repeat) and the row names of each maximum value. I have extracted the min and max values in another data.frame but don't know how to match them.

I have found functions that are very close for this, like for minimum values:

head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10

sapply(cars,which.min)

speed  dist 
    1     1 

Here, it only gives the first index for minimum speed.

And I've tried with loops like:

for (i in (colnames(cars))){
  print(min(cars[[i]]))
}


[1] 4
[1] 2

But that just gives me the minimum values, and not if they are repeated and the rowname of each repeated value.

I want something like:

min.value  column  rowname   freq.times
4          speed      1,2        2
2          dist       1          1


Thanks and sorry if I have orthography mistakes. No native speaker


Solution

  • min.value <- sapply(cars, min)
    columns <- names(min.value)
    row.values <- sapply(columns, \(x) which(cars[[x]] == min.value[which(names(min.value) == x)]))
    freq.times <- sapply(row.values, length)
    row.values <- sapply(row.values, \(x) paste(x, collapse = ","))
    names(min.value) <- names(row.values) <- names(freq.times) <- NULL
    
    data.frame(min.value = min.value,
               columns = columns,
               row.values = row.values, 
               freq.times = freq.times)
    
      min.value columns row.values freq.times
    1         4   speed        1,2          2
    2         2    dist          1          1
    

    Here it is wrapped in function, so that you can use it across whatever data frame and function you need:

    create_table <- function(df, FUN) {
      values <- sapply(df, FUN)
      columns <- names(values)
      row.values <- sapply(columns, \(x) which(df[[x]] == values[which(names(values) == x)]))
      freq.times <- sapply(row.values, length)
      row.values <- sapply(row.values, \(x) paste(x, collapse = ","))
      names(values) <- names(row.values) <- names(freq.times) <- NULL
      
      data.frame(values = values,
                 columns = columns,
                 row.values = row.values, 
                 freq.times = freq.times)
    }
    
    create_table(cars, min)
      values columns row.values freq.times
    1      4   speed        1,2          2
    2      2    dist          1          1
    
    create_table(cars, max)
      values columns row.values freq.times
    1     25   speed         50          1
    2    120    dist         49          1