I have a dataset:
structure(list(num = c(12L, 12L), code = structure(1:2, .Label = c("a",
"b"), class = "factor"), ranking = c(2414.5, 2414.5), bottom = c(-0.0153795976572657,
-0.00651997615327495), previous = c(0.00121016455715892,
-0.000166609624290187), of_all_previous = c(-0.000570973882726524,
-0.000771377162183913)), row.names = c(NA, -2L), class = "data.frame")
I would like to collapse the two rows according to num
and ranking
since they are the same but change code
according to conditions where the two rows' values for columns bottom
, previous
, and of_all_previous
should be compared sequentially (meaning: if they are equal then move on to the next column as in bottom
-> previous
-> of_all_previous
) and choose the code
that has the higher value.
In the case of the sample data provided, code
b
because -0.0065199761532749503 > -0.0153795976572656777
in the bottom
column.
If they were equal then previous
columns would have to be looked at.
I think that maybe dplyr
can use the %>%
to collapse rows but I cannot find how to change rows at the same time according to conditionality.
I expect the output to look something like this:
num code ranking
1 12 a 2414
Thanks in advance.
One approach might be to group_by
both num
and ranking
to look at rows that have same values and need to be collapsed.
Then, would arrange
(sort) in descending order the columns where comparisons are made, including bottom
previous
and of_all_previous
. When this is done, the higher row in the sorted data will consider the columns in order for comparison when there are ties.
Finally, you can create a final_code
with the first(code)
based on the sorting.
library(tidyverse)
df %>%
group_by(num, ranking) %>%
arrange(-bottom, -previous, -of_all_previous) %>%
mutate(final_code = ifelse(n() > 1, as.character(first(code)), as.character(code))) %>%
slice(1) %>%
select(num, code, ranking)