I am looking to move specific rows in my dataframe, accordingly to one column.
df <- read.table(text = 'ID Day Count
33012 9526 4
35004 9526 4
37006 9526 4
37008 9526 4
21009 1913 3
24005 1913 3
25009 1913 3
29002 12551 2
30001 12551 2
25009 14329 1
48007 9525 0
49002 1912 0
51003 12550 0
56001 12550 0', header = TRUE)
Given the above dataframe, I would like move the rows with 0s in the Count column accordingly to the column Day, i.e. day 9525 after 9526, 1912 after 1913, 12550 after 12551.
The output should be:
ID Day Count
33012 9526 4
35004 9526 4
37006 9526 4
37008 9526 4
48007 9525 0
21009 1913 3
24005 1913 3
25009 1913 3
49002 1912 0
29002 12551 2
30001 12551 2
51003 12550 0
56001 12550 0
25009 14329 1
Please ignore the ID column.
New question related to this topic:
I think I found a really cool solution to your updated question:
df[order(match(df$Day+(z <- df$Count==0L),unique(df$Day[!z])),z),];
## ID Day Count
## 1 33012 9526 4
## 2 35004 9526 4
## 3 37006 9526 4
## 4 37008 9526 4
## 11 48007 9525 0
## 5 21009 1913 3
## 6 24005 1913 3
## 7 25009 1913 3
## 12 49002 1912 0
## 8 29002 12551 2
## 9 30001 12551 2
## 13 51003 12550 0
## 14 56001 12550 0
## 10 25009 14329 1
This solution orders by two things:
1: First, it orders by the "canonical" Day
value. The canonical Day
value is taken as df$Day
for non-zero Count
rows, and df$Day+1L
for zero Count
rows. This is accomplished by matching the canonical Day
values into a vector of unique canonical Day
values of the non-zero Count
rows, which also functions to preserve the incoming order of the canonical Day
values. The canonical Day
values are computed using logical addition, which treats FALSE
as zero and TRUE
as one. The zero/non-zero distinction is captured in a local variable z
on-the-fly, absolving the need for subsequent redundant computation of that information.
2: Second, it orders the non-zero Count
rows before the zero Count
rows. Since z
has already been computed earlier in the line, we can simply pass it as the second argument to order()
to do this. When ordering by a logical vector, FALSE
is ordered before TRUE
, thus it works directly.
I think this is what you're looking for:
df$vl <- ave(df$vl,df$id,FUN=function(x) sort(decreasing=T,x));
df;
## id vl
## 1 C 5
## 2 C 3
## 3 C 2
## 4 C 2
## 5 A 5
## 6 A 5
## 7 A 4
## 8 A 2
## 9 B 4
## 10 B 2
## 11 B 1
## 12 B 1
The above sorts the vl
column within each id
group, independently of the other id
groups.
Data
set.seed(1L);
df <- data.frame(id=rep(c('C','A','B'),each=4L),vl=sample(5L,12L,T));
Another interpretation of your question is that you want to sort the entire data.frame by the vl
column, but within each unique value of vl
, you want to prefer the order in which the unique values in the id
column occur in the original data.frame (notwithstanding the fact that not all id
values are represented alongside each unique vl
value). Here's how that can be done:
df[order(-df$vl,match(df$id,unique(df$id))),];
## id vl
## 1 C 5
## 5 A 5
## 6 A 5
## 7 A 4
## 9 B 4
## 2 C 3
## 3 C 2
## 4 C 2
## 8 A 2
## 10 B 2
## 11 B 1
## 12 B 1