Search code examples
rdataset

how to move up the values within each group in R


I need to shift valid values to the top the of dataframe withing each id. Here is an example dataset:

df <- data.frame(id = c(1,1,1,2,2,2,3,3,3,3),
                 itemid = c(1,2,3,1,2,3,1,2,3,4),
                 values = c(1,NA,0,NA,NA,0,1,NA,0,NA))
    
df
   id itemid values
1   1      1      1
2   1      2     NA
3   1      3      0
4   2      1     NA
5   2      2     NA
6   2      3      0
7   3      1      1
8   3      2     NA
9   3      3      0
10  3      4     NA

excluding the id column, when there is a missing value in values column, I want to shift all values aligned to the top for each id.

How can I get this desired dataset below?

df1
   id itemid values
1   1      1      1
2   1      2      0
3   1      3     NA
4   2      1      0
5   2      2     NA
6   2      3     NA
7   3      1      1
8   3      2      0
9   3      3     NA
10  3      4     NA

Solution

  • Using tidyverse you can arrange by whether values is missing or not (which will put those at the bottom).

    library(tidyverse)
    
    df %>%
      arrange(id, is.na(values))
    

    Output

          id itemid values
       <dbl>  <dbl>  <dbl>
     1     1      1      1
     2     1      3      0
     3     1      2     NA
     4     2      3      0
     5     2      1     NA
     6     2      2     NA
     7     3      1      1
     8     3      3      0
     9     3      2     NA
    10     3      4     NA
    

    Or, if you wish to retain the same order for itemid and other columns, you can use mutate to specifically order columns of interest (like values). Other answers provide good solutions, such as @Santiago and @ThomasIsCoding. If you have multiple columns of interest to move NA to the bottom per group, you can also try:

    df %>%
      group_by(id) %>%
      mutate(across(.cols = values, ~values[order(is.na(.))]))
    

    where the .cols argument would contain the columns to transform and reorder independently.

    Output

          id itemid values
       <dbl>  <dbl>  <dbl>
     1     1      1      1
     2     1      2      0
     3     1      3     NA
     4     2      1      0
     5     2      2     NA
     6     2      3     NA
     7     3      1      1
     8     3      2      0
     9     3      3     NA
    10     3      4     NA