Search code examples
rsortingdata-cleaningdata-wrangling

How to order data horizontally (column-wise)?


I have data in the following manner. For each category x taking values a, b, or c, y is an integer.

x1 y1 x2 y2 x3 y3 x4 y4
b 7 b 4 c 4 b 1
a 5 a 9 b 1 b 7

I'd like to sort the data in ascending order column-wise (horizontally) when x = "b". Desired output:

x1 y1 x2 y2 x3 y3 x4 y4
b 1 b 4 b 7 c 4
b 1 b 7 a 5 a 9

I tried using a bubble sort algorithm to arrange the y values, but unable to conditionally do it when x = "b". Any help is appreciated!


Solution

  • A tidyr + dplyr way:

    1. pivot_longer to set the data in a tidy format.
    2. Create a factor with ordered levels, with "b" as the first level and other by order of appearance.
    3. Arrange the data accordingly.
    4. Create a new id within each group to match new order.
    5. Pivot back to wider format.
    library(tidyr)
    library(dplyr)
    df %>% 
      mutate(id = 1:nrow(.)) %>% 
      pivot_longer(-id, names_pattern = "(x|y)(\\d+)", names_to = c(".value", "value")) %>% 
      mutate(x = factor(x, levels = c("b", setdiff(unique(x), "b")), ordered = TRUE)) %>% 
      arrange(id, x, y) %>% 
      mutate(value = data.table::rowid(id)) %>% 
      pivot_wider(names_from = "value", values_from = c("x", "y"), names_sep = "", names_vary = "slowest")
    
    #   id x1 y1 x2 y2 x3 y3 x4 y4
    # 1  1  b  1  b  4  b  7  c  4
    # 2  2  b  1  b  7  a  5  a  9