How to shuffle rows of a given number of columns in a data frame in R?


How can I shuffle the rows of two columns of a data frame in R while keeping the remaining columns unchanged? The two columns should stay linked: the values in each row must be shuffled together, not independently.

y <- c(18, 0, 2, 0,  0,  0,  2,  0,  0,  1,  7,  0,  0,  0,  0,  0,  0,  0,  0)
x1 <- c(501, 1597, 1156, 1134, 1924,  507, 1022,  0,  92, 1729, 85, 963, 544, 1315, 2250, 1366,  458,  385,  930)
x2 <- c(0,  92,  959, 1146,  900,  0,  276, 210,  980, 8, 0, 473, 0, 255, 1194, 542, 983, 331,  923)
offset_1 <- c(59, 34, 33, 35, 60, 58, 59, 33, 34, 61, 58, 58, 55, 26, 26, 18, 26, 26, 26)
data_1 <- data.frame(y,x1,x2,offset_1)

In this example I want to shuffle the rows of 'y' and 'offset_1' together, keeping the rest of the columns the same.


Solution

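  • You can shuffle the rows by calling sample() on the row indices and assigning the result back to the selected columns only:

    set.seed(1)                 # fix the seed so the shuffle is reproducible
    cols <- c("y", "offset_1")  # columns to shuffle together
    # one permutation of the row indices is applied to both columns, so the pairs stay linked
    data_1[cols] <- data_1[sample(nrow(data_1)), cols]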
    

    Output:

    #     y   x1   x2 offset_1
    # 1   0  501    0       35
    # 2   2 1597   92       59
    # 3  18 1156  959       59
    # 4   0 1134 1146       34
    # 5   0 1924  900       55
    # 6   0  507    0       26
    # 7   7 1022  276       58
    # 8   0    0  210       18
    # 9   0   92  980       26
    # 10  2 1729    8       33
    # 11  0   85    0       26
    # 12  0  963  473       60
    # 13  0  544    0       33
    # 14  0 1315  255       58
    # 15  0 2250 1194       58
    # 16  1 1366  542       61
    # 17  0  458  983       34
    # 18  0  385  331       26
    # 19  0  930  923       26
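
To check that the shuffle keeps the pairs linked and leaves the other columns alone, the following sketch may help. It rebuilds the data from the question and stores the permutation in a variable so it can be inspected afterwards. The names `idx` and `shuffled` are introduced here for illustration only and are not part of the original answer.

```r
# Rebuild the example data from the question
y <- c(18, 0, 2, 0, 0, 0, 2, 0, 0, 1, 7, 0, 0, 0, 0, 0, 0, 0, 0)
x1 <- c(501, 1597, 1156, 1134, 1924, 507, 1022, 0, 92, 1729, 85, 963, 544,
        1315, 2250, 1366, 458, 385, 930)
x2 <- c(0, 92, 959, 1146, 900, 0, 276, 210, 980, 8, 0, 473, 0, 255, 1194,
        542, 983, 331, 923)
offset_1 <- c(59, 34, 33, 35, 60, 58, 59, 33, 34, 61, 58, 58, 55, 26, 26,
              18, 26, 26, 26)
data_1 <- data.frame(y, x1, x2, offset_1)

set.seed(1)
cols <- c("y", "offset_1")
idx <- sample(nrow(data_1))  # hypothetical name: one permutation shared by both columns

# Work on a copy (hypothetical name) so the original stays available for comparison
shuffled <- data_1
shuffled[cols] <- data_1[idx, cols]

# x1 and x2 are unchanged
stopifnot(identical(shuffled$x1, data_1$x1),
          identical(shuffled$x2, data_1$x2))

# Row i of the shuffled columns is the original pair from row idx[i]
stopifnot(identical(shuffled$y, data_1$y[idx]),
          identical(shuffled$offset_1, data_1$offset_1[idx]))
```

Because a single index vector drives both columns, each `y` value travels with its own `offset_1` value. Calling `sample()` separately for each column would break that link.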