
How to get the remaining data frame in R after randomly subsetting the data


I took a random sample from a data frame, but I don't know how to get the remaining rows.

df <- data.frame(x=rep(1:3,each=2),y=6:1,z=letters[1:6])

#select 3 random rows
df[sample(nrow(df),3)]

What I want is to get the remaining data frame with the other 3 rows.


Solution

  • sample draws fresh random numbers each time you run it, so if you want to reproduce its results you need to either call set.seed beforehand or save the result in a variable.

    To answer your question: put a minus sign (-) before the index to get the rest of the data set. Also, don't forget the comma after indx if you want to select rows; without it, as in your question, R selects columns instead.

    set.seed(1)
    indx <- sample(nrow(df), 3)
    

    Your subset

    df[indx, ] 
    #   x y z
    # 2 1 5 b
    # 6 3 1 f
    # 3 2 4 c
    

    Remaining data set

    df[-indx, ]
    #   x y z
    # 1 1 6 a
    # 4 2 3 d
    # 5 3 2 e
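
    One caveat with negative indexing is that it silently returns zero rows when the index is empty. The sketch below is an alternative under that assumption, not part of the original answer: it builds a logical mask with %in% (the names picked, sampled, remaining and empty are illustrative) and checks that the two parts cover df without overlap. The exact rows drawn for a given seed depend on your R version, since the default sampler changed in R 3.6.0, so no sampled rows are shown here.

    ```r
    # Same example data as in the question
    df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])

    set.seed(1)
    indx <- sample(nrow(df), 3)

    # Logical mask: TRUE for sampled rows, FALSE for the rest
    picked <- seq_len(nrow(df)) %in% indx

    sampled   <- df[picked, ]   # sampled rows, in original row order
    remaining <- df[!picked, ]  # everything else

    # Sanity checks: the two parts do not overlap and together cover df
    stopifnot(nrow(sampled) + nrow(remaining) == nrow(df))
    stopifnot(length(intersect(rownames(sampled), rownames(remaining))) == 0)

    # Why the mask is safer: an empty index makes negative indexing drop everything
    empty <- integer(0)
    nrow(df[-empty, ])                           # 0 rows, not 6
    nrow(df[!(seq_len(nrow(df)) %in% empty), ])  # 6 rows
    ```

    Note that df[picked, ] keeps the rows in their original order, whereas df[indx, ] returns them in the order they were drawn.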