My data frame has the following four columns: type ("A" or "B"), xvar, longitude, and latitude. It looks like:
type xvar longitude latitude
[1,] A 20 -87.81 40.11
[2,] A 12 -87.82 40.12
[3,] A 50 -87.85 40.22
....
[21,] B 24 -87.79 40.04
[22,] B 30 -87.88 40.10
[23,] B 12 -87.67 40.32
[24,] B 66 -87.66 40.44
....
I have 20 rows with type="A" and 25,000 rows with type="B". My task is to randomly assign the xvar values of the 20 "A" data points to locations in the X-Y space of type "B", without replacement. For example, xvar=20 from the first "A" observation could be placed at row [22,], that is, at (-87.88, 40.10). Because I am sampling without replacement, in theory I can repeat this at most 25,000/20 = 1,250 times. I want 1,000 replications.
I also have a function, say myfunc(xvar, longitude, latitude), that returns one statistic from one random sample. I first create an empty 1,000 x 1 matrix, say myresult.
myresult <- array(0,dim=c(1000,1))
Then, for each random sample, I apply my function (myfunc) to calculate the statistic.
for (i in seq_len(1000)) {
  # draw one sample that has three variables: xvar, longitude, latitude
  # apply my function (myfunc) to this selected sample
  # store the calculated statistic in myresult[i, ]
}
I wonder how to do this in R (and maybe in Matlab?). Thanks!
=============================================================
Update: @user. Borrowing your idea, the following is what I want:
dd1 <- df[df$type == "B" ,]
dd2 <- df[df$type == "A" ,]
v <- dd2[sample(nrow(dd2), nrow(dd2)), ]
randomXvarOfA <- as.matrix(v[,c("xvar")])
cols <- c("longitude","latitude")
B_shuffled_XY <- dd1[,cols][sample(nrow(dd1), nrow(dd2)), ]
dimnames(randomXvarOfA)=list(NULL,c("xvar"))
sampledData <- cbind(randomXvarOfA,B_shuffled_XY)
sampledData
xvar longitude latitude
4 20 -87.79 40.04
7 12 -87.66 40.44
5 50 -87.88 40.10
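If no B location should be reused across replications either (which is what the 25,000/20 = 1,250 limit implies), one option might be to draw all the row numbers up front and cut them into blocks. This is only a rough sketch: the names n_a, n_b, n_rep and idx are illustrative, and the sizes are the ones stated in the question.

```r
n_a   <- 20      # number of type "A" rows (from the question)
n_b   <- 25000   # number of type "B" rows (from the question)
n_rep <- 1000    # replications wanted; requires n_rep * n_a <= n_b

# Draw n_rep * n_a distinct B row numbers in one go, then arrange them so
# that row i of idx holds the 20 B rows used in replication i.
idx <- matrix(sample(n_b, n_rep * n_a), nrow = n_rep, ncol = n_a)

# For replication i, dd1[idx[i, ], c("longitude", "latitude")] would then
# give that replication's locations (dd1 being the type "B" rows).
```

Because sample() draws without replacement by default, no B row number appears twice anywhere in idx.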
Read in your data:
df<- read.table( text="
type xvar longitude latitude
A 20 -87.81 40.11
A 12 -87.82 40.12
A 50 -87.85 40.22
B 24 -87.79 40.04
B 30 -87.88 40.10
B 12 -87.67 40.32
B 66 -87.66 40.44", header = TRUE)
I was writing this without splitting and it looked so messy, so I decided just to split your data.frame.
dd1 <- df[df$type == "B" ,] # get all rows of just type B
dd2 <- df[df$type == "A" ,] # get all rows of just type A
v <- dd2[sample(nrow(dd2), 2), ] #sample two rows at random that are type A
# if you want to sample 20 rows change the 2 to a 20
cols <- c("longitude", "latitude")
dd1[,cols][sample(nrow(dd1), 2), ] <- v[,cols]
# Overwrite the long/lat of 2 random B rows with the long/lat sampled from the As
# put the As and Bs back together
rbind(dd2,dd1)
# type xvar longitude latitude
# 1 A 20 -87.81 40.11
# 2 A 12 -87.82 40.12
# 3 A 50 -87.85 40.22
# 4 B 24 -87.79 40.04
# 5 B 30 -87.85 40.22
# 6 B 12 -87.81 40.11
# 7 B 66 -87.66 40.44
As you can see, rows 5 and 6 of B have new randomly selected lat and long values from the A types. I did not change the xvar values, though; I don't know if you want that. If you did want to change the xvars too, you would change cols to cols <- c("xvar", "longitude", "latitude").
Inside a function it would look like:
changestuff <- function(x){
dd1 <- x[x$type == "B" ,] # get just B
dd2 <- x[x$type == "A" ,] # get just A
v <- dd2[sample(nrow(dd2), 2), ]
cols <- c("longitude", "latitude")
dd1[,cols][sample(nrow(dd1), 2), ] <- v[,cols]
rbind(dd2,dd1)
}
changestuff(df)
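To tie this back to the 1,000 replications asked about in the question, one possible sketch is below. It goes in the direction the question describes (A's xvar values placed on randomly chosen B locations). Everything here is an assumption for illustration: the toy data frame stands in for the real 20 A rows and 25,000 B rows, draw_sample is a hypothetical helper, and myfunc is only a placeholder, since the real statistic is not shown.

```r
# Toy data standing in for the real data frame (assumption: same column names).
set.seed(1)
df <- data.frame(
  type      = rep(c("A", "B"), times = c(3, 50)),
  xvar      = sample(10:70, 53, replace = TRUE),
  longitude = round(runif(53, -87.9, -87.6), 2),
  latitude  = round(runif(53, 40.0, 40.5), 2)
)

# Placeholder for the real statistic (assumption: it returns a single number).
myfunc <- function(xvar, longitude, latitude) {
  weighted.mean(latitude, xvar)
}

# Hypothetical helper: put the A xvar values on randomly chosen B locations.
draw_sample <- function(x) {
  a <- x[x$type == "A", ]
  b <- x[x$type == "B", ]
  idx <- sample(nrow(b), nrow(a))          # distinct B rows within one draw
  data.frame(
    xvar      = a$xvar[sample(nrow(a))],   # the A values in shuffled order
    longitude = b$longitude[idx],
    latitude  = b$latitude[idx]
  )
}

# Run the replications and collect one statistic per draw.
n_rep <- 1000
myresult <- vapply(seq_len(n_rep), function(i) {
  s <- draw_sample(df)
  myfunc(s$xvar, s$longitude, s$latitude)
}, numeric(1))
```

Here myresult is a plain numeric vector of length 1,000; matrix(myresult, ncol = 1) would turn it into the 1,000 x 1 shape from the question. Note that each draw is without replacement on its own, but different replications may reuse the same B rows.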