I have a data frame that looks generally like this
df.data <- data.frame(x=sample(1:9, 10, replace = T), y=sample(1:9, 10, replace=T), vx=sample(-1:1, 10, replace=T), vy=sample(-1:1, 10, replace=T))
x and y are positions. vx and vy are x, y values for a 2d vector. I want to take this data frame and "bin" based on the x and y values, but performing a calculation on the vx and vy. This function does this except it uses a loop which is going to be too slow for my data set.
slowWay <- function(df)
{
df.bin <- data.frame(expand.grid(x=0:3, y=0:3, vx=0, vy=0, count=0))
for(i in 1:nrow(df))
{
x.bin <- floor(df[i, ]$x / 3)
y.bin <- floor(df[i, ]$y / 3)
print(c(x.bin, y.bin))
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vx = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vx + df[i, ]$vx
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vy = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vy + df[i, ]$vy
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$count = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$count + 1
}
return(df.bin)
}
Is this type of 2D binning possible in a non looping way?
Here's another faster way to do it, one that includes unpopulated bin combinations:
fasterWay <- function(df.data) {
a1 <- aggregate(df.data[,3:4], list(x=floor(df.data$x/3), y=floor(df.data$y/3)), sum)
a2 <- aggregate(list(count=rep(NA,nrow(df.data))), list(x=floor(df.data$x/3), y=floor(df.data$y/3)), length)
result <- merge(expand.grid(y=0:3,x=0:3), merge(a1,a2), by=c("x","y"), all=TRUE)
result[is.na(result)] <- 0
result <- result[order(result$y, result$x),]
rownames(result) <- NULL
result
}
It gives me:
x y vx vy count
1 0 0 0 0 1
2 0 1 0 0 0
3 0 2 -1 -1 1
4 0 3 0 0 0
5 1 0 -1 -1 1
6 1 1 0 0 0
7 1 2 0 0 0
8 1 3 -1 0 2
9 2 0 -1 -1 1
10 2 1 0 0 0
11 2 2 -1 1 2
12 2 3 0 0 1
13 3 0 0 0 0
14 3 1 0 0 0
15 3 2 -1 0 1
16 3 3 0 0 0