How can I cut one variable into bins so that the sum of another variable is roughly equal across the bins?
For example, in the code below I would like the sum of var2 to be spread more evenly between the cuts:
library(data.table)
dt = data.table(var1=c(0.6,0.2,0.5,0.8,0.10,0.1,0.2,0.5,0.3,0.5),
var2=c(20,400,350,50,100,490,1200,900,1850,70))
dt[, cuts := cut(var1, breaks = 3)]
dt[,.(var2=sum(var2)),by=cuts]
Thanks!
One way is to build a vector in which each var1 value is repeated in proportion to its var2 value, and then split that vector into equal-sized groups. The resulting break points can then be applied to the original var1. For example:
library(data.table)
library(Hmisc)
dt = data.table(var1=c(0.6,0.2,0.5,0.8,0.10,0.1,0.2,0.5,0.3,0.5),
var2=c(20,400,350,50,100,490,1200,900,1850,70))
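# number of times to repeat each var1 value (var2 relative to its minimum)
dt[, var3 := round(var2 / min(var2))]
# expanded vector: each var1 value appears in proportion to its var2
cc = rep(dt[, var1], dt[, var3])
# break points that split the expanded vector into 3 equal-sized groups
labs = cut2(cc, g = 3, onlycuts = TRUE)
# apply those break points to the original var1
dt[, cuts := cut2(var1, cuts = labs)]
dt[, .(var2 = sum(var2)), by = cuts]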
#         cuts var2
# 1: [0.5,0.8] 1390
# 2: [0.1,0.3) 2190
# 3:       0.3 1850
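If you would rather avoid the rounding step and the expanded vector, the same idea can be sketched in base R by working with the cumulative share of var2 directly. This is only a rough sketch, not a tested general solution: `weighted_breaks` is a hypothetical helper name, and it assumes the weights are non-negative and that you are happy to place each break at the first distinct var1 value whose cumulative share reaches 1/g, 2/g, and so on.

```r
var1 <- c(0.6, 0.2, 0.5, 0.8, 0.10, 0.1, 0.2, 0.5, 0.3, 0.5)
var2 <- c(20, 400, 350, 50, 100, 490, 1200, 900, 1850, 70)

# Hypothetical helper: choose break points on x so that the total weight w
# in each bin is close to sum(w) / g. Rows sharing the same x value always
# end up in the same bin. If one value carries most of the weight, fewer
# than g bins may be returned.
weighted_breaks <- function(x, w, g) {
  vals  <- sort(unique(x))
  wsum  <- vapply(vals, function(v) sum(w[x == v]), numeric(1))
  share <- cumsum(wsum) / sum(wsum)      # cumulative share of the weight
  targets <- seq_len(g - 1) / g          # e.g. 1/3 and 2/3 for g = 3
  # first distinct value whose cumulative share reaches each target
  idx <- vapply(targets, function(t) which(share >= t)[1], integer(1))
  unique(c(min(vals), vals[idx], max(vals)))
}

br   <- weighted_breaks(var1, var2, g = 3)
cuts <- cut(var1, breaks = br, include.lowest = TRUE)
sums <- tapply(var2, cuts, sum)
sums  # sums per bin: 2190, 1850, 1390
```

On this data it gives the same grouping as the cut2 approach above. With a data.table you could assign the bins with `dt[, cuts := cut(var1, breaks = br, include.lowest = TRUE)]`.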