I have data like this (a made-up example):
# dt
Col1 Col2 Col3 Col4
2014/1/1 A 10 1
2014/4/1 A 15 1.5
2015/1/1 A 15 3
2015/4/1 A 30 4
2014/1/1 B 20 2
2014/4/1 B 30 6
2015/1/1 B 40 10
2015/4/1 B 80 16
What I want:
Col1 Col2 Col3 Col4 Col3.R Col4.R
2014/1/1 A 10 1 1 1
2014/4/1 A 15 1.5 1.5 1.5
2015/1/1 A 15 3 1.5 3
2015/4/1 A 30 4 3 4
2014/1/1 B 20 2 1 1
2014/4/1 B 30 6 1.5 3
2015/1/1 B 40 10 2 5
2015/4/1 B 80 16 4 8
The new column Col3.R is calculated by dividing each value of Col3 by the first value of Col3 within its Col2 group. Col4.R is computed the same way from Col4.
I tried the code below:
dt[, sapply(.SD, function(x) R = x / x[1]), .SDcols = 3:4, by = .(Col2)]
How can I keep the original columns? Do I need to use the on argument of data.table?
Data:
dt <- fread(" Col1 Col2 Col3 Col4
2014/1/1 A 10 1
2014/4/1 A 15 1.5
2015/1/1 A 15 3
2015/4/1 A 30 4
2014/1/1 B 20 2
2014/4/1 B 30 6
2015/1/1 B 40 10
2015/4/1 B 80 16", header = TRUE)
dt$Col3 <- as.numeric(dt$Col3)
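Use lapply and paste0 together with := to create the new columns. Because := adds columns by reference, the original columns are kept; the on argument is meant for joins and is not needed here.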
library(data.table)
dt[, paste0("col", 3:4, ".R") := lapply(.SD, function(x) x / x[1]),
   .SDcols = 3:4, by = .(Col2)]
dt
# Col1 Col2 Col3 Col4 col3.R col4.R
#1: 2014/1/1 A 10 1.0 1.0 1.0
#2: 2014/4/1 A 15 1.5 1.5 1.5
#3: 2015/1/1 A 15 3.0 1.5 3.0
#4: 2015/4/1 A 30 4.0 3.0 4.0
#5: 2014/1/1 B 20 2.0 1.0 1.0
#6: 2014/4/1 B 30 6.0 1.5 3.0
#7: 2015/1/1 B 40 10.0 2.0 5.0
#8: 2015/4/1 B 80 16.0 4.0 8.0
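
If you would rather get the capitalised names from the question (Col3.R, Col4.R) and not depend on column positions, the same idea can be written with column names. This is only a sketch under the assumption that the data look like the sample above; the variable cols is a name introduced here for illustration and is not part of data.table.

```r
library(data.table)

# Same sample data as in the question
dt <- fread("Col1 Col2 Col3 Col4
2014/1/1 A 10 1
2014/4/1 A 15 1.5
2015/1/1 A 15 3
2015/4/1 A 30 4
2014/1/1 B 20 2
2014/4/1 B 30 6
2015/1/1 B 40 10
2015/4/1 B 80 16")

# Columns to rescale, selected by name instead of position
# (cols is a hypothetical helper variable)
cols <- c("Col3", "Col4")

# For each Col2 group, divide every value by the group's first value.
# := adds the new columns in place, so Col1 to Col4 stay in dt.
dt[, paste0(cols, ".R") := lapply(.SD, function(x) x / x[1]),
   .SDcols = cols, by = Col2]
```

Building the new names from cols keeps the input and output columns in step, so adding another column to rescale should only require extending that one vector.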