I have a dataframe with multiple columns, but two columns in particular are interesting for me. Column1 contains values 0 and a number (>0) Column2 contains numbers as well.
I want to create 21 new columns containing new information from Column2 given Column1.
So when Column1 is positive (not 0) I want the first new column, Column01, to take the value from Column2 that goes 10 back. and Column02 goes 9 back,.. Column11 is the exact same as Column2 value.. and Column21 is 10 forward.
For example
Column 1 Column2 Columns01 Columns02.. Columns11..Columns20 Columns21
0 5 0 0 0 0 0
0 2 0 0 0 0 0
0 0 0 0 0 0 0
1 3 0 0 3 5 4
0 10 0 0 0 0 0
0 83 0 0 0 0 0
0 2 0 0 0 0 0
0 5 0 0 0 0 0
0 4 0 0 0 0 0
1 8 0 5 8 5 3
0 6 0 0 0 0 0
0 5 0 0 0 0 0
0 55 0 0 0 0 0
0 4 0 0 0 0 0
2 3 10 83 3 5 0
0 2 0 0 0 0 0
0 3 0 0 0 0 0
0 4 0 0 0 0 0
0 5 0 0 0 0 0
0 3 0 0 0 0 0
1 22 6 5 22 0 0
0 12 0 0 0 0 0
0 0 0 0 0 0 0
0 5 0 0 0 0 0
Hope this makes sense to you and you can help.
Here's one way using the newly implemented shift()
function from data.table v1.9.5
:
require(data.table) ## v1.9.5+
setDT(dat) ## (1)
cols = paste0("cols", sprintf("%.2d", 1:21)) ## (2)
dat[, cols[1:10] := shift(Column2, 10:1, fill=0)] ## (3)
dat[, cols[11] := Column2] ## (4)
dat[, cols[12:21] := shift(Column2, 1:10, fill=0, type="lead")] ## (5)
dat[Column1 == 0, (cols) := 0] ## (6)
Assuming dat
is your data.frame, setDT(dat)
converts it to a data.table, by reference (the data is not copied physically to a new location in memory, for efficiency).
Generate all the column names.
Generated lagged vectors of Column2
with periods 10:1
and assign it to the first 10 columns.
11th column is = Column2
.
Generated leading vectors of Column2
with periods 1:10
and assign it to the last 10 columns.
Get indices of all the rows where Column1 == 0
, and replace/reset all newly generated columns for those indices to 0
.
Use setDF(dat)
if you want a data.frame back.
You can wrap this in a function with the values -10:10
and choosing type="lag"
or type="lead"
accordingly, depending on whether the values are negative or positive.. I'll leave that to you.