Hi, I am working with the Python datatable package and need to replace all the NA values after joining two frames.
Sample data (written in R data.table syntax):
DT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,3,6), v=1:9)
X = data.table(x=c("c","b"), v=8:7, foo=c(4,2))
X[DT, on="x"]
The code below replaces every 1 with 0:
DT.replace(1, 0)
How should I adapt it to replace NA? Or is there an option to make the join pad unmatched rows with 0 instead of NA? Thank you.
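Here is the same data built in Python datatable from plain Python lists and tuples, followed by the join:
from datatable import dt, join

# Left-hand frame: 9 rows, three groups of x
DT = dt.Frame(x = ["b"]*3 + ["a"]*3 + ["c"]*3,
              y = [1, 3, 6] * 3,
              v = range(1, 10))
# Right-hand frame to join against
X = dt.Frame({"x": ('c', 'b'),
              "v": (8, 7),
              "foo": (4, 2)})
X.key = "x"  # the joined frame must be keyed on the join column
merger = DT[:, :, join(X)]  # left join: every row of DT is kept
merger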
   x  y  v  v.0  foo
0  b  1  1    7    2
1  b  3  2    7    2
2  b  6  3    7    2
3  a  1  4   NA   NA
4  a  3  5   NA   NA
5  a  6  6   NA   NA
6  c  1  7    8    4
7  c  3  8    8    4
8  c  6  9    8    4
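In datatable, NA is the same thing as Python's None, which makes it easy to replace with 0. Note that replace() modifies the frame in place and returns nothing, so display the frame afterwards:
merger.replace(None, 0)
merger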
   x  y  v  v.0  foo
0  b  1  1    7    2
1  b  3  2    7    2
2  b  6  3    7    2
3  a  1  4    0    0
4  a  3  5    0    0
5  a  6  6    0    0
6  c  1  7    8    4
7  c  3  8    8    4
8  c  6  9    8    4
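If it helps to see why the unmatched "a" rows come back empty and what padding with 0 amounts to, here is a rough plain-Python model of the same left join. It does not use datatable at all; left_join_fill and its fill parameter are made-up names for illustration only, and the ".0" suffix just imitates the column naming seen in the output above.

```python
# Plain-Python model of a left join with a configurable fill value.
# left_join_fill is a hypothetical helper, not part of datatable.

def left_join_fill(left_rows, right_rows, key, fill=0):
    """Left-join two lists of dicts on `key`; unmatched cells get `fill`."""
    # Index the right-hand rows by key (the role X.key = "x" plays above).
    lookup = {row[key]: row for row in right_rows}
    # Columns the right-hand side contributes, excluding the join key.
    extra_cols = [c for c in right_rows[0] if c != key]
    joined = []
    for row in left_rows:
        match = lookup.get(row[key])  # None when the key has no partner
        out = dict(row)
        for col in extra_cols:
            # Rename clashing columns, imitating the "v.0" seen above.
            name = col + ".0" if col in row else col
            out[name] = match[col] if match is not None else fill
        joined.append(out)
    return joined

xs = ["b"] * 3 + ["a"] * 3 + ["c"] * 3
ys = [1, 3, 6] * 3
vs = range(1, 10)
DT_rows = [{"x": x, "y": y, "v": v} for x, y, v in zip(xs, ys, vs)]
X_rows = [{"x": "c", "v": 8, "foo": 4}, {"x": "b", "v": 7, "foo": 2}]

result = left_join_fill(DT_rows, X_rows, key="x")
for row in result:
    print(row)
```

Passing fill=None instead would reproduce the NA padding of the first table, which is why a separate replace step is needed after the join.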