I have 2 datasets with time series.
In dataset 1, there are 3 columns: Date, price changes, volume. It has 1056 rows sorted by date, running from 01-01-2005 to 31-12-2015.
In dataset 2, there are 3 columns: Date, price changes, volume. It is also sorted by date and runs from 01-01-2005 to 31-12-2015, but it has only 1028 rows because there is missing data (i.e. missing rows).
I would like to remove from dataset 1 the rows whose dates do not appear in dataset 2.
I have tried this but it does not work:
dataset1[!rownames(dataset1) %in% dataset2$Date, ]
The output has 1056 rows, so it does not erase anything.
You just need to do dataset1[dataset1$Date %in% dataset2$Date, ]:
set.seed(1)
# Eleven consecutive dates, each with a random value
d1 <- as.Date('2015-01-01') + 0:10
x <- sample(1:10, 11, replace = TRUE)
df1 <- data.frame(d1, x)
df1
           d1  x
1  2015-01-01  3
2  2015-01-02  4
3  2015-01-03  6
4  2015-01-04 10
5  2015-01-05  3
6  2015-01-06  9
7  2015-01-07 10
8  2015-01-08  7
9  2015-01-09  7
10 2015-01-10  1
11 2015-01-11  3
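# Every third date only, to mimic the dataset with missing rows
d2 <- as.Date('2015-01-01') + seq(0, 10, 3)
y <- sample(1:10, 4)
df2 <- data.frame(d2, y)
# Keep the rows of df1 whose date also appears in df2
df1[df1$d1 %in% df2$d2, ]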
           d1  x
1  2015-01-01  3
4  2015-01-04 10
7  2015-01-07 10
10 2015-01-10  1
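As for why the original attempt kept all 1056 rows: a likely explanation, assuming dataset1 has the default row names, is that rownames(dataset1) returns "1", "2", ... rather than dates. None of those strings match a date, so the %in% test is FALSE everywhere and negating it selects every row. Here is a minimal sketch of that; the two data frames and their values are made-up stand-ins for yours, with only the Date column name taken from the question:

```r
# Hypothetical stand-ins for dataset1 and dataset2 (values are made up)
dataset1 <- data.frame(Date = as.Date("2015-01-01") + 0:10, volume = 11:21)
dataset2 <- data.frame(Date = as.Date("2015-01-01") + seq(0, 10, 3), volume = 1:4)

# Default row names are the strings "1", "2", ..., not the dates
rownames(dataset1)

# No row name matches a date, so this is FALSE for every row
matches <- rownames(dataset1) %in% dataset2$Date

# Negating all-FALSE selects everything: nothing is removed
attempt <- dataset1[!matches, ]
nrow(attempt)  # 11, same as nrow(dataset1)

# Comparing the Date columns directly (and not negating) keeps the shared dates
kept <- dataset1[dataset1$Date %in% dataset2$Date, ]
nrow(kept)     # 4
```

On your real data, nrow of the result should equal nrow(dataset2), i.e. 1028, provided the dates are unique and every date in dataset2 also occurs in dataset1. If it does not, it is worth checking that both Date columns have the same class (for example with class(dataset1$Date)), since a character column and a Date column may not compare as expected.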