I have monthly observed and modeled data organized as vectors:
obs <- structure(c(68.72228685137, 68.4565130874024, 68.3237563140977,
66.1789683147099, 63.7162100107148, 59.9698454002755), .Names = c("X1901.01.01",
"X1901.02.01", "X1901.03.01", "X1901.04.01", "X1901.05.01", "X1901.06.01"
))
mod <- structure(c(71.5796750030741, 71.5925210418478, 70.8672045288309,
67.9705857323206, 68.462614970737, 67.7095309202574), .Names = c("X1899.11.01",
"X1899.12.01", "X1901.01.01", "X1901.02.01", "X1901.03.01", "X1901.04.01"
))
where X1901.01.01 corresponds to 1901-01-01 and so on. Please note that dates in observed and modeled data don't overlap completely.
This is just a sample - my real data contains thousands of observations.
What is the most efficient (i.e. fastest) way to combine these vectors in a data frame assigning NA
to non-matching dates and getting rid of the infamous "X" in the front of the original dates?
This would be the resulting data frame:
date obs mod
1899.11.01 NA 71.57968
1899.12.01 NA 71.59252
1901.01.01 68.72229 70.86720
1901.02.01 68.45651 67.97059
1901.03.01 68.32376 68.46261
1901.04.01 66.17897 67.70953
1901.05.01 63.71621 NA
1901.06.01 59.96985 NA
While @Alex A.'s answer works, since it's date/time data it might be beneficial to treat it this way from the beginning. You can easily merge these using the merge() function with the all=TRUE flag set, which will merge on any identical column names:
obs <- as.data.frame(obs)
mod <- as.data.frame(mod)
obs[["date"]] <- as.Date(row.names(obs), "X%Y.%m.%d")
mod[["date"]] <- as.Date(row.names(mod), "X%Y.%m.%d")
d <- merge(obs, mod, all=TRUE)
Since the date columns are date/time class you could then easily convert the data.frame to an xts time series or something else for subsetting, summarizing, etc.