I have read a similar post on SO, but was not able to adapt the answer to my specific case. I am working with time series data and would like to combine two different data sets into the same plot. Although I could combine the data into one dataframe, I am really interested in understanding how to reference multiple datasets.
Mock Data:
require(ggvis)

dfa <- data.frame(
  date_a = seq(from = as.Date("2015-06-10"),
               to = as.Date("2015-07-01"), by = 1),
  val_a = c(2585.150, 2482.200, 3780.186, 3619.601,
            0.000, 0.000, 3509.734, 3020.405,
            3271.897, 3019.003, 3172.084, 0.000,
            0.000, 3319.927, 2673.428, 3331.382,
            3886.957, 2859.887, 0.000, 0.000,
            2781.443, 2847.377) )

dfb <- data.frame(
  date_b = seq(from = as.Date("2015-07-02"),
               to = as.Date("2015-07-15"), by = 1),
  val_b = c(3250.75429, 3505.43477, 3208.69141,
            -2.08175, -27.30244, 3324.62348,
            2820.91075, 3250.75429, 3505.43477,
            3208.69141, -2.08175, -27.30244,
            3324.62348, 2820.91075) )
Using the data provided above, I am able to create separate plots with the code below:
Separate Plots: (Works)
dfa %>%
  ggvis( x = ~date_a, y = ~val_a, stroke := "black", opacity := 0.5 ) %>%
  scale_datetime("x", nice = "month", domain = c(as.Date("2015-06-10"),
                                                 as.Date("2015-07-15") )) %>%
  layer_lines() %>% layer_points( fill := "black" )

dfb %>%
  ggvis( x = ~date_b, y = ~val_b, stroke := "red", opacity := 0.5 ) %>%
  scale_datetime("x", nice = "month", domain = c(as.Date("2015-06-10"),
                                                 as.Date("2015-07-15") )) %>%
  layer_lines() %>% layer_points( fill := "red" )
The desired output is both lines (black and red) on the same plot. Here are a couple of unsuccessful attempts:
Attempt #1 adapted from SO post:
ggvis( data = dfa, x = ~date_a, y = ~val_a) %>% layer_lines(stroke := "black", opacity := 0.5 ) %>%
layer_lines( data = dfb, x= ~date_b , y= ~val_b, stroke := "red",
opacity := 0.5 ) %>%
scale_datetime("x", nice = "month", domain = c(as.Date("2015-06-10"),
as.Date("2015-07-15") ))
## Error in new_prop.default(x, property, scale, offset, mult, env, event, :
## Unknown input to prop: c(16618, 16619, 16620, 16621, 16622, 16623, 16624, ...
Attempt #2 based on RStudio documentation:
ggvis( data = NULL, x = ~date_a, y = ~val_a) %>%
layer_lines(stroke := "black", opacity := 0.5, data = dfa ) %>%
layer_lines( x= ~date_b , y= ~val_b, stroke := "red",
opacity := 0.5, data = dfb ) %>%
scale_datetime("x", nice = "month", domain = c(as.Date("2015-06-10"),
as.Date("2015-07-15") ))
## Error in func() : attempt to apply non-function
Here is a minimalistic implementation in ggplot2:
require(ggplot2)
ggplot() +
geom_line(data = dfa, aes(x = date_a, y = val_a ), colour = "black") +
geom_line(data = dfb, aes(x = date_b, y = val_b ), colour = "red")
Again, a working solution and brief explanation would be greatly appreciated. Thank you in advance for the help.
Well, it looks like layer_lines may not be properly taking the data argument. I think you can successfully use layer_paths here. They work similarly, but layer_paths draws the points in the order they appear in the data, so you'd need to make sure your time series are sorted by date before plotting.
First, when I look at the source of layer_paths, it has an explicit data argument, like many other layer functions.
layer_paths
function (vis, ..., data = NULL)
{
add_mark(vis, "line", props(..., env = parent.frame()), data,
deparse2(substitute(data)))
}
<environment: namespace:ggvis>
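While layer_lines has the ... for more arguments, it doesn't have a data argument, and passing one doesn't seem to work. Judging from the source below, a data = ... argument would fall into ... and be handed to props() rather than being used as the layer's data.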
layer_lines
function (vis, ...)
{
x_var <- vis$cur_props$x$value
layer_f(vis, function(x) {
x <- auto_group(x, exclude = c("x", "y"))
x <- dplyr::arrange_(x, x_var)
emit_paths(x, props(...))
})
}
<environment: namespace:ggvis>
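The two signatures can also be compared without reading the printed source. This is only a minimal sketch, assuming ggvis is installed; the variable names has_data_paths and has_data_lines are made up for illustration, and the result may differ between ggvis versions, so no output is shown.

```r
library(ggvis)

# formals() returns a function's argument list; names() gives the argument names.
# Check whether each layer function declares an explicit `data` argument.
has_data_paths <- "data" %in% names(formals(layer_paths))
has_data_lines <- "data" %in% names(formals(layer_lines))

print(c(layer_paths = has_data_paths, layer_lines = has_data_lines))
```

If layer_lines reports FALSE, any data = ... passed to it can only arrive through the ... argument.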
To test, I made a really basic graph, trying to use the data argument in layer_lines.
ggvis() %>%
layer_lines(data = dfb, x= ~date_b , y= ~val_b, stroke := "red")
This fails with an error.
Error in func() : attempt to apply non-function
Here's the same code using layer_paths instead:
ggvis() %>%
layer_paths(data = dfb, x= ~date_b , y= ~val_b, stroke := "red")
So, that works, which means that as long as your dataset is ordered by date, your graphic should work fine if you just replace layer_lines with layer_paths.
ggvis(data = dfa, x = ~date_a, y = ~val_a) %>%
layer_paths(stroke := "black", opacity := 0.5 ) %>%
layer_paths(data = dfb, x = ~date_b , y= ~val_b, stroke := "red",
opacity := 0.5 ) %>%
scale_datetime("x", nice = "month", domain = c(as.Date("2015-06-10"), as.Date("2015-07-15") ))
It seems odd to me that layer_lines does not accept a data argument, so I may have missed something. I didn't see anything in the open or closed issues on the ggvis GitHub page, so you might consider filing one.
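Since layer_paths connects points in row order, data that is not already sorted by date needs sorting first. Here is a minimal sketch in base R; dfb_shuffled and dfb_sorted are hypothetical names, and the three rows are a deliberately shuffled subset of the mock data.

```r
# Hypothetical example: a few rows of dfb arriving out of date order.
dfb_shuffled <- data.frame(
  date_b = as.Date(c("2015-07-04", "2015-07-02", "2015-07-03")),
  val_b  = c(3208.69141, 3250.75429, 3505.43477)
)

# order() returns the row permutation that sorts by date;
# indexing the data frame with it reorders the rows.
dfb_sorted <- dfb_shuffled[order(dfb_shuffled$date_b), ]

print(dfb_sorted)
```

The sorted data frame can then be passed as layer_paths(data = dfb_sorted, ...) in place of dfb. The mock data in the question is generated with seq(), so it is already in date order and does not need this step.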