I'm going through the "Introduction to Modeling in Gen" Notebook at https://github.com/probcomp/gen-quickstart
Section 5 (Calling other generative functions) asks to "Construct a data set for which it is ambiguous whether the line or sine wave model is best"
I'm having a hard time understanding how to work with the traces (and return values) of the component functions to create a meaningful higher-order trace that I can use.
To me the most straightforward "ambiguous" model is line(xs) .+ sine(xs). So I called Gen.simulate on line and sine to get the traces and added them together, like this:
@gen function combo(xs::Vector{Float64})
    my_sin = simulate(sine_model_2,(xs,))
    my_lin = simulate(line_model_2,(xs,))
    if @trace(bernoulli(0.5), :is_line)
        @trace(normal(get_choices(my_lin)[:slope], 0.01), :slope)
        @trace(normal(get_choices(my_lin)[:intercept], 0.01), :intercept)
        @trace(normal(get_choices(my_lin)[:noise], 0.01), :noise)
    else
        @trace(normal(get_choices(my_sin)[:phase], 0.01), :phase)
        @trace(normal(get_choices(my_sin)[:period], 0.01), :period)
        @trace(normal(get_choices(my_sin)[:amplitude], 0.01), :amplitude)
        @trace(normal(get_choices(my_sin)[:noise], 0.01), :noise)
    end
    combo = [get_choices(my_sin)[(:y, i)] + get_choices(my_lin)[(:y, i)] for i=1:length(xs)]
    for (i, c) in enumerate(combo)
        @trace(normal(c, 0.1), (:y, i))
    end
end;
This is clearly wrong and I know I'm missing something fundamental in the whole idea of traces and prob programming in Gen.
I'd expect to be able to introspect sine/line_model's trace from within combo, and do element-wise addition on the traces to get a new trace. And not have to randomly pick a number close to :intercept, :phase, etc. so I can include it in my trace later on.
By the way, when I do:
traces = [Gen.simulate(combo,(xs,)) for _=1:12];
grid(render_combined, traces)
Please help thanks!
Hi there — thanks for your interest in Gen! :)
Addresses of the combined model's trace
The combined model from the tutorial looks like this:
@gen function combined_model(xs::Vector{Float64})
    if @trace(bernoulli(0.5), :is_line)
        @trace(line_model_2(xs))
    else
        @trace(sine_model_2(xs))
    end
end;
Its traces will have the following addresses:
- :is_line, storing a Boolean indicating whether the generated dataset was linear or not.
- The addresses of line_model_2 or sine_model_2, depending on which was called.
Note that traces of both line_model_2 and sine_model_2 contain the addresses (:y, i) for each integer i between 1 and length(xs). Because of this, so will combined_model's traces: these are the addresses representing the final sampled y values, regardless of which of the two processes generated them.
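To see these addresses concretely, here is a minimal sketch. The functions toy_line, toy_sine, and toy_combined are simplified stand-ins made up for illustration (their priors and fixed noise level are assumptions, not the notebook's exact definitions); only the address structure matters.

```julia
using Gen

# Hypothetical stand-in for line_model_2 (priors are assumptions).
@gen function toy_line(xs::Vector{Float64})
    slope = @trace(normal(0, 1), :slope)
    intercept = @trace(normal(0, 2), :intercept)
    for (i, x) in enumerate(xs)
        @trace(normal(slope * x + intercept, 0.1), (:y, i))
    end
end

# Hypothetical stand-in for sine_model_2 (priors are assumptions).
@gen function toy_sine(xs::Vector{Float64})
    period = @trace(gamma(5, 1), :period)
    amplitude = @trace(gamma(1, 1), :amplitude)
    for (i, x) in enumerate(xs)
        @trace(normal(amplitude * sin(2 * pi * x / period), 0.1), (:y, i))
    end
end

# Same shape as combined_model: the callee's addresses land in the caller's trace.
@gen function toy_combined(xs::Vector{Float64})
    if @trace(bernoulli(0.5), :is_line)
        @trace(toy_line(xs))
    else
        @trace(toy_sine(xs))
    end
end

xs = [-1.0, 0.0, 1.0]
trace = Gen.simulate(toy_combined, (xs,))
choices = Gen.get_choices(trace)

is_line = trace[:is_line]                            # which branch was taken
sampled_ys = [trace[(:y, i)] for i in 1:length(xs)]  # present in both branches
has_slope = Gen.has_value(choices, :slope)           # only present in the line branch
```

The (:y, i) lookups succeed whichever branch ran, while :slope exists only when :is_line is true.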
Constructing a new dataset
The question to "construct a data set for which it is ambiguous whether the line or sine wave model is best" does not require writing a new generative function (with @gen), but rather, constructing a list of xs and a list of ys (in plain Julia) that you think might make a difficult-to-disambiguate dataset. You can then pass your xs and ys into the do_inference function defined earlier in the notebook, to see what the system concludes about your dataset. Note that the do_inference function constructs a constraint choicemap that constrains each (:y, i) to the value ys[i] from the dataset you passed in. This works because (:y, i) is always the name of the ith datapoint, no matter the value of :is_line.
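As one possible candidate (a sketch, not the exercise's intended answer), you could take a short stretch of a sine wave, which looks almost like a straight line. The specific range, amplitude, and frequency below are arbitrary choices, and the least-squares fit is only there to check how close to linear the points are.

```julia
# A short stretch of a sine wave: nearly linear over this range.
xs = collect(range(-5.0, 5.0; length=11))
ys = 2.0 .* sin.(xs ./ 4.0)

# Compare against the best-fit straight line (ordinary least squares).
A = hcat(xs, ones(length(xs)))
slope, intercept = A \ ys
max_residual = maximum(abs.(ys .- (slope .* xs .+ intercept)))

# With the notebook loaded, you could then try something like the following
# (this assumes do_inference takes the model, xs, ys, and an amount of computation):
# trace = do_inference(combined_model, xs, ys, 10000)
```

If max_residual is small compared to the spread of ys, a line explains the points nearly as well as the sine that produced them, which is the kind of ambiguity the exercise asks for.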
Updating / manipulating traces
You write:
I'd expect to be able to introspect sine/line_model's trace from within combo, and do element-wise addition on the traces to get a new trace. And not have to randomly pick a number close to :intercept, :phase, etc. so I can include it in my trace later on.
You can certainly call simulate twice to get two traces, outside a generative function like combo. But traces cannot be manipulated in arbitrary ways (e.g. "elementwise addition"): as data structures, traces maintain certain invariants, like always knowing the exact probability of their current values under the model that generated them, and always holding values that actually could have been generated from the model.
The dictionary-like data structure you're looking for is a choicemap. Choicemaps are mutable and can be built up to include arbitrary values at arbitrary addresses. For example, you can write:
observations = Gen.choicemap()
for (i, y) in enumerate(ys)
    observations[(:y, i)] = y
end
Choicemaps can be used as constraints to generate new traces (using Gen.generate), as arguments to Gen's low-level Gen.update method (which allows you to update a trace while recomputing any relevant probabilities, and raises an error if your updates are invalid), and in several other places.
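Here is a minimal sketch of both uses. The model toy_line_model and the data values are made up for illustration (they are assumptions, not the notebook's definitions).

```julia
using Gen

# Hypothetical stand-in model (priors and noise level are assumptions).
@gen function toy_line_model(xs::Vector{Float64})
    slope = @trace(normal(0, 1), :slope)
    intercept = @trace(normal(0, 2), :intercept)
    for (i, x) in enumerate(xs)
        @trace(normal(slope * x + intercept, 0.1), (:y, i))
    end
end

xs = [-1.0, 0.0, 1.0]
ys = [-2.1, 0.05, 1.9]

# Build a choicemap that pins each (:y, i) to an observed value.
observations = Gen.choicemap()
for (i, y) in enumerate(ys)
    observations[(:y, i)] = y
end

# generate returns a trace consistent with the constraints, plus a log weight.
(trace, weight) = Gen.generate(toy_line_model, (xs,), observations)

# update returns a new trace with :slope changed; the old value goes into `discard`.
new_constraints = Gen.choicemap((:slope, 2.0))
(new_trace, update_weight, _, discard) =
    Gen.update(trace, (xs,), (NoChange(),), new_constraints)

old_slope = discard[:slope]           # the value that was replaced
new_score = Gen.get_score(new_trace)  # log probability, recomputed by Gen
```

The original trace is left unchanged; update returns a new trace whose score reflects the new slope.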
Hope that helps :)