python tensorflow tensorflow-probability

How to set bijectors for vectorized parameters in TensorFlow?

I am following the logic of the third example in this tutorial on GaussianProcessRegressionModel. One difference in my setup is that my amplitude and length_scale are vectors, and I am having difficulty setting up the bijectors for these vectorized parameters.

I tried one approach from the official tutorial (see the section 'Batching Bijectors').

They used

softplus = tfp.bijectors.Softplus(
  hinge_softness=[1., .5, .1])
print("Hinge softness shape:", softplus.hinge_softness.shape)

to give the Softplus bijector a non-scalar parameter shape. When I tried the same thing, the console kept showing the error message reproduced further down.

My compute_joint_log_prob_3 simply outputs the scalar log posterior probability given all the data and parameters, and I have tested that the function works well. The only problem is the setup of unconstrained_bijectors in the presence of vectorized kernel hyperparameters.

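import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors

# Initial chain state: one tensor per parameter being sampled.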
initial_chain_states = [
    tf.ones([1, num_GPs], dtype=tf.float32, name="init_amp_1"),
    tf.ones([1, num_GPs], dtype=tf.float32, name="init_scale_1"),
    tf.ones([1, num_GPs], dtype=tf.float32, name="init_amp_0"),
    tf.ones([1, num_GPs], dtype=tf.float32, name="init_scale_0"),
    tf.ones([], dtype=tf.float32, name="init_sigma_sq_1"),
    tf.ones([], dtype=tf.float32, name="init_sigma_sq_0")
]

vectorized_sp = tfb.Softplus(hinge_softness=np.ones([1, num_GPs], dtype=np.float32))

unconstrained_bijectors = [
    vectorized_sp,
    vectorized_sp,
    vectorized_sp,
    vectorized_sp,
    tfp.bijectors.Softplus(),
    tfp.bijectors.Softplus()
]

def un_normalized_log_posterior(amplitude_1, length_scale_1,
                                amplitude_0, length_scale_0,
                                noise_var_1, noise_var_0):
    return compute_joint_log_prob_3(
        para_index, delayed_signal, y_type,
        amplitude_1, length_scale_1, amplitude_0, length_scale_0,
        noise_var_1, noise_var_0
    )

num_results = 200
[
    amps_1,
    scales_1,
    amps_0,
    scales_0,
    sigma_sqs_1,
    sigma_sqs_0
], kernel_results = tfp.mcmc.sample_chain(
    num_results=num_results,
    num_burnin_steps=250,
    num_steps_between_results=3,
    current_state=initial_chain_states,
    kernel=tfp.mcmc.TransformedTransitionKernel(
        inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=un_normalized_log_posterior,
            step_size=np.float32(0.1),
            num_leapfrog_steps=3,
            step_size_update_fn=tfp.mcmc.make_simple_step_size_update_policy(
                num_adaptation_steps=100)),
        bijector=unconstrained_bijectors))

I expected this to work and to draw samples of these parameters. Instead, I got the following error messages:

Traceback (most recent call last):
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Requires start <= limit when delta > 0: 1/0 for 'mcmc_sample_chain/transformed_kernel_bootstrap_results/mh_bootstrap_results/hmc_kernel_bootstrap_results/maybe_call_fn_and_grads/value_and_gradients/softplus_10/forward_log_det_jacobian/range' (op: 'Range') with input shapes: [], [], [] and with computed input tensors: input[0] = <1>, input[1] = <0>, input[2] = <1>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/MMAR_q/MMAR_q.py", line 237, in <module>
    bijector=unconstrained_bijectors))
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/sample.py", line 235, in sample_chain
    previous_kernel_results = kernel.bootstrap_results(current_state)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/transformed_kernel.py", line 344, in bootstrap_results
    transformed_init_state))
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/hmc.py", line 518, in bootstrap_results
    kernel_results = self._impl.bootstrap_results(init_state)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/metropolis_hastings.py", line 264, in bootstrap_results
    pkr = self.inner_kernel.bootstrap_results(init_state)
  File "/MAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/hmc.py", line 687, in bootstrap_results
    ] = mcmc_util.maybe_call_fn_and_grads(self.target_log_prob_fn, init_state)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/util.py", line 237, in maybe_call_fn_and_grads
    result, grads = _value_and_gradients(fn, fn_arg_list, result, grads)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/util.py", line 185, in _value_and_gradients
    result = fn(*fn_arg_list)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/transformed_kernel.py", line 204, in new_target_log_prob
    event_ndims=event_ndims)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/transformed_kernel.py", line 51, in fn
    for b, e, sp in zip(bijector, event_ndims, transformed_state_parts)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/mcmc/transformed_kernel.py", line 51, in <listcomp>
    for b, e, sp in zip(bijector, event_ndims, transformed_state_parts)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/bijectors/bijector.py", line 1205, in forward_log_det_jacobian
    return self._call_forward_log_det_jacobian(x, event_ndims, name)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/bijectors/bijector.py", line 1177, in _call_forward_log_det_jacobian
    kwargs=kwargs)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/bijectors/bijector.py", line 982, in _compute_inverse_log_det_jacobian_with_caching
    event_ndims)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/bijectors/bijector.py", line 1272, in _reduce_jacobian_det_over_event
    axis=self._get_event_reduce_dims(min_event_ndims, event_ndims))
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow_probability/python/bijectors/bijector.py", line 1284, in _get_event_reduce_dims
    return tf.range(-reduce_ndims, 0)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py", line 1199, in range
    return gen_math_ops._range(start, limit, delta, name=name)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6746, in _range
    "Range", start=start, limit=limit, delta=delta, name=name)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/MMAR_q/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Requires start <= limit when delta > 0: 1/0 for 'mcmc_sample_chain/transformed_kernel_bootstrap_results/mh_bootstrap_results/hmc_kernel_bootstrap_results/maybe_call_fn_and_grads/value_and_gradients/softplus_10/forward_log_det_jacobian/range' (op: 'Range') with input shapes: [], [], [] and with computed input tensors: input[0] = <1>, input[1] = <0>, input[2] = <1>.

I don't know exactly what the input shapes at the end of the message mean. Thank you for your time and explanation.

------- Update -------

After discussing with Brian, I know where I went wrong. The error message probably means that the output of compute_joint_log_prob_3 is not a scalar but has some other shape.

As Brian said yesterday, Softplus() broadcasts automatically against the tensor it is applied to. If I want to change its softness, I can set hinge_softness=....

I also gained a deeper understanding after reading the tutorial on TensorFlow Distributions shapes.

Thank you again for your clarification... What a bright day it is now that I know where I went wrong...

Solution

If you just want the same softplus with hinge softness of 1, the bijector will broadcast and you can just write:

vectorized_sp = tfb.Softplus(hinge_softness=np.float32(1))

Also note that the default is one, so even simpler:

vectorized_sp = tfb.Softplus()
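To make the broadcasting concrete, here is a minimal NumPy sketch of the softplus transform, assuming the form y = c * log(1 + exp(x / c)) with hinge softness c. The helper `softplus` and the value of `num_GPs` are illustrative stand-ins, not TFP code.

```python
import numpy as np

def softplus(x, hinge_softness=1.0):
    """Softplus with hinge softness c: y = c * log(1 + exp(x / c))."""
    x = np.asarray(x, dtype=np.float32)
    c = np.asarray(hinge_softness, dtype=np.float32)
    # x and c are combined with ordinary broadcasting rules.
    return c * np.log1p(np.exp(x / c))

num_GPs = 3  # illustrative value
x = np.zeros([1, num_GPs], dtype=np.float32)

# A scalar hinge softness broadcasts against a [1, num_GPs] input.
y_shared = softplus(x)
print(y_shared.shape)  # (1, 3)

# A vector hinge softness is only needed for a different softness per element.
y_per_element = softplus(x, hinge_softness=[1., .5, .1])
print(y_per_element.shape)  # (1, 3)
```

In both cases the output keeps the shape of the state, so a parameter of shape [1, num_GPs] is not needed just to transform a state of that shape.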

Separately, I'd suggest looking at the tfp.mcmc.SimpleStepSizeAdaptation kernel (it might only be available via pip install tfp-nightly currently).

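I think the actual exception you are seeing is probably caused by the bijector parameter shape conflicting somehow with your latent state shape, or by the shape of the log_prob itself. The transformed transition kernel needs to reduce each bijector's log-det-Jacobian over the event dims of the corresponding latent. The event_ndims for each latent is derived by treating the rank of the log_prob you return from target_log_prob_fn as the batch rank, i.e. any trailing dimensions of the state beyond that rank are treated as event dimensions and reduced by the bijector. In your traceback, tf.range(-reduce_ndims, 0) is called with start 1 and limit 0, so reduce_ndims is -1, which is what you would get if the log_prob had a higher rank than one of the state parts (for example, the scalar noise variances).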
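The rank bookkeeping described above can be sketched in plain Python. This is an illustration only: `event_reduce_dims` is a hypothetical helper, and it assumes that event_ndims is the state rank minus the log_prob rank and that Softplus has a minimum event rank of 0. The final range mirrors the tf.range(-reduce_ndims, 0) call visible in the traceback.

```python
def event_reduce_dims(state_rank, log_prob_rank, min_event_ndims=0):
    """Return the axes a bijector's log-det-Jacobian would be summed over."""
    # Assumption: dimensions of the state beyond the log_prob's rank are
    # treated as event dimensions.
    event_ndims = state_rank - log_prob_rank
    reduce_ndims = event_ndims - min_event_ndims
    if reduce_ndims < 0:
        # tf.range(-reduce_ndims, 0) would be asked for start > limit here.
        raise ValueError("Requires start <= limit: %d/0" % -reduce_ndims)
    return list(range(-reduce_ndims, 0))

# State of shape [1, num_GPs] (rank 2) with a scalar log_prob (rank 0).
print(event_reduce_dims(state_rank=2, log_prob_rank=0))  # [-2, -1]

# Scalar state (rank 0) with a scalar log_prob: nothing to reduce.
print(event_reduce_dims(state_rank=0, log_prob_rank=0))  # []

# Scalar state with a rank-1 log_prob reproduces the failing 1/0 case.
try:
    event_reduce_dims(state_rank=0, log_prob_rank=1)
except ValueError as err:
    print(err)  # Requires start <= limit: 1/0
```

Under these assumptions, returning a scalar from target_log_prob_fn keeps every state part, including the scalar ones, at a non-negative event rank.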

Can you say a bit more about what you're trying to do? It looks like you're trying to run a single chain of MCMC over a bunch of GP kernel hyperparameters. It's pretty hard to offer much help without seeing the internals of compute_joint_log_prob_3.