PyMC3 binomial switchpoint model highly dependent on testval

I've set up the following binomial switchpoint model in PyMC3:

import pymc3 as pm

# df is a pandas DataFrame with columns 'covariate', 'trials' and 'successes'
with pm.Model() as switchpoint_model:

    switchpoint = pm.DiscreteUniform('switchpoint', lower=df['covariate'].min(), upper=df['covariate'].max())

    # Priors for pre- and post-switch parameters
    early_rate = pm.Beta('early_rate', 1, 1)
    late_rate = pm.Beta('late_rate', 1, 1)

    # Use early_rate where covariate <= switchpoint, late_rate otherwise
    p = pm.math.switch(switchpoint >= df['covariate'].values, early_rate, late_rate)

    p = pm.Deterministic('p', p)

    y = pm.Binomial('y', p=p, n=df['trials'].values, observed=df['successes'].values)

It runs without errors, but the posterior for the switchpoint collapses onto a single value (999), as shown below.

On further investigation, the results seem to depend heavily on the starting value (in PyMC3, "testval"). Here is what happens when I set testval=750:

switchpoint = pm.DiscreteUniform('switchpoint', lower=df['covariate'].min(),
                                 upper=df['covariate'].max(), testval=750)

Other starting values give yet other results.

For context, this is what my dataset looks like:

My questions are:

Is my model somehow incorrectly specified?
If it's correctly specified, how should I interpret these results? In particular, how do I compare or choose between results generated by different testvals? The only idea I've had is to use WAIC to evaluate out-of-sample performance...

Solution

Models with discrete variables can be problematic: the nice sampling techniques that rely on derivatives don't work for them anymore, so a non-gradient step method has to be used instead, and the chain can get stuck much as it would on a multimodal distribution. I don't really see why this would be that problematic in this case, but you could try using a continuous variable for the switchpoint instead (wouldn't that also make more sense conceptually?).
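
To tell whether the dependence on testval comes from the posterior itself or from the sampler getting stuck, you could also compute the switchpoint posterior exactly. Below is a minimal sketch in plain Python, assuming the Beta(1, 1) priors from the question, so that both rates integrate out in closed form and each candidate switchpoint can simply be scored and normalised. The data are made up for illustration (they are not the dataset from the question), and the helper names log_beta and switchpoint_posterior are my own, not part of PyMC3.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def switchpoint_posterior(covariate, trials, successes, candidates):
    """Exact posterior over a discrete switchpoint under a uniform prior.

    Both rates have Beta(1, 1) priors and are integrated out analytically.
    Binomial coefficients are dropped: they do not depend on the switchpoint.
    """
    log_post = []
    for s in candidates:
        early_k = early_f = late_k = late_f = 0
        for x, n, k in zip(covariate, trials, successes):
            # Same allocation rule as the model: switchpoint >= covariate is "early"
            if s >= x:
                early_k += k
                early_f += n - k
            else:
                late_k += k
                late_f += n - k
        log_post.append(
            log_beta(1 + early_k, 1 + early_f) + log_beta(1 + late_k, 1 + late_f)
        )
    # Normalise on the log scale to avoid underflow
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical data: the success rate jumps from 0.3 to 0.5 after covariate 60
covariate = list(range(1, 101))
trials = [50] * len(covariate)
successes = [15 if x <= 60 else 25 for x in covariate]

candidates = list(range(min(covariate), max(covariate) + 1))
posterior = switchpoint_posterior(covariate, trials, successes, candidates)

best_prob = max(posterior)
best = candidates[posterior.index(best_prob)]
print("most probable switchpoint:", best, "posterior mass:", round(best_prob, 3))
```

If the exact posterior is concentrated on one value while the chains land somewhere different for each testval, the problem is mixing rather than the model specification. To run this on real data, pass the 'covariate', 'trials' and 'successes' columns of df as lists.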