
Tensorflow Probability Logistic Regression Example

I feel I must be missing something obvious: I am struggling to get a positive control for logistic regression working in TensorFlow Probability.

I've modified the example for logistic regression here, and created positive control features and labels. I struggle to achieve accuracy over 60%, even though this is an easy problem for a 'vanilla' Keras model (accuracy 100%). What am I missing? I tried different layers, activations, etc. With this method of setting up the model, is posterior updating actually being performed? Do I need to specify an interceptor object? Many thanks.

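import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# build_input_pipeline and FLAGS come from the linked TFP example script.

### Added positive control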
nSamples = 80
features1 = np.float32(np.hstack((np.reshape(np.ones(40), (40, 1)), 
        np.reshape(np.random.randn(nSamples), (40, 2)))))
features2 = np.float32(np.hstack((np.reshape(np.zeros(40), (40, 1)), 
        np.reshape(np.random.randn(nSamples), (40, 2)))))
features = np.vstack((features1, features2))
labels = np.concatenate((np.zeros(40), np.ones(40)))
featuresInt, labelsInt = build_input_pipeline(features, labels, 10)

#w_true, b_true, features, labels = toy_logistic_data(FLAGS.num_examples, 2) 
#featuresInt, labelsInt = build_input_pipeline(features, labels, FLAGS.batch_size)

with tf.name_scope("logistic_regression", values=[featuresInt]):
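    # Layer arguments as in the TFP logistic regression example.
    layer = tfp.layers.DenseFlipout(
        units=1,
        activation=None,
        kernel_posterior_fn=tfp.layers.default_mean_field_normal_fn(),
        bias_posterior_fn=tfp.layers.default_mean_field_normal_fn())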
    logits = layer(featuresInt)
    labels_distribution = tfd.Bernoulli(logits=logits)

neg_log_likelihood = -tf.reduce_mean(labels_distribution.log_prob(labelsInt))
kl = sum(layer.losses)
elbo_loss = neg_log_likelihood + kl

predictions = tf.cast(logits > 0, dtype=tf.int32)
accuracy, accuracy_update_op = tf.metrics.accuracy(
    labels=labelsInt, predictions=predictions)

with tf.name_scope("train"):
    optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)
    train_op = optimizer.minimize(elbo_loss)

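# tf.metrics.accuracy creates local variables, so initialize both kinds.
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())

with tf.Session() as sess:
    sess.run(init_op)

    # Fit the model to data.
    for step in range(FLAGS.max_steps):
        _ = sess.run([train_op, accuracy_update_op])
        if step % 100 == 0:
            loss_value, accuracy_value = sess.run([elbo_loss, accuracy])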
            print("Step: {:>3d} Loss: {:.3f} Accuracy: {:.3f}".format(
                step, loss_value, accuracy_value))

### Check with basic Keras
kerasModel = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1)])
optimizer = tf.train.AdamOptimizer(5e-2)
kerasModel.compile(optimizer = optimizer, loss = 'binary_crossentropy',
    metrics = ['accuracy'])
kerasModel.fit(features, labels, epochs = 50) #100% accuracy


  • Compared to the GitHub example, you forgot to divide by the number of examples when defining the KL divergence:

    kl = sum(layer.losses) / FLAGS.num_examples

    When I make this change to your code, I quickly get to an accuracy of 99.9% on your toy data.

    Additionally, the output layer of your Keras model should use a sigmoid activation for this problem (binary classification), because the 'binary_crossentropy' loss expects probabilities by default:

    kerasModel = tf.keras.models.Sequential([
        tf.keras.layers.Dense(1, activation='sigmoid')])

    It's a toy problem, but you will notice that the model gets to 100% accuracy faster with a sigmoid activation.
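
To see why the KL scaling matters, here is a minimal pure-Python sketch (not the TFP code). It assumes a standard normal prior, a posterior with unit variance, and a loss evaluated at the posterior mean; the helper names `mean_nll`, `kl_to_standard_normal` and `loss` are hypothetical. The negative log-likelihood is averaged per example, so the KL term has to be put on the same per-example scale.

```python
import math

NUM_EXAMPLES = 80  # same size as the toy data set in the question


def mean_nll(scale):
    """Per-example Bernoulli NLL when every example has logit margin scale / 2.

    With weights (-scale, 0, 0) and bias scale / 2, rows whose first feature
    is 1 (label 0) get logit -scale / 2 and rows whose first feature is 0
    (label 1) get logit +scale / 2, so every example has the same NLL.
    """
    return math.log1p(math.exp(-scale / 2.0))


def kl_to_standard_normal(means, sigma=1.0):
    """KL(N(m, sigma^2) || N(0, 1)), summed over all parameters."""
    return sum(0.5 * (sigma ** 2 + m ** 2 - 1.0) - math.log(sigma)
               for m in means)


def loss(scale, divide_kl):
    """Mean NLL plus KL, with the KL optionally divided by the data set size."""
    means = [-scale, 0.0, 0.0, scale / 2.0]  # three weights and one bias
    kl = kl_to_standard_normal(means)
    if divide_kl:
        kl /= NUM_EXAMPLES
    return mean_nll(scale) + kl


if __name__ == "__main__":
    for divide_kl in (False, True):
        print("divide_kl=%s  uninformative: %.3f  separating: %.3f" % (
            divide_kl, loss(0.0, divide_kl), loss(4.0, divide_kl)))
```

Under these assumptions, the unscaled loss prefers the uninformative weights (scale 0) over a solution that separates the classes (scale 4), while the scaled loss prefers the separating one. That is consistent with the low accuracy reported in the question.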