
Keras BatchNorm: Training accuracy increases while Testing accuracy decreases


I am trying to use BatchNorm in Keras. The training accuracy increases over time, slowly but surely, from 12% to 20%. The test accuracy, however, decreases from 12% to 0%. The random baseline is 12%.

I strongly suspect the BatchNorm layers are the cause (removing them brings test accuracy back to ~12%), perhaps because the parameters gamma and beta are not initialized well. Is there anything special I have to take into account when applying BatchNorm? I don't understand what else could have gone wrong. I have the following model:

from keras.models import Sequential
from keras.layers import Activation, BatchNormalization, Conv2D, Dense, Reshape
from keras import optimizers, regularizers

model = Sequential()

model.add(BatchNormalization(input_shape=(16, 8)))
model.add(Reshape((16, 8, 1)))

#1. Conv (64 filters; 3x3 kernel)
model.add(default_Conv2D())
model.add(BatchNormalization(axis=3))
model.add(Activation('relu'))

#2. Conv (64 filters; 3x3 kernel)
model.add(default_Conv2D())
model.add(BatchNormalization(axis=3))
model.add(Activation('relu'))

... 

#8. Affine (NUM_GESTURES units) Output layer
model.add(default_Dense(NUM_GESTURES))
model.add(Activation('softmax'))


sgd = optimizers.SGD(lr=0.1)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
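My understanding of why BatchNorm could cause this: during training it normalizes each batch with that batch's own mean and variance while accumulating moving averages, but at test time it normalizes with the accumulated moving statistics instead. If those moving statistics were broken or never updated, training accuracy could keep improving while test accuracy collapses, which matches what I see. A minimal NumPy sketch of the standard algorithm (illustrative names, not Keras's actual implementation; momentum and eps mirror the Keras defaults):

import numpy as np

# gamma/beta are the learned scale and shift; moving_mean/moving_var are
# the running statistics that training updates and inference consumes.
def batchnorm(x, gamma, beta, moving_mean, moving_var,
              training, momentum=0.99, eps=1e-3):
    if training:
        # training mode: normalize with this batch's own statistics
        mean, var = x.mean(axis=0), x.var(axis=0)
        # exponential moving average update of the running statistics
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # inference mode: use the stored statistics, not this batch's
        mean, var = moving_mean, moving_var
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta, moving_mean, moving_var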

default_Conv2D and default_Dense are defined as follows:

def default_Conv2D():
    return Conv2D(
        filters=64,
        kernel_size=3,
        strides=1,
        padding='same',
        # activation=None,
        # use_bias=True,
        # kernel_initializer=RandomNormal(mean=0.0, stddev=0.01, seed=None), #RandomUniform(),
        kernel_regularizer=regularizers.l2(0.0001),
        # bias_initializer=RandomNormal(mean=0.0, stddev=0.01, seed=None), # RandomUniform(),
        # bias_regularizer=None
    )

def default_Dense(units):
    return Dense(
        units=units,
        # activation=None,
        # use_bias=True,
        # kernel_initializer=RandomNormal(mean=0.0, stddev=0.01, seed=None),#RandomUniform(),
        # bias_initializer=RandomNormal(mean=0.0, stddev=0.01, seed=None),#RandomUniform(),
        kernel_regularizer=regularizers.l2(0.0001),
        # bias_regularizer=None
    )
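For completeness, this is how the model is trained and evaluated. Here X_train, y_train, X_test and y_test stand in for my data (inputs of shape (n, 16, 8), one-hot labels over NUM_GESTURES classes), and the batch size and epoch count are placeholders:

# X_*: float arrays of shape (n, 16, 8); y_*: one-hot label arrays.
history = model.fit(X_train, y_train,
                    batch_size=32,
                    epochs=50,
                    validation_data=(X_test, y_test))
# 'acc' rises while 'val_acc' falls, i.e. the diverging curves above.
print(history.history['acc'][-1], history.history['val_acc'][-1])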

Solution

  • It seems that something was broken in the installed Keras release itself.

    A naive

    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
    

    did the trick.
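
    A quick way to confirm that the development build actually replaced the previously installed release is to check the reported version (the exact string depends on the checkout):

    python -c "import keras; print(keras.__version__)"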

    @wontonimo, thanks a lot for your really great answer!