Tags: theano, conv-neural-network, lasagne, nolearn

Convolutional Neural Network - Visualizing weights


Main Problem

I cannot understand the plot of the weights of a specific layer. I used a method from nolearn: plot_conv_weights(layer, figsize=(6, 6))

I'm using Lasagne as my neural-network library.

The plot comes out fine, but I don't know how I should interpret it.
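For context, this is roughly how such a plot is produced (a minimal sketch; `net` is assumed to be an already-trained nolearn NeuralNet, and the layer name 'conv2d1' is only a placeholder):

    import matplotlib.pyplot as plt
    from nolearn.lasagne.visualize import plot_conv_weights

    # `net` is assumed to be a trained nolearn NeuralNet; net.layers_ maps
    # layer names to layer objects, and 'conv2d1' is a placeholder name.
    plot_conv_weights(net.layers_['conv2d1'], figsize=(6, 6))
    plt.show()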

Neural Network Structure

The structure I'm using:

InputLayer  1x31x31

Conv2DLayer 20x3x3  
Conv2DLayer 20x3x3  
Conv2DLayer 20x3x3  

MaxPool2DLayer  2x2 

Conv2DLayer 40x3x3  
Conv2DLayer 40x3x3
Conv2DLayer 40x3x3  

MaxPool2DLayer  40x2x2  


DropoutLayer        

DenseLayer  96  
DropoutLayer    96  

DenseLayer  32  
DropoutLayer    32  

DenseLayer  1 as sigmoid
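For reference, the same stack written out in Lasagne looks roughly like this (a sketch assuming 3x3 filters, 2x2 pooling, and default dropout probabilities; the variable names are mine):

    import lasagne
    from lasagne.layers import (InputLayer, Conv2DLayer, MaxPool2DLayer,
                                DropoutLayer, DenseLayer)
    from lasagne.nonlinearities import sigmoid

    l = InputLayer(shape=(None, 1, 31, 31))

    for _ in range(3):                       # three 20-filter 3x3 conv layers
        l = Conv2DLayer(l, num_filters=20, filter_size=(3, 3))
    l = MaxPool2DLayer(l, pool_size=(2, 2))

    for _ in range(3):                       # three 40-filter 3x3 conv layers
        l = Conv2DLayer(l, num_filters=40, filter_size=(3, 3))
    l = MaxPool2DLayer(l, pool_size=(2, 2))

    l = DropoutLayer(l)
    l = DenseLayer(l, num_units=96)
    l = DropoutLayer(l)
    l = DenseLayer(l, num_units=32)
    l = DropoutLayer(l)
    l = DenseLayer(l, num_units=1, nonlinearity=sigmoid)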

Here are the weights of the first 3 layers:

"Image one "

"Image two "

"Image Three "

About the Images

To me they look random, and I cannot interpret them!

However, CS231n says the following:

Conv/FC Filters. The second common strategy is to visualize the weights. These are usually most interpretable on the first CONV layer which is looking directly at the raw pixel data, but it is possible to also show the filter weights deeper in the network. The weights are useful to visualize because well-trained networks usually display nice and smooth filters without any noisy patterns. Noisy patterns can be an indicator of a network that hasn't been trained for long enough, or possibly a very low regularization strength that may have led to overfitting. (http://cs231n.github.io/understanding-cnn/)

Then why do mine look random?

The structure is trained and performs well for its task.
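One way to double-check, independent of nolearn's plotting code, is to pull the first conv layer's weights out of the trained network and plot them directly. A minimal matplotlib sketch (assuming `output_layer` is the final layer of the trained network):

    import matplotlib.pyplot as plt
    import lasagne

    # `output_layer` is assumed to be the last layer of the trained network.
    # get_all_param_values returns [W1, b1, W2, b2, ...]; W1 has shape
    # (num_filters, num_input_channels, filter_rows, filter_cols).
    W1 = lasagne.layers.get_all_param_values(output_layer)[0]

    fig, axes = plt.subplots(4, 5, figsize=(6, 6))  # 20 filters -> 4x5 grid
    for ax, w in zip(axes.flat, W1):
        ax.imshow(w[0], cmap='gray', interpolation='none')
        ax.axis('off')
    plt.show()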

References

http://cs231n.github.io/understanding-cnn/

https://github.com/dnouri/nolearn/blob/master/nolearn/lasagne/visualize.py

Solution

  • Normally when you visualize the weights you want to check two things:

    • That they are smooth and cover a wide range of values, i.e. not just a bunch of 1's and 0's. That would mean the non-linearity is being saturated (a quick numeric check is sketched at the end of this answer).
    • That they have some kind of structure. Normally you tend to see oriented edges, although this is more difficult to see when you have small filters like 3x3.

    That being said, your weights do not appear to be saturated, but they do indeed seem too random. Did the network converge correctly during training? I am also surprised at how big your filters are (30x30); I'm not sure what you are trying to accomplish with that.
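    As a rough illustration of the first check, here is a minimal sketch for summarizing a weight tensor's value range and spread (`W` stands for any conv layer's weight array; the tolerance is an arbitrary choice of mine):

        import numpy as np

        def weight_stats(W, tol=0.05):
            # Flatten the (num_filters, channels, rows, cols) weight tensor.
            w = np.asarray(W).ravel()
            # Fraction of weights pinned near 0 or +/-1, which would match
            # the saturated, binary-looking filters described above.
            saturated = np.mean((np.abs(w) < tol) |
                                (np.abs(np.abs(w) - 1) < tol))
            print("min=%.3f max=%.3f std=%.3f saturated=%.1f%%"
                  % (w.min(), w.max(), w.std(), 100 * saturated))

    Called on each conv layer's W, this prints a one-line summary; a healthy layer shows a spread of values rather than a mass at the extremes.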