How do I implement a custom activation function (an RBF kernel with means and variances adjusted by gradient descent) in Neupy or Theano for use in Neupy?
{Quick background: gradient descent works with every parameter in the network. I want to make a specialized feature space whose feature parameters (the RBF mean and standard deviation) Neupy can optimize during training.}
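For reference, this is the computation I am trying to turn into a layer (a plain NumPy sketch for a single sample; rbf_reference is only an illustrative name, and mean/std_dev are the parameters I want gradient descent to adjust):

import numpy as np

def rbf_reference(x, mean, std_dev):
    # exp(-||x - mean|| / std_dev): distance of the input from a
    # learned centre, scaled by a learned width.
    distance = np.linalg.norm(x - mean)
    return np.exp(-distance / std_dev)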
I think my problem is in the creation of the parameters, how they are sized, and how they are all connected.
Primary functions of interest:
class RBF(layers.ActivationLayer):
    def initialize(self):
        super(RBF, self).initialize()
        self.add_parameter(name='mean', shape=(1,),
                           value=init.Normal(), trainable=True)
        self.add_parameter(name='std_dev', shape=(1,),
                           value=init.Normal(), trainable=True)

    def output(self, input_value):
        return rbf(input_value, self.parameters)


def rbf(input_value, parameters):
    K = _outer_substract(input_value, parameters['mean'])
    return np.exp(- np.linalg.norm(K) / parameters['std_dev'])


def _outer_substract(x, y):
    return (x - y.T).T
Help will be much appreciated, as this will provide great insight into how to customize neupy networks. The documentation could use some work in some areas, to say the least...
When a layer changes the shape of the input variable, it has to inform the subsequent layers about the change. For this case it must have a customized output_shape property. For example:
import numpy as np
from neupy import layers
from neupy.utils import as_tuple
import theano.tensor as T


class Flatten(layers.BaseLayer):
    """
    Slight modification of the Reshape layer from the neupy library:
    https://github.com/itdxer/neupy/blob/master/neupy/layers/reshape.py
    """
    @property
    def output_shape(self):
        # The number of output features depends on the input shape.
        # When the layer receives input with shape (10, 3, 4),
        # the output will be (10, 12). The first number, 10, is the
        # number of samples, which you typically don't need to
        # change during propagation.
        n_output_features = np.prod(self.input_shape)
        return (n_output_features,)

    def output(self, input_value):
        n_samples = input_value.shape[0]
        return T.reshape(input_value, as_tuple(n_samples, self.output_shape))
If you run it in the terminal, you will see that it works:
>>> network = layers.Input((3, 4)) > Flatten()
>>> predict = network.compile()
>>> predict(np.random.random((10, 3, 4))).shape
(10, 12)
In your example I can see a few issues:
- The rbf function doesn't return a Theano expression, so it should fail during the function compilation.
- np.linalg.norm will return a scalar if you don't specify the axis along which you want to calculate the norm.

The following solution should work for you:
import numpy as np
from neupy import layers, init
import theano.tensor as T


def norm(value, axis=None):
    return T.sqrt(T.sum(T.square(value), axis=axis))


class RBF(layers.BaseLayer):
    def initialize(self):
        super(RBF, self).initialize()

        # It's more flexible when the shape of the parameters
        # depends on the input shape
        self.add_parameter(
            name='mean', shape=self.input_shape,
            value=init.Constant(0.), trainable=True)

        self.add_parameter(
            name='std_dev', shape=self.input_shape,
            value=init.Constant(1.), trainable=True)

    def output(self, input_value):
        K = input_value - self.mean
        return T.exp(-norm(K, axis=0) / self.std_dev)
network = layers.Input(1) > RBF()
predict = network.compile()
print(predict(np.random.random((10, 1))))
network = layers.Input(4) > RBF()
predict = network.compile()
print(predict(np.random.random((10, 4))))
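If you want to sanity-check the compiled Theano expression, you can compare it against the same formula written directly in plain NumPy, reusing the predict function compiled for the 4-dimensional input above. The check simply mirrors the layer's defaults (mean initialised to 0, std_dev initialised to 1); x and expected are names used only for this comparison:

x = np.random.random((10, 4))

# Same computation as RBF.output, but in NumPy:
# exp(-||x - mean|| / std_dev) with the norm taken along axis 0.
expected = np.exp(-np.linalg.norm(x - 0., axis=0) / 1.)
print(np.allclose(predict(x), expected))  # should print True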