
How to tie weights to train multiple copies of the same model


I need to implement the model below.

I have to apply a function A and a function B to a sequence of data. Function A is implemented as a neural network, and its output is the input to function B (not a neural network, but implemented in Keras using the Model functional API); the loss function is calculated at the output of function B.

The input is a length-L vector of complex numbers. I need to feed this into L copies of the same network (Sequential) in parallel. Each network takes the real and imaginary parts of one element and outputs m real numbers.

So all L networks together output m*L real numbers. Function B takes these m*L real numbers as input and calculates the output.

This is roughly what I have planned:

model_inputs = keras.layers.Input(shape=(L,))

function_A_model = Sequential()
function_A_model.add(Dense(32, activation='relu'))
function_A_model.add(Dense(m))  # Output layer

function_A_inputs = [layers.Input(shape=(2,)) for i in range(L)]

function_A_outputs = []
for i in range(L):
    function_A_outputs.append(function_A_model(function_A_inputs[i]))

function_B_outputs = function_B(function_A_outputs)

I want to implement this as one larger Model using the Model functional API, which takes model_inputs as above and outputs function_B_outputs.

My problems are:

  1. I need to divide the model_inputs Input vector into L Input vectors of shape (2,). How can I accomplish this in layers? Or is it OK to have a vector of inputs?

  2. How do I implement function A as L copies of the same network (weights are tied)?

  3. How do I merge the m*L outputs into one output vector so I can input it to function B?


Solution

  • My problems are: I need to divide the model_inputs Input vector into L Input vectors of shape (2,). How can I accomplish this in layers?

    You can define a Lambda layer that slices the input. For instance:

    from keras.layers import Input, Lambda

    example = Input(shape=(20,))
    slice_0_4 = Lambda(lambda x: x[:, :4])(example)
    slice_4_16 = Lambda(lambda x: x[:, 4:16])(example)
    slice_16_20 = Lambda(lambda x: x[:, 16:])(example)
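
    As a minimal sketch of how this could apply to your case (assuming the complex input is flattened into a real vector of length 2*L, with the real and imaginary parts of element i at positions 2*i and 2*i+1; L = 4 is just a placeholder):

    from keras.layers import Input, Lambda

    L = 4  # placeholder sequence length
    model_inputs = Input(shape=(2 * L,))  # interleaved real/imag parts
    # i=i pins the loop variable; otherwise every lambda would see the final i
    slices = [Lambda(lambda x, i=i: x[:, 2 * i:2 * i + 2])(model_inputs)
              for i in range(L)]  # L tensors of shape (batch, 2)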
    

    Or is it OK to have a vector of inputs?

    You can have a tensor of any shape you want, including a shape of (N, M).

    For instance, if you declare a model like this:

    from keras.layers import Input, Dense

    x = Input(shape=(20, 10))
    h1 = Dense(8)(x)
    

    The Dense layer above is applied to all 20 time steps of the input, and Dense layers share weights across time steps: the same weight matrix is multiplied across all batches and all time steps, performing a 10x8 matrix multiplication at each step.
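
    Applied to your problem, this means you could avoid slicing altogether: reshape the input to (L, 2) and let the Dense layers of function A run over all L elements with a single shared weight matrix. A minimal sketch under that assumption (L and m are placeholders):

    from keras.layers import Input, Reshape, Dense, Flatten
    from keras.models import Model

    L, m = 4, 3  # placeholders for illustration
    inputs = Input(shape=(2 * L,))
    pairs = Reshape((L, 2))(inputs)               # (batch, L, 2)
    hidden = Dense(32, activation='relu')(pairs)  # shared across all L elements
    features = Dense(m)(hidden)                   # (batch, L, m)
    flat = Flatten()(features)                    # (batch, L*m), ready for function B
    model = Model(inputs, flat)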

    How do I implement function A as L copies of the same network (weights are tied)?

    I'm not sure I follow your question. You can have the top-level model split its input and call a sub-model on each slice; since you call the same Model instance every time, all of its weights are shared (tied) across the L calls (see the sketch below). Alternatively, you can have a model that performs the same set of operations on a tensor with an extra dimension, as in the Dense example above.
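
    A minimal sketch of the first option (function_A_model, L, and m are placeholders; the slicing Lambda is the one shown earlier). Because the single function_A_model instance is called on every slice, all L applications share one set of weights:

    from keras.layers import Input, Dense, Lambda
    from keras.models import Sequential

    L, m = 4, 3  # placeholders for illustration

    # One network, instantiated once; every call below reuses its weights.
    function_A_model = Sequential([
        Dense(32, activation='relu', input_shape=(2,)),
        Dense(m),
    ])

    model_inputs = Input(shape=(2 * L,))
    slices = [Lambda(lambda x, i=i: x[:, 2 * i:2 * i + 2])(model_inputs)
              for i in range(L)]
    function_A_outputs = [function_A_model(s) for s in slices]  # tied weights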

    How do I merge the m*L outputs into one output vector so I can input it to function B?

    You can use the keras Concatenate layer.
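
    Continuing the sketch above, concatenating the L outputs of shape (batch, m) gives one (batch, L*m) tensor to feed into function B. Here Dense(1) is only a stand-in for whatever functional-API graph implements your actual function B:

    from keras.layers import Concatenate, Dense
    from keras.models import Model

    merged = Concatenate(axis=-1)(function_A_outputs)  # (batch, L*m)
    function_B = Dense(1)  # stand-in for your function B
    function_B_outputs = function_B(merged)
    full_model = Model(model_inputs, function_B_outputs)
    full_model.summary()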