tensorflow keras neural-network keras-layer

Keras: How to create a sparsely connected layer?


I want a neural network in which the nodes of the input layer are connected to only some of the nodes in the hidden layer. At a small scale, it should look similar to this example.

My original problem has 9180 input nodes and 230 hidden nodes (these numbers come from the biological data I am using as input). I know which input node is connected to which hidden node, and this information is stored in a matrix of shape (9180, 230) (1: there is a connection, 0: there is no connection).

Here is a code example of how I create my model:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np

tf.enable_eager_execution()


model = tf.keras.Sequential([
  tf.keras.layers.Dense(2, activation=tf.sigmoid, input_shape=(2,)), 
  tf.keras.layers.Dense(2, activation=tf.sigmoid)
])

mask = np.array([[0, 1], [1, 1]])


#define the loss function
def loss(model, x, y):
  y_ = model(x)
  return tf.losses.mean_squared_error(labels=y, predictions=y_)

#define the gradient calculation
def grad(model, inputs, targets):
  with tf.GradientTape() as tape:
    loss_value = loss(model, inputs, targets)
  return loss_value, tape.gradient(loss_value, model.trainable_variables) 

#create optimizer and global step
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
global_step = tf.train.get_or_create_global_step()


#optimization step (features and labels are the training inputs and targets, not shown here)
loss_value, grads = grad(model, features, labels)
optimizer.apply_gradients(zip(grads, model.trainable_variables), global_step)

I do not want new connections to appear during training, because I need this specific hidden-layer architecture to analyse my biological problem.


Solution

  • You can multiply the layer's weights element-wise by the binary mask you already have. For example, suppose you have 4 inputs and 3 outputs. The weight matrix between these layers then has shape (4, 3), and the mask matrix, which encodes the connections, has the same shape. Multiply the two matrices point-wise and use the result as the layer's weights; every weight whose mask entry is 0 then contributes nothing to the output.

    import numpy as np

    weight = np.array([[0.20472841, 0.16867633, 0.337205  ],
                       [0.05087094, 0.07719579, 0.23244687],
                       [0.86705386, 0.64144604, 0.11517534],
                       [0.57614114, 0.26831522, 0.31417855]])

    # 1 = connection, 0 = no connection
    mask = np.array([[1, 0, 1],
                     [0, 0, 1],
                     [0, 1, 1],
                     [0, 0, 0]])

    new_weight = np.multiply(weight, mask)  # point-wise product
    # new_weight is now:
    # [[0.20472841, 0.        , 0.337205  ],
    #  [0.        , 0.        , 0.23244687],
    #  [0.        , 0.64144604, 0.11517534],
    #  [0.        , 0.        , 0.        ]]
    

    Note: You can use TensorFlow's low-level API to define this structure.
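To keep masked connections from reappearing during training, the mask has to be applied inside the forward pass, not just once to the initial weights. The following is a minimal NumPy sketch of that idea, not a Keras implementation: it reuses the 4x3 mask from above, while the batch `x`, the targets `y`, the learning rate, and the number of steps are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Connectivity mask from the answer above: 4 inputs, 3 hidden units
# (1 = connection, 0 = no connection).
mask = np.array([[1, 0, 1],
                 [0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]], dtype=float)

weight = rng.uniform(size=mask.shape)  # dense trainable weights, shape (4, 3)
x = rng.normal(size=(5, 4))            # hypothetical batch of 5 samples
y = rng.normal(size=(5, 3))            # hypothetical targets


def forward(x, weight, mask):
    """Linear layer that only uses the connections allowed by the mask."""
    return x @ (weight * mask)


def grad_weight(x, y, weight, mask):
    """Gradient of the mean squared error with respect to `weight`."""
    err = forward(x, weight, mask) - y
    grad_effective = 2.0 * x.T @ err / err.size  # gradient w.r.t. weight * mask
    return grad_effective * mask                 # chain rule: masked entries get 0


initial_weight = weight.copy()
initial_loss = np.mean((forward(x, weight, mask) - y) ** 2)

# Plain gradient descent; masked entries of `weight` never receive an update.
for _ in range(100):
    weight -= 0.1 * grad_weight(x, y, weight, mask)

final_loss = np.mean((forward(x, weight, mask) - y) ** 2)
effective_weight = weight * mask  # the weights the layer actually uses
```

Because the mask is part of the forward computation, the gradient for every masked entry is exactly zero, so those connections stay switched off however long training runs. The same pattern should carry over to a custom Keras layer that multiplies its kernel by a constant mask in `call`; a `Dense` kernel has shape (input_dim, units), so a (9180, 230) mask would line up with it directly.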