tensorflow keras neural-network keras-layer

Keras: How to create a sparsely connected layer?

I want to build a neural network in which each node in the input layer is connected to only some of the nodes in the hidden layer. On a small scale, it should look similar to this: example

My original problem has 9180 input nodes and 230 hidden nodes (these numbers come from the biological data I use as input). I know which input node is connected to which hidden node. This information is stored in a binary matrix of shape (9180, 230), where 1 means there is a connection and 0 means there is none.

Here is a code example of how I create my model:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
import numpy as np

tf.enable_eager_execution()


model = tf.keras.Sequential([
  tf.keras.layers.Dense(2, activation=tf.sigmoid, input_shape=(2,)), 
  tf.keras.layers.Dense(2, activation=tf.sigmoid)
])

mask =np.array([[0, 1],[1,1]])


#define the loss function
def loss(model, x, y):
  y_ = model(x)
  return tf.losses.mean_squared_error(labels=y, predictions=y_)

#define the gradient calculation
def grad(model, inputs, targets):
  with tf.GradientTape() as tape:
    loss_value = loss(model, inputs, targets)
  return loss_value, tape.gradient(loss_value, model.trainable_variables) 

#create optimizer and global step
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
global_step = tf.train.get_or_create_global_step()


#optimization step (features and labels are the training inputs and targets, not shown here)
loss_value, grads = grad(model, features, labels)
optimizer.apply_gradients(zip(grads, model.trainable_variables), global_step)

I do not want new connections to appear during training, because I need this specific hidden-layer architecture to analyse my biological problem.

Solution

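You can multiply the layer's weights by your binary mask. For example, suppose you have 4 inputs and 3 outputs. The weight matrix between these layers then has shape (4, 3), and the mask matrix, which encodes the connections, has the same shape. Multiply the two matrices element-wise and use the result as the layer's weights. Apply the mask in every forward pass, not just once at initialization, so that the masked weights never contribute to the output and receive zero gradient during training.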

import numpy as np

weight = np.array([[0.20472841, 0.16867633, 0.337205  ],
                   [0.05087094, 0.07719579, 0.23244687],
                   [0.86705386, 0.64144604, 0.11517534],
                   [0.57614114, 0.26831522, 0.31417855]])

mask = np.array([[1, 0, 1],
                 [0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]])

# element-wise (point-wise) product: entries where mask is 0 become 0
new_weight = np.multiply(weight, mask)
# new_weight == [[0.20472841, 0.        , 0.337205  ],
#                [0.        , 0.        , 0.23244687],
#                [0.        , 0.64144604, 0.11517534],
#                [0.        , 0.        , 0.        ]]
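
To illustrate why masking inside the forward pass keeps the missing connections from reappearing, here is a minimal NumPy sketch of a single linear layer trained with plain gradient descent. The random data, the learning rate, and the helper names `masked_forward` and `masked_grad` are assumptions made for the illustration and are not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Connection mask from the example above: 4 inputs, 3 outputs
mask = np.array([[1, 0, 1],
                 [0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]], dtype=float)

weight = rng.uniform(size=mask.shape)   # raw (unmasked) trainable weights
initial_weight = weight.copy()

x = rng.normal(size=(5, 4))   # 5 hypothetical samples with 4 inputs each
y = rng.normal(size=(5, 3))   # 5 hypothetical targets with 3 outputs each


def masked_forward(x, weight, mask):
    """Linear layer that only uses the connections allowed by the mask."""
    return x @ (weight * mask)


def masked_grad(x, y, weight, mask):
    """Gradient of the mean squared error with respect to the raw weights."""
    pred = masked_forward(x, weight, mask)
    d_pred = 2.0 * (pred - y) / pred.size
    # The chain rule through (weight * mask) multiplies the gradient by the mask
    return (x.T @ d_pred) * mask


for _ in range(100):
    weight -= 0.1 * masked_grad(x, y, weight, mask)

effective_weight = weight * mask   # the weights the layer actually uses
```

Because the gradient is multiplied by the mask, the raw weights at masked positions keep their initial values and the effective weights there stay exactly zero, so no new connections appear during training.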

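Note: You can use TensorFlow's low-level API to define this structure, for example by writing the masked matrix multiplication yourself instead of using a stock Dense layer.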
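
One possible way to package this is a custom Keras layer that multiplies its kernel by the mask on every call. The sketch below assumes TensorFlow 2.x (eager execution by default), whereas the question uses the 1.x API. The class name `MaskedDense` and the attribute `connection_mask` are hypothetical names chosen for this example, not an existing Keras API.

```python
import numpy as np
import tensorflow as tf


class MaskedDense(tf.keras.layers.Layer):
    """Dense-like layer whose kernel is multiplied by a fixed binary mask."""

    def __init__(self, mask, activation=None, **kwargs):
        super().__init__(**kwargs)
        # mask has shape (n_inputs, n_units); 1 = connection, 0 = no connection
        self.connection_mask = tf.constant(mask, dtype=tf.float32)
        self.units = int(self.connection_mask.shape[1])
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name="kernel",
            shape=(int(input_shape[-1]), self.units),
            initializer="glorot_uniform",
            trainable=True,
        )
        self.bias = self.add_weight(
            name="bias",
            shape=(self.units,),
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs):
        # Masking here means masked weights get zero gradient during training
        masked_kernel = self.kernel * self.connection_mask
        return self.activation(tf.matmul(inputs, masked_kernel) + self.bias)


# Usage with the 4x3 mask from the answer
mask = np.array([[1, 0, 1],
                 [0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]], dtype="float32")

layer = MaskedDense(mask, activation="sigmoid")
x = tf.ones((2, 4))

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(layer(x))

grads = tape.gradient(loss, layer.trainable_weights)
grad_kernel = grads[0].numpy()   # the kernel is the first weight added in build()
```

For the original problem the same layer would be constructed with the (9180, 230) matrix as `mask`. The kernel is still stored densely, so this restricts connectivity but does not save memory.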