
What does the "[0][0]" of the layers connected to in keras model.summary mean?


As is depicted in the following table, what does the [0][0] in input_1[0][0] mean?

___________________________________________________________________
Layer (type)          Output Shape  Param # Connected to           
===================================================================
input_1 (InputLayer)  (None, 1)     0                              
___________________________________________________________________
dropout_1 (Dropout)   (None, 1)     0       input_1[0][0]          
___________________________________________________________________
dropout_2 (Dropout)   (None, 1)     0       input_1[0][0]          
===================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
___________________________________________________________________

Solution

  • That's a good question; however, to answer it we must dive into the internals of how layers are connected to each other in Keras. So let's start:

    0) What is a tensor?

    A tensor is a data structure that represents data; essentially, tensors are n-dimensional arrays. All the data and information passed between layers must be tensors.
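    For instance, here is a minimal sketch using the Keras backend to build a rank-2 tensor and inspect it (the values are illustrative):

    from keras import backend as K

    # An n-dimensional array: here a 2 x 2 matrix, i.e. a tensor of rank 2.
    t = K.constant([[1., 2.], [3., 4.]])

    print(K.ndim(t))       # 2 -> number of dimensions (the tensor's rank)
    print(K.int_shape(t))  # (2, 2) -> the size along each dimension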

    1) What is a layer?

    In the simplest sense, a layer is a computation unit: it takes one or more input tensors, applies a set of operations (e.g. multiplication, addition, etc.) to them, and gives the result as one or more output tensors. When you apply a layer to some input tensors, under the hood a Node is created.
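    We can watch this happen with a minimal sketch (assuming Keras 2.x, where the relevant property is the internal attribute _inbound_nodes, so it may change between versions):

    from keras.layers import Input, Dense

    inp = Input((5,))
    dense = Dense(3)

    print(len(dense._inbound_nodes))  # 0 -> the layer has not been applied yet

    out = dense(inp)                  # applying the layer creates a Node

    print(len(dense._inbound_nodes))  # 1 -> one node records this connection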

    2) So what is a Node?

    To represent the connectivity between two layers, Keras internally uses an object of the Node class. When a layer is applied to some new input, a node is created and added to the _inbound_nodes property of that layer. Further, when the output of a layer is used by another layer, a new node is created and added to the _outbound_nodes property of that layer. So essentially, this data structure lets Keras find out how layers are connected to each other, using the following properties of an object of type Node (illustrated in the sketch after this list):

    • input_tensors: a list containing the input tensors of the node.
    • output_tensors: a list containing the output tensors of the node.
    • inbound_layers: a list containing the layers the input_tensors come from.
    • outbound_layer: the consumer layer, i.e. the layer that takes the input_tensors and turns them into the output_tensors.
    • node_indices: a list of integers containing the node indices of the input_tensors (explained further in the answer to the next question).
    • tensor_indices: a list of integers containing the indices of the input_tensors within the outputs of their corresponding inbound layers (explained further in the answer to the next question).
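    Continuing the sketch above, we can inspect these properties directly (again, Node is internal to Keras 2.x, so this is illustrative only):

    from keras.layers import Input, Dense

    inp = Input((5,))
    dense = Dense(3)
    out = dense(inp)

    node = dense._inbound_nodes[0]
    print(node.input_tensors)   # [inp]  -> the tensors fed to the layer
    print(node.output_tensors)  # [out]  -> the tensors the layer produced
    print(node.inbound_layers)  # [the InputLayer that produced inp]
    print(node.outbound_layer)  # the dense layer itself
    print(node.node_indices)    # [0]
    print(node.tensor_indices)  # [0]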

    3) Fine! So what do those values in the "Connected to" column of the model summary mean?

    To better understand this, let's create a simple model. First, let's create two input layers:

    from keras.layers import Input, Lambda, Dense
    from keras.models import Model

    inp1 = Input((10,))
    inp2 = Input((20,))
    

    Next, we create a Lambda layer that has two output tensors: the first output is the input tensor divided by 2, and the second output is the input tensor multiplied by 2:

    lmb_layer = Lambda(lambda x: [x/2, x*2])
    

    Let's apply this lambda layer to inp1 and inp2:

    a1, b1 = lmb_layer(inp1)
    a2, b2 = lmb_layer(inp2)
    

    After doing this, two nodes have been created and added to the _inbound_nodes property of lmb_layer:

    >>> lmb_layer._inbound_nodes
    [<keras.engine.base_layer.Node at 0x7efb9a105588>,
     <keras.engine.base_layer.Node at 0x7efb9a105f60>]
    

    The first node corresponds to the connectivity of lmb_layer with the first input layer (inp1), and the second node corresponds to the connectivity of this layer with the second input layer (inp2). Further, each of those nodes has two output tensors (corresponding to a1, b1 and a2, b2, respectively):

    >>> lmb_layer._inbound_nodes[0].output_tensors
    [<tf.Tensor 'lambda_1/truediv:0' shape=(?, 10) dtype=float32>,
     <tf.Tensor 'lambda_1/mul:0' shape=(?, 10) dtype=float32>]
    
    >>> lmb_layer._inbound_nodes[1].output_tensors
    [<tf.Tensor 'lambda_1_1/truediv:0' shape=(?, 20) dtype=float32>,
     <tf.Tensor 'lambda_1_1/mul:0' shape=(?, 20) dtype=float32>]
    

    Now, let's create four different Dense layers and apply them to the four output tensors we obtained:

    d1 = Dense(10)(a1)
    d2 = Dense(20)(b1)
    d3 = Dense(30)(a2)
    d4 = Dense(40)(b2)
    
    model = Model(inputs=[inp1, inp2], outputs=[d1, d2, d3, d4])
    model.summary()
    

    The model summary would look like this:

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_1 (InputLayer)            (None, 10)           0                                            
    __________________________________________________________________________________________________
    input_2 (InputLayer)            (None, 20)           0                                            
    __________________________________________________________________________________________________
    lambda_1 (Lambda)               multiple             0           input_1[0][0]                    
                                                                     input_2[0][0]                    
    __________________________________________________________________________________________________
    dense_1 (Dense)                 (None, 10)           110         lambda_1[0][0]                   
    __________________________________________________________________________________________________
    dense_2 (Dense)                 (None, 20)           220         lambda_1[0][1]                   
    __________________________________________________________________________________________________
    dense_3 (Dense)                 (None, 30)           630         lambda_1[1][0]                   
    __________________________________________________________________________________________________
    dense_4 (Dense)                 (None, 40)           840         lambda_1[1][1]                   
    ==================================================================================================
    Total params: 1,800
    Trainable params: 1,800
    Non-trainable params: 0
    __________________________________________________________________________________________________
    

    In the "Connected to" column for a layer the values have a format of: layer_name[x][y]. The layer_name corresponds to the layer where the input tensors of the this layer comes from. For example, all the Dense layers are connected to lmb_layer and therefore get their inputs from this layer. The [x][y] corresponds to the node index (i.e. node_indices) and tensor index (i.e. tensor_indices) of the the input tensors, respectively. For example:

    • The dense_1 layer is applied to a1, which is the first (i.e. index: 0) output tensor of the first (i.e. index: 0) inbound node of lmb_layer; therefore the connectivity is displayed as lambda_1[0][0].

    • The dense_2 layer is applied to b1, which is the second (i.e. index: 1) output tensor of the first (i.e. index: 0) inbound node of lmb_layer; therefore the connectivity is displayed as lambda_1[0][1].

    • The dense_3 layer is applied to a2, which is the first (i.e. index: 0) output tensor of the second (i.e. index: 1) inbound node of lmb_layer; therefore the connectivity is displayed as lambda_1[1][0].

    • The dense_4 layer is applied to b2, which is the second (i.e. index: 1) output tensor of the second (i.e. index: 1) inbound node of lmb_layer; therefore the connectivity is displayed as lambda_1[1][1].
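    As a check, here is a minimal sketch that reads these indices off the model built above (the layer names such as dense_1 assume a fresh session, and _inbound_nodes is a Keras 2.x internal):

    d1_node = model.get_layer('dense_1')._inbound_nodes[0]
    print(d1_node.inbound_layers[0].name)  # lambda_1
    print(d1_node.node_indices)            # [0] -> the x in lambda_1[0][0]
    print(d1_node.tensor_indices)          # [0] -> the y in lambda_1[0][0]

    d4_node = model.get_layer('dense_4')._inbound_nodes[0]
    print(d4_node.node_indices)            # [1] -> the x in lambda_1[1][1]
    print(d4_node.tensor_indices)          # [1] -> the y in lambda_1[1][1]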


    That's it! If you want to know more about how the summary method works, you can take a look at the print_summary function. And if you want to find out how the connections are printed, you can take a look at the print_layer_summary_with_connections function.
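    For instance, model.summary() delegates to that utility, so you can call it directly to adjust the layout or redirect the output (assuming Keras 2.x, where it is exposed as keras.utils.print_summary):

    from keras.utils import print_summary

    # Wider lines and an explicit sink for the output (print_fn could be
    # a logger instead of the built-in print).
    print_summary(model, line_length=100, print_fn=print)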