
How to create a sub-tensor from a given tensor by selecting windows around some values of this tensor?


My question is similar to the one asked here. The difference is that I would like a new tensor B that is the concatenation of selected windows from the initial tensor A. The goal is to do this with tensors that are not known in advance, i.e. Input layers. Here is an example that uses constants just to illustrate what I want:

Given 2 input tensors of 3-dim embeddings:

A = K.constant([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5], [6, 6, 6], [7, 7, 7], [2, 2, 2], [8, 8, 8], [9, 9, 9], [10, 10, 10]])
t = K.constant([[2, 2, 2], [6, 6, 6], [10, 10, 10]])

I would like to create a tensor B that is the concatenation of the following sub-tensors (or windows) selected from A, where each window is the neighbourhood of one occurrence in A of an element of t:

# windows of 3 elements, each window is a neighbourhood of a corresponding element in t
window_t_1 = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # 1st neighbourhood of [2, 2, 2] 
window_t_2 = [[7, 7, 7], [2, 2, 2], [8, 8, 8]]  # 2nd neighbourhood of [2, 2, 2] (because it has 2 occurrences in A)
window_t_3 = [[5, 5, 5], [6, 6, 6], [7, 7, 7]]  # unique neighbourhood of [6, 6, 6]
window_t_4 = [[8, 8, 8], [9, 9, 9], [10, 10, 10]]  # unique neighbourhood of [10, 10, 10]
# B must contain these selected windows:
B = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [7, 7, 7], [2, 2, 2], [8, 8, 8], [5, 5, 5], [6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9], [10, 10, 10]]
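
The selection rule in this example can be pinned down with a plain-Python reference that does not depend on TensorFlow. This is only a sketch: `select_windows` and its `width` parameter are hypothetical names, and it assumes that a window touching either end of A is shifted inward (as in window_t_4) and that A has at least `width` rows.

```python
def select_windows(A, t, width=3):
    """Concatenate the windows of A centred on each row that matches a row of t.

    Windows are grouped by the order of the rows in t, then by position in A.
    A window that would run past either end of A is shifted inward.
    Assumes len(A) >= width.
    """
    n = len(A)
    half = width // 2
    B = []
    for target in t:
        for i, row in enumerate(A):
            if row == target:
                # Clamp the start so the whole window stays inside A.
                start = min(max(i - half, 0), n - width)
                B.extend(list(r) for r in A[start:start + width])
    return B


A = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5], [6, 6, 6],
     [7, 7, 7], [2, 2, 2], [8, 8, 8], [9, 9, 9], [10, 10, 10]]
t = [[2, 2, 2], [6, 6, 6], [10, 10, 10]]
B = select_windows(A, t)
```

With these inputs the function reproduces the B listed above, including the two windows for [2, 2, 2] and the shifted window for [10, 10, 10].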

The goal is to apply this process to the Input tensors of my model rather than to predefined constants. So, how can I do this given the two inputs of my model:

in_A = Input(shape=(10,), dtype="int32")
in_t = Input(shape=(3,), dtype="int32")
embed_A = Embedding(...)(in_A)
embed_t = Embedding(...)(in_t)
B = ...  # some function or layer to create the tensor B as described in the example above using embed_A and embed_t
# B will be used then on the next layer like this:
# next_layer = some_other_layer(...)(embed_t, B)

Alternatively, select the sub-tensor elements first and then apply the embedding layer:

in_A = Input(shape=(10,), dtype="int32")
in_t = Input(shape=(3,), dtype="int32")
B = ...  # some function to select the desired element windows as described above
embed_B = Embedding(...)(B)
embed_t = Embedding(...)(in_t)
# then add the next layer like this:
# next_layer = some_other_layer(...)(embed_t, embed_B)

Thanks in advance.


Solution

    import tensorflow as tf
    from tensorflow.contrib import autograph  # TensorFlow 1.x only; tf.contrib was removed in 2.x
    # Optional: uncomment the next line to enable eager execution and inspect
    # each intermediate result. An up-to-date tf-nightly build is recommended.
    # tf.enable_eager_execution()
    A = tf.constant([[1, 1, 1],
                     [2, 2, 2],
                     [3, 3, 3],
                     [4, 4, 4],
                     [5, 5, 5],
                     [6, 6, 6],
                     [7, 7, 7],
                     [2, 2, 2],
                     [8, 8, 8],
                     [9, 9, 9],
                     [10, 10, 10]])
    
    t = tf.constant([[2, 2, 2],
                     [6, 6, 6],
                     [10, 10, 10]])
    
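    # add an axis to A so that broadcasting compares every row of A with every row of t
    expanded_a = tf.expand_dims(A, axis=1)
    
    # element-wise comparison, shape (len(A), len(t), 3)
    equal = tf.equal(expanded_a, t)
    # a row matches only if all of its components are equal, shape (len(A), len(t))
    reduce_all = tf.reduce_all(equal, axis=2)
    # [row in A, row in t] index pairs of the matches, in row-major order
    # (i.e. sorted by position in A)
    where = tf.where(reduce_all)
    where = tf.cast(where, dtype=tf.int32)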
    
    # Helper that builds the indices for tf.gather. If the match is the first
    # or last row of A, take it together with the two rows after or before it;
    # otherwise take the row before it, the match itself, and the row after it.
    @autograph.convert()
    def _map_fn(x):
        if x[0] == 0:
            return tf.range(x[0], x[0] + 3)
        elif x[0] == tf.shape(A)[0] - 1:
            return tf.range(x[0] - 2, x[0] + 1)
        else:
            return tf.range(x[0] - 1, x[0] + 2)
    
    
    indices = tf.map_fn(_map_fn, where, dtype=tf.int32)
    
    # reshape the found indices to a vector
    reshape = tf.reshape(indices, [-1])
    
    # gather output with found indices
    output = tf.gather(A, reshape)
    

    Once you understand this code, wrapping it in a custom layer is straightforward.
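
To check the index arithmetic without building a graph, the same steps can be mirrored in NumPy. This is a stand-in sketch, not the TensorFlow code above: it assumes NumPy is installed and replaces the three-branch `_map_fn` with a single `np.clip`, which gives the same window starts when A has at least three rows. The names `match`, `centers`, `starts` and `centers_by_t` are illustrative only.

```python
import numpy as np

A = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5], [6, 6, 6],
              [7, 7, 7], [2, 2, 2], [8, 8, 8], [9, 9, 9], [10, 10, 10]])
t = np.array([[2, 2, 2], [6, 6, 6], [10, 10, 10]])

# Compare every row of A with every row of t; result has shape (len(A), len(t)).
match = (A[:, None, :] == t[None, :, :]).all(axis=2)

# Row indices of A that match some row of t, ordered by position in A.
centers = np.nonzero(match)[0]


def gather_windows(centers):
    # Start one row before each match, clipped so the window stays inside A.
    starts = np.clip(centers - 1, 0, len(A) - 3)
    # Expand each start into three consecutive indices, flatten, and gather.
    indices = (starts[:, None] + np.arange(3)).reshape(-1)
    return A[indices]


output = gather_windows(centers)

# Scanning the transposed mask instead groups the matches by row of t,
# which is the ordering used for B in the question.
centers_by_t = np.nonzero(match.T)[1]
output_by_t = gather_windows(centers_by_t)
```

Note the ordering: scanning the match mask row by row yields the windows in order of their position in A (matches at rows 1, 5, 7 and 10), whereas B in the question groups them by element of t. Scanning the transposed mask, as in `centers_by_t`, gives the grouped order.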