What is the TensorFlow equivalent of NumPy tuple/array indexing?


Question

What is the TensorFlow equivalent of NumPy tuple/array indexing for selecting non-contiguous indices? With NumPy, multiple rows or columns can be selected with a tuple or an array.

import numpy as np

a = np.arange(12).reshape(3,4)
print(a)
print(a[
    (0,2),   # select rows 0 and 2
    1        # select col 1
])
---
[[ 0  1  2  3]    # a[0][1] -> 1
 [ 4  5  6  7]
 [ 8  9 10 11]]   # a[2][1] -> 9

[1 9]

I looked at Multi-axis indexing, but there seems to be no equivalent. It only says:

Higher rank tensors are indexed by passing multiple indices.

Using a tuple or array as the index raises TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got (0, 2, 5).

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

training_data = np.array([["This is the 1st sample."], ["And here's the 2nd sample."]])
vectorizer = TextVectorization(output_mode="int")
vectorizer.adapt(training_data)

word_indices = vectorizer(training_data)
word_indices = tf.cast(word_indices, dtype=tf.int8)

print(f"word vocabulary:{vectorizer.get_vocabulary()}\n")
print(f"word indices:\n{word_indices}\n")
index_to_word = tf.reshape(tf.constant(vectorizer.get_vocabulary()), (-1, 1))
print(f"index_to_word:\n{index_to_word}\n")

# NumPy tuple indexing
words = index_to_word   # the lookups below refer to this tensor as `words`
print(f"indices to words:{words.numpy()[(0,2,5),::]}")

# What is TF equivalent indexing?
print(f"indices to words:{words[(0,2,5),::]}")   # <--- cannot use tuple/array indexing

Result:

word vocabulary:['', '[UNK]', 'the', 'sample', 'this', 'is', 'heres', 'and', '2nd', '1st']

word indices:
[[4 5 2 9 3]
 [7 6 2 8 3]]

index_to_word:
[[b'']
 [b'[UNK]']
 [b'the']
 [b'sample']
 [b'this']
 [b'is']
 [b'heres']
 [b'and']
 [b'2nd']
 [b'1st']]

indices to words:[[b'']
 [b'the']
 [b'is']]

TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got (0, 2, 5)

Which indexing operations does TensorFlow provide for selecting multiple non-consecutive indices?


Solution

  • You can use tf.gather, which selects the listed indices along axis 0 by default.

    >>> tf.gather(words,[0,2,5])
    <tf.Tensor: shape=(3, 1), dtype=string, numpy=
    array([[b''],
           [b'the'],
           [b'is']], dtype=object)>
    

    Read more in the guide: Introduction to tensor slicing
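
For the first example in the question (`a[(0,2), 1]`), here is a minimal sketch of how the same selection might be written in TensorFlow. It assumes TensorFlow 2.x in eager mode. The variable names (`rows`, `col1`, `pairs`, `cols`) are only illustrative, and `tf.gather_nd` and the `axis` argument go beyond what the answer above shows.

```python
import numpy as np
import tensorflow as tf

a_np = np.arange(12).reshape(3, 4)
a = tf.constant(a_np)

# NumPy reference: rows 0 and 2 of column 1
expected = a_np[(0, 2), 1]

# Option 1: gather the rows, then pick the column with an ordinary integer index
rows = tf.gather(a, [0, 2])      # shape (2, 4)
col1 = rows[:, 1]                # shape (2,)

# Option 2: spell out every (row, col) pair and fetch them with gather_nd
pairs = tf.gather_nd(a, [[0, 1], [2, 1]])   # shape (2,)

# Non-contiguous columns instead of rows: pass axis=1
cols = tf.gather(a, [0, 3], axis=1)         # shape (3, 2)

print(expected)       # NumPy result
print(col1.numpy())   # same values via tf.gather + slicing
print(pairs.numpy())  # same values via tf.gather_nd
print(cols.numpy())   # columns 0 and 3 of every row
```

tf.gather is probably the closest match when a whole axis is indexed by a list, as in `words[(0,2,5), ::]`. tf.gather_nd is the more general tool when each selected element needs its own coordinates.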