Can't convert non-rectangular Python sequence to Tensor with tf.data.Dataset.from_tensor_slices


I want to create a TensorFlow dataset using tf.data.Dataset.from_tensor_slices, but I get this error: Can't convert non-rectangular Python sequence to Tensor.

To simplify the issue, here is a minimal example that resembles my data:

import tensorflow as tf

data = ['A', 'B']

label = [['a1', 'a2', 'a3'], ['b1', 'b2', 'b3', 'b4']]

dataset = tf.data.Dataset.from_tensor_slices((data, label))

The error occurs because the inner lists in label have different lengths: len(['a1', 'a2', 'a3']) is 3, while len(['b1', 'b2', 'b3', 'b4']) is 4. I want to keep the data as it is, without padding. I tried tf.ragged.constant and other solutions from this site, but they did not work for me.


Solution

  • Wrapping label in tf.ragged.constant before passing it to from_tensor_slices should work:

    dataset = tf.data.Dataset.from_tensor_slices((data , tf.ragged.constant(label)))
    
    # Use a separate loop variable so the original `data` list is not overwritten.
    for element in dataset.as_numpy_iterator():
        print(element)
    
    # Output:
    (b'A', array([b'a1', b'a2', b'a3'], dtype=object))
    (b'B', array([b'b1', b'b2', b'b3', b'b4'], dtype=object))
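
To check that no padding was introduced, you can compare the row lengths of the ragged tensor with the length of each label in the dataset. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution; the names ragged_label, row_lengths and element_lengths are only illustrative and not part of the original question.

```python
import tensorflow as tf

data = ['A', 'B']
label = [['a1', 'a2', 'a3'], ['b1', 'b2', 'b3', 'b4']]

# tf.ragged.constant accepts rows of different lengths,
# which a plain nested list converted to a dense tensor does not.
ragged_label = tf.ragged.constant(label)

# Length of each row in the ragged tensor.
row_lengths = ragged_label.row_lengths().numpy().tolist()

dataset = tf.data.Dataset.from_tensor_slices((data, ragged_label))

# Each dataset element carries a label tensor with its own length,
# so the rows are kept as they are rather than padded to a common size.
element_lengths = [int(y.shape[0]) for _, y in dataset]

print(row_lengths)
print(element_lengths)
```

If both lists show the original lengths of the inner lists, the labels went through from_tensor_slices unchanged.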