Tags: tensorflow, deep-learning, computer-vision, conv-neural-network, object-detection

How to batch an object detection dataset?


I am implementing a face detection model on the WIDER FACE dataset. I learned it is built into TensorFlow Datasets, so I am using that. However, I am facing an issue while batching the data: since an image can have multiple faces, the number of bounding boxes differs from image to image. For example, an image with 2 faces has 2 bounding boxes, one with 4 faces has 4, and so on.

The problem is that these unequal numbers of bounding boxes cause the dataset's tensors to have different shapes, and as far as I know TensorFlow cannot batch tensors of unequal shapes (source: Tensorflow Datasets: Make batches with different shaped data). So I am unable to batch the dataset.

So after loading the dataset and trying to batch it with the following code -

ds, info = tfds.load('wider_face', split='train', shuffle_files=True, with_info=True)
ds1 = ds.batch(12)
for step, (x, y, z) in enumerate(ds1):
    print(step)
    break

I am getting a shape-mismatch error when I run this (link to error image).

In general, any help on how I can batch TensorFlow object detection datasets would be very helpful.


Solution

  • It might be a bit late, but I thought I should post this anyway. The padded_batch feature ought to do the trick here. It works around the issue by matching dimensions: every feature is padded with zeros up to the largest shape in the batch.

    ds, info = tfds.load('wider_face', split='train', shuffle_files=True, with_info=True)
    ds1 = ds.padded_batch(12)  # pads each feature up to the largest shape in the batch
    for step, batch in enumerate(ds1):
        print(step)
        break
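
    To make the padding concrete, here is a small pure-Python sketch (no TensorFlow required, and the box values are made up for illustration) of what padded_batch effectively does to the variable-length bbox tensors in a batch:

```python
# Sketch of padded_batch's behaviour for variable-length bounding boxes:
# every bbox list in the batch is padded with all-zero rows until it has
# as many rows as the image with the most faces in that batch.

def pad_bbox_batch(bbox_lists):
    """Pad each list of [ymin, xmin, ymax, xmax] boxes with zero rows."""
    max_faces = max(len(boxes) for boxes in bbox_lists)
    padded = []
    for boxes in bbox_lists:
        padding = [[0.0, 0.0, 0.0, 0.0]] * (max_faces - len(boxes))
        padded.append(list(boxes) + padding)
    return padded

# Images with 2, 4 and 1 faces respectively:
batch = [
    [[0.1, 0.1, 0.2, 0.2], [0.3, 0.3, 0.4, 0.4]],
    [[0.1, 0.1, 0.2, 0.2], [0.3, 0.3, 0.4, 0.4],
     [0.5, 0.5, 0.6, 0.6], [0.7, 0.7, 0.8, 0.8]],
    [[0.2, 0.2, 0.5, 0.5]],
]
padded = pad_bbox_batch(batch)
print([len(b) for b in padded])  # -> [4, 4, 4]
```

    After padding, every element of the batch has the same shape, which is exactly what makes stacking into one tensor possible.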
    

    Another solution would be to not use batch at all and instead build batches manually with for loops and custom buffers, though that somewhat defeats the purpose. Just for posterity, I'll add sample code here as a simple workaround.

    ds, info = tfds.load('wider_face', split='train', shuffle_files=True, with_info=True)
    batch_size = 12
    image_annotations_pair = [(x['image'], x['faces']['bbox'])
                              for n, x in enumerate(ds) if n < batch_size]
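
    The snippet above only takes the first batch; to iterate over the whole dataset this way, it can be generalised into a small batching generator. This is a hedged sketch (the `chunked` helper name is my own, and a plain list of dicts stands in for the tfds dataset):

```python
def chunked(examples, batch_size):
    """Yield lists of up to batch_size consecutive examples.

    Works on any iterable, so it can stand in for Dataset.batch when the
    per-example tensors have unequal shapes."""
    batch = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch

# Stand-in dataset of dicts shaped roughly like wider_face examples,
# with a varying number of (all-zero placeholder) boxes per image:
fake_ds = [{'image': f'img{i}', 'faces': {'bbox': [[0.0] * 4] * (i % 3 + 1)}}
           for i in range(7)]
batches = list(chunked(fake_ds, 3))
print([len(b) for b in batches])  # -> [3, 3, 1]
```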
    

    Then use a train_step modified for this.
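
    If the padded_batch route is used instead, the key change in such a train_step is ignoring the zero-padded rows when computing the loss. A real implementation would use tensor ops, but the masking logic can be sketched in plain Python (assuming the convention that an all-zero row is padding, which holds for normalised wider_face boxes):

```python
def valid_box_mask(padded_boxes):
    """Return True for real boxes, False for all-zero padding rows."""
    return [any(coord != 0.0 for coord in box) for box in padded_boxes]

# One real box followed by two padding rows:
boxes = [[0.1, 0.1, 0.2, 0.2],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
print(valid_box_mask(boxes))  # -> [True, False, False]
```

    The mask can then be used to zero out (or drop) the loss contributions of padding rows before reduction.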

    For details, one may refer to https://www.kite.com/python/docs/tensorflow.contrib.autograph.operators.control_flow.dataset_ops.DatasetV2.padded_batch