Tags: tensorflow, bounding-box, object-detection-api, tfrecord

How to export TFRecords from pts label files for tensorflow object detection api?


Usually, when we generate TFRecords from XML label files (from labelImg, for example), each label has xmin, xmax, ymin, and ymax values, which describe an axis-aligned rectangular box. We can write those values to a CSV file and generate the TFRecords from it.
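For reference, the XML-to-CSV step described above can be sketched roughly as follows. This is only an illustration, assuming the Pascal VOC style layout that labelImg normally writes (`filename`, `size`, and one `bndbox` per `object`). The function name `voc_xml_to_rows` and the sample annotation values are made up for the example; the column order follows the script in the answer.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_rows(xml_text):
    """Return one [filename, width, height, class, xmin, ymin, xmax, ymax] row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


# Hypothetical annotation, only to show the expected structure.
SAMPLE_XML = """
<annotation>
  <filename>fish_001.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>fish</name>
    <bndbox><xmin>51</xmin><ymin>92</ymin><xmax>534</xmax><ymax>388</ymax></bndbox>
  </object>
</annotation>
"""

for row in voc_xml_to_rows(SAMPLE_XML):
    print(row)
```

Each returned row can then be written out with `csv.writer`, one line per labeled object.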

But in the case of .pts files, the bounding box is not axis-aligned and is given as four corner points, e.g.:

bounding_box: 534.588998862 232.095176337; 101.596234357 388.45367463; 51.3295676906 249.25367463; 484.322332196 92.8951763367

So there are four (x, y) points, not just the two corners that labelImg gives. Can someone explain how to generate TFRecords from .pts files?


Solution

  • In case anyone else has the same question: I wrote a script that converts those four points into an axis-aligned box with xmin, xmax, ymin, and ymax, so the TFRecords can be generated the same way as from labelImg XML files.

    Here it is:

    import csv
    import glob
    import os
    import re

    from PIL import Image

    # Matches plain integers and decimals, e.g. "534.588998862".
    NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

    for pts_file in glob.glob("./labels/*.pts"):
        # "./labels/foo.pts" -> "foo.jpg"
        filename = os.path.splitext(os.path.basename(pts_file))[0] + ".jpg"
        with Image.open(os.path.join("./img", filename)) as im:
            width, height = im.size

        # Assumes the box is on the first line, in the format shown in the question:
        # bounding_box: x1 y1; x2 y2; x3 y3; x4 y4
        with open(pts_file) as f:
            coords = f.readline().split(":", 1)[-1]
        values = [float(v) for v in NUMBER.findall(coords)]
        xs, ys = values[0::2], values[1::2]

        # Axis-aligned box that encloses all four corners (truncated to integers).
        xmin, xmax = int(min(xs)), int(max(xs))
        ymin, ymax = int(min(ys)), int(max(ys))

        fields = [filename, width, height, "fish", xmin, ymin, xmax, ymax]
        # 'name' is the output CSV path; rows are appended, one per .pts file.
        with open(r'name', 'a', newline='') as out:
            csv.writer(out).writerow(fields)

    print('Successfully converted pts to csv.')
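
To check the conversion on a single label without touching any files, the min/max step can be pulled out into a small function. This is a minimal sketch, assuming the `bounding_box:` line looks like the one in the question; the name `pts_line_to_bbox` is just illustrative and not part of any library.

```python
import re

# Matches plain integers and decimals, e.g. "534.588998862".
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def pts_line_to_bbox(line):
    """Return (xmin, ymin, xmax, ymax) enclosing the four corner points."""
    coords = line.split(":", 1)[-1]  # drop the "bounding_box:" prefix
    values = [float(v) for v in NUMBER.findall(coords)]
    if len(values) != 8:
        raise ValueError("expected 4 (x, y) points, got %d numbers" % len(values))
    xs, ys = values[0::2], values[1::2]
    # int() truncates toward zero, matching the script above.
    return int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))


line = ("bounding_box: 534.588998862 232.095176337; 101.596234357 388.45367463; "
        "51.3295676906 249.25367463; 484.322332196 92.8951763367")
print(pts_line_to_bbox(line))  # (51, 92, 534, 388)
```

Note that the enclosing box is larger than the rotated one, so it will include some background around the object.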