python  pandas  computer-vision  precision  object-detection

True positive, false positive, false negative calculation with a pandas DataFrame in Python


I have trained an object detection network. My ground truth annotations are in a CSV with the columns filename, height, width, xmin, ymin, xmax, ymax, class. When I score an input image, I get an output CSV with the columns filename, xmin, ymin, xmax, ymax, class, confidence.

I need to merge these data frames based on IoU. Basically, the output dataframe should contain:

  1. The ground truth and its corresponding prediction, along with the IoU value, if a match is found
  2. The ground truth values and NaN prediction values if no IoU match was found for a ground truth box
  3. NaN ground truth values and the prediction values if no IoU match was found for a prediction

This will be an intermediate step for calculating precision and recall values.
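To make the goal concrete, here is a rough sketch of how the three cases could later be counted as TP, FP and FN. The column names (`xmin` for the ground truth side, `xmin_pred` for the prediction side) and the function name `precision_recall` are assumptions made for illustration only; any column that is NaN exactly when that side is missing would work.

```python
import numpy as np
import pandas as pd

def precision_recall(merged):
    """Return (precision, recall) from a merged ground truth / prediction frame.

    Assumed layout (hypothetical): `xmin` is NaN when a row has no ground truth,
    `xmin_pred` is NaN when a row has no prediction.
    """
    has_gt = merged["xmin"].notna()
    has_pred = merged["xmin_pred"].notna()

    tp = int((has_gt & has_pred).sum())    # case 1: ground truth matched to a prediction
    fn = int((has_gt & ~has_pred).sum())   # case 2: ground truth without a prediction
    fp = int((~has_gt & has_pred).sum())   # case 3: prediction without a ground truth

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall

# Toy frame with one row per case: a match, a missed ground truth, a spurious prediction
merged = pd.DataFrame({
    "xmin": [3598, 3916, np.nan],
    "xmin_pred": [3599, np.nan, 4060],
})
precision, recall = precision_recall(merged)
```

This treats any matched row as a true positive; an IoU threshold or class check would have to be applied before counting if that is needed.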

I am just adding a very small sample dataframe as an example here, which tests for these conditions.

sample prediction:

        filename  xmin  ymin  xmax  ymax class  confidence
0  dummyfile.jpg  4060  2060  4214  2242    DR    0.999985
1  dummyfile.jpg  3599  1282  3732  1456    DR    0.999900

sample ground truth:

        filename  width  height class  xmin  xmax  ymin  ymax
0  dummyfile.jpg   7201    5400    DR  3598  3728  1279  1451
1  dummyfile.jpg   7201    5400    DR  3916  4038  2186  2274

Expected final output:

[image: final output]

I am adding my current approach as an answer. Is there any better way to achieve this? The data can be pretty large.


Solution

  • This is one approach I found:

    1. Define the IoU function:
    import pandas as pd
    
    def IOU(row):
        '''Calculate the IoU between the ground truth box and the predicted box in one dataframe row.'''
        # coordinates of the intersection rectangle
        xmin_inter = max(row.xmin, row.xmin_pred)
        ymin_inter = max(row.ymin, row.ymin_pred)
        xmax_inter = min(row.xmax, row.xmax_pred)
        ymax_inter = min(row.ymax, row.ymax_pred)
    
        # area of the intersection rectangle (0 if the boxes do not overlap)
        inter_area = max(0, xmax_inter - xmin_inter + 1) * max(0, ymax_inter - ymin_inter + 1)
    
        # areas of the ground truth and predicted boxes
        actual_area = (row.xmax - row.xmin + 1) * (row.ymax - row.ymin + 1)
        pred_area = (row.xmax_pred - row.xmin_pred + 1) * (row.ymax_pred - row.ymin_pred + 1)
    
        # intersection over union
        iou = inter_area / float(actual_area + pred_area - inter_area)
    
        return iou
    
    2. Read the ground truth and prediction CSVs:
    ground_truth = pd.read_csv("sample_gt.csv")
    prediction = pd.read_csv("sample_preds.csv")
    
    # rename the prediction columns with a _pred suffix (all except the join key 'filename')
    pred_cols = [col for col in prediction.columns if col != 'filename']
    prediction = prediction.rename(columns={col: col + '_pred' for col in pred_cols})
    
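    3. Outer join ground truth and prediction, then compute the IoU:
    # outer join on filename: pairs every prediction with every ground truth box of the same image
    newdf = pd.merge(prediction, ground_truth, how='outer', on='filename')
    
    # compute the IoU for every candidate pair
    newdf['iou'] = newdf.apply(IOU, axis=1)
    # keep only overlapping pairs (iou > 0); rows with a NaN iou are dropped as well
    newdf = newdf[newdf['iou'] > 0]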
    

    [image: IoU match]

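    4. Add back the unmatched rows from ground truth and prediction:
    # left join keeps every prediction; unmatched ones get NaN ground truth columns
    final_df = pd.merge(prediction, newdf, on=prediction.columns.tolist(), how='left')
    # outer join adds every ground truth box that has no matching prediction
    final_df = pd.merge(final_df, ground_truth, on=ground_truth.columns.tolist(), how='outer')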
    

    [image: final output]
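Since the data can be pretty large, the row-wise `apply` is probably the slowest part, because it calls a Python function once per candidate pair. Below is a minimal sketch of a column-wise version. It assumes the same column names as the merged frame (`xmin`, `xmin_pred`, and so on) and the same +1 pixel convention; `iou_vectorized` is just an illustrative name, and the frames are built inline from the sample data instead of being read from CSV.

```python
import numpy as np
import pandas as pd

def iou_vectorized(df):
    """Column-wise IoU for a frame holding ground truth and *_pred box columns."""
    # width and height of the intersection rectangle, clipped at 0 for non-overlapping boxes
    inter_w = (np.minimum(df["xmax"], df["xmax_pred"])
               - np.maximum(df["xmin"], df["xmin_pred"]) + 1).clip(lower=0)
    inter_h = (np.minimum(df["ymax"], df["ymax_pred"])
               - np.maximum(df["ymin"], df["ymin_pred"]) + 1).clip(lower=0)
    inter_area = inter_w * inter_h

    actual_area = (df["xmax"] - df["xmin"] + 1) * (df["ymax"] - df["ymin"] + 1)
    pred_area = (df["xmax_pred"] - df["xmin_pred"] + 1) * (df["ymax_pred"] - df["ymin_pred"] + 1)
    return inter_area / (actual_area + pred_area - inter_area)

# Sample data from the question, with the prediction columns already suffixed
ground_truth = pd.DataFrame({
    "filename": ["dummyfile.jpg"] * 2,
    "width": [7201] * 2, "height": [5400] * 2, "class": ["DR"] * 2,
    "xmin": [3598, 3916], "xmax": [3728, 4038],
    "ymin": [1279, 2186], "ymax": [1451, 2274],
})
prediction = pd.DataFrame({
    "filename": ["dummyfile.jpg"] * 2,
    "xmin_pred": [4060, 3599], "ymin_pred": [2060, 1282],
    "xmax_pred": [4214, 3732], "ymax_pred": [2242, 1456],
    "class_pred": ["DR", "DR"], "confidence_pred": [0.999985, 0.999900],
})

# Same candidate pairing as step 3, but the IoU is computed on whole columns at once
pairs = prediction.merge(ground_truth, how="outer", on="filename")
pairs["iou"] = iou_vectorized(pairs)
matches = pairs[pairs["iou"] > 0]
```

On the sample data this leaves a single matching pair with an IoU of roughly 0.92. Note that the outer join on filename still creates every prediction/ground truth combination per image, so memory use grows with the product of boxes per image; only the IoU computation itself becomes faster.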