python, artificial-intelligence, captcha, yolov5

Generate possible solutions to image captcha from a list of each class with confidence


I have a YOLOv5 model that uses object detection to solve captchas. It returns a list with one entry per detection, in the format:

<object-class> <x> <y> <width> <height> <confidence>

There is one such entry for every detected class.

Example:

Input:

1

Output:

[[
['7l', 0.19, '0.443182', '0.104895', '0.431818', '0.972055'], 
['4l', 0.33, '0.534091', '0.104895', '0.431818', '0.965045'], 
['5l', 0.6, '0.238636', '0.118881', '0.431818', '0.974508'], 
['9l', 0.92, '0.659091', '0.104895', '0.409091', '0.879532'], 
['0l', 0.93, '0.659091', '0.0979021', '0.363636', '0.651053']
]]

As you can see, the classes 9l and 0l have the same x value, meaning the model has two answers for one object.

How can I split this list into two possible lists like:

7l 4l 5l 9l and 7l 4l 5l 0l


Solution

  • First, convert the data to a dictionary that groups the classes by their x value:

    {'0.443182': ['7l'], '0.534091': ['4l'], '0.238636': ['5l'], '0.659091': ['9l', '0l']}
    

    Next, take only the values:

    [['7l'], ['4l'], ['5l'], ['9l', '0l']]
    

    Finally, use itertools.product(['7l'], ['4l'], ['5l'], ['9l', '0l']) to generate every combination:

    ('7l', '4l', '5l', '9l')
    ('7l', '4l', '5l', '0l')
    

    Full working code.

    A standard dictionary is only guaranteed to keep insertion order since Python 3.7, so I use OrderedDict() to stay safe on older versions.

    import collections
    import itertools
    
    data = [[
    ['7l', 0.19, '0.443182', '0.104895', '0.431818', '0.972055'], 
    ['4l', 0.33, '0.534091', '0.104895', '0.431818', '0.965045'], 
    ['5l', 0.6, '0.238636', '0.118881', '0.431818', '0.974508'], 
    ['9l', 0.92, '0.659091', '0.104895', '0.409091', '0.879532'], 
    ['0l', 0.93, '0.659091', '0.0979021', '0.363636', '0.651053']
    ]]
    
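    # OrderedDict keeps the groups in detection order on any Python version
    # (a plain dict, `converted = {}`, also works on Python 3.7+)
    converted = collections.OrderedDict()
    
    for item in data[0]:
        class_ = item[0]
        x      = item[2]  # value shared by duplicate detections of one object
        # setdefault() is shorthand for:
        #     if x not in converted:
        #         converted[x] = []
        #     converted[x].append(class_)
        converted.setdefault(x, []).append(class_)
    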
    print('converted:', converted)
    
    values = list(converted.values())
    
    print('values:', values)
    
    products = list(itertools.product(*values))
    print('products:', products)
    
    for item in products:
        print('item:', "".join(item))
    

    Result:

    converted: OrderedDict([('0.443182', ['7l']), ('0.534091', ['4l']), ('0.238636', ['5l']), ('0.659091', ['9l', '0l'])])
    values: [['7l'], ['4l'], ['5l'], ['9l', '0l']]
    products: [('7l', '4l', '5l', '9l'), ('7l', '4l', '5l', '0l')]
    item: 7l4l5l9l
    item: 7l4l5l0l
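
The approach above groups detections only when the position string matches exactly, and it does not use the confidence column. As a rough sketch of one possible extension (not part of the original answer), the function below groups detections whose position differs by at most a small tolerance and ranks each candidate solution by the product of its confidences. The name `ranked_solutions`, the parameters `pos_index`, `conf_index` and `tol`, and the tolerance value 0.01 are all assumptions made for illustration. `pos_index` defaults to 2 because that is the column the answer groups on. `math.prod` requires Python 3.8 or newer.

```python
import itertools
import math


def ranked_solutions(detections, pos_index=2, conf_index=5, tol=0.01):
    """Return (solution, score) pairs, most confident first.

    Hypothetical helper: detections whose position values differ by at
    most `tol` are treated as competing answers for the same object.
    """
    groups = []  # each entry: (position, [(class, confidence), ...])
    for det in detections:
        pos = float(det[pos_index])
        conf = float(det[conf_index])
        for group_pos, members in groups:
            if abs(group_pos - pos) <= tol:
                members.append((det[0], conf))
                break
        else:
            # no existing group is close enough, so start a new one
            groups.append((pos, [(det[0], conf)]))

    results = []
    for combo in itertools.product(*(members for _, members in groups)):
        text = "".join(cls for cls, _ in combo)
        # assumed scoring rule: multiply the confidences of the chosen classes
        score = math.prod(conf for _, conf in combo)
        results.append((text, score))

    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


data = [[
    ['7l', 0.19, '0.443182', '0.104895', '0.431818', '0.972055'],
    ['4l', 0.33, '0.534091', '0.104895', '0.431818', '0.965045'],
    ['5l', 0.6, '0.238636', '0.118881', '0.431818', '0.974508'],
    ['9l', 0.92, '0.659091', '0.104895', '0.409091', '0.879532'],
    ['0l', 0.93, '0.659091', '0.0979021', '0.363636', '0.651053'],
]]

for text, score in ranked_solutions(data[0]):
    print(text, round(score, 3))
```

With the sample data this lists 7l4l5l9l before 7l4l5l0l, because 9l has the higher confidence of the two competing detections. Groups keep the order in which detections first appear, as in the original code.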