python python-3.x python-3.6 python-re

I want to extract the length and breadth from a given string in Python


I am building a system that calculates the area of each printed image so that bills can be generated from it.

I have strings like:

"Canvas 36.5 X 48 piece-10"

"wallpaper 3"X27" " (sometimes we use " to refer to inches)

"Banner 49x87 -10"

"14 Vinyl 38 x 9.7"

"wallpaper 3ftX2Ft PC-1"

and so on.

I want to extract the floats and integers from the given data in Python so that I can calculate the area for each item.

For example, from the 1st string I want to get 36.5 as the length, 48 as the breadth and 10 as the piece count, and likewise for the others.

So far I am using

findall(r"[-+]?\d*\.\d+|\d+",myStr)

to get all the integers and floats, and I take the first two values as the length and breadth. But for "14 Vinyl 38 x 9.7" the length should be 38 and the breadth 9.7, whereas this approach returns 14 and 38. That is correct given how the pattern works, but I want to pick the length and breadth based on the x between them, since that is what actually marks them as dimensions.


Solution

  • Another approach that looks like it should also work as expected:

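    import re


    def l_b_dims_piece(s):
        """Return a tuple of (length, breadth, dims, piece).

        dims is the unit found next to the numbers ('' if none). Matched values
        are returned as strings; piece defaults to the integer 1.
        """
        # First alternative: "<length><unit> x <breadth><unit>", where the unit
        # is at most two non-digit characters and x is matched case-insensitively.
        # Second alternative: any other integer, taken as the piece count.
        result = re.findall(r'([\d.]+)\s?(\D{0,2})\s?x\s?([\d.]+)\s?\2?|(\d+)', s, re.IGNORECASE)

        piece = 1
        length = breadth = 0
        dims = None

        for l_, dims_, b_, piece_ in result:
            if piece_ != '':
                # A lone integer; if there are several, the last one wins.
                piece = piece_
            else:
                # A double quote is used as the inch mark, so rename it to 'in'.
                length, breadth, dims = l_, b_, dims_.replace('"', 'in', 1)

        return length, breadth, dims, piece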
    

    Regex playground for testing: Link
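
Since the goal in the question is an area for billing, the matched strings still have to be turned into numbers. Below is a minimal sketch of that step. It reuses the regex from the answer unchanged; `parse_job` and `total_area` are hypothetical names that do not appear in the original answer. It assumes every matched number is well formed (something like "1.2.3" would make `float()` raise a `ValueError`) and that the length and breadth share one unit.

```python
import re

# Same pattern as in the answer: "<length><unit> x <breadth><unit>" or a lone integer.
PATTERN = re.compile(r'([\d.]+)\s?(\D{0,2})\s?x\s?([\d.]+)\s?\2?|(\d+)', re.IGNORECASE)


def parse_job(s):
    """Hypothetical helper: return (length, breadth, unit, piece) as numbers plus a unit string."""
    length = breadth = 0.0
    unit = ''
    piece = 1
    for l_, unit_, b_, piece_ in PATTERN.findall(s):
        if piece_:
            # A lone integer is taken as the piece count (the last one wins).
            piece = int(piece_)
        else:
            # Assumption: both numbers are well formed, otherwise float() raises ValueError.
            length, breadth = float(l_), float(b_)
            # Treat the inch mark as 'in' and lower-case units such as 'Ft'.
            unit = unit_.replace('"', 'in', 1).lower()
    return length, breadth, unit, piece


def total_area(s):
    """Hypothetical helper: area of one piece times the piece count, in squared input units."""
    length, breadth, _unit, piece = parse_job(s)
    return length * breadth * piece


if __name__ == '__main__':
    samples = [
        'Canvas 36.5 X 48 piece-10',
        'wallpaper 3"X27" ',
        'Banner 49x87 -10',
        '14 Vinyl 38 x 9.7',
        'wallpaper 3ftX2Ft PC-1',
    ]
    for sample in samples:
        print(sample, '->', parse_job(sample), 'area:', total_area(sample))
```

Note that in "14 Vinyl 38 x 9.7" the leading 14 ends up as the piece count, because any integer that is not part of the "a x b" pair is treated that way. If such a number can mean something else in your data, the second alternative of the regex would need to be made stricter.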