Tags: python, python-3.x, string, urlparse

Manipulate url to return length of string/number after special characters


Given a URL, I want to get the number of characters after each special character, written with the suffix `s` for non-digit characters and `d` for digit characters. For example, for a URL like this:

url="https://imag9045.pexels1.pre45p.com/photos/414612/pexels-photo-414612.jpeg"

I want the output to be: '4s.4d.6s.1d.4s.2d.com/6s/6d/6s-5s-6d.'

The code I have below only generates the desired result before the domain (before '.com'). I am having issues generating the rest.

How can I manipulate it to get the desired output (`'4s.4d.6s.1d.4s.2d.com/6s/6d/6s-5s-6d.'`)?

Solution

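  • You will need to loop over every character, keeping running counts of letters and digits and flushing them whenever a punctuation character appears, as in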

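    import string

    def mysplit(path):
        # s counts non-digit characters, d counts digits in the current segment
        s = d = 0
        out = ''
        for c in path:
            if c in string.punctuation:
                # A separator ends the segment: emit any pending counts, then the separator
                if s:
                    out += f'{s}s'
                    s = 0
                if d:
                    out += f'{d}d'
                    d = 0
                out += c
            elif c in string.digits:
                d += 1
            else:
                s += 1
        # Emit the counts for the last segment, which has no trailing separator
        if s:
            out += f'{s}s'
        if d:
            out += f'{d}d'
        return out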
    
    >>> mysplit('/photos/414612/pexels-photo-414612.jpeg')
    '/6s/6d/6s-5s-6d.4s'
    

    The same function can also be applied to the first part of the URL (scheme and host), although it does not treat the top-level domain specially, so `com` is encoded as `3s`:

    >>> mysplit('https://imag9045.pexels1.pre45p.com')
    '5s://4s4d.6s1d.4s2d.3s'