Change URL in Python


How can I change the activeOffset value in this URL? I am using Python and a while loop.

https://www.dieversicherer.de/versicherer/auto---reise/typklassenabfrage#activeOffset=10&orderBy=kh&orderDirection=ASC

It should first be 10, then 20, then 30, and so on.

I tried urlparse, but I don't understand how to increase just the number.

Thanks!


Solution

  • If this is a fixed URL, you can write activeOffset={} in the URL and then use str.format to replace {} with each number:

    url = "https://www.dieversicherer.de/versicherer/auto---reise/typklassenabfrage#activeOffset={}&orderBy=kh&orderDirection=ASC"
    
    # Offsets 10, 20, ..., 90 (the stop value 100 is excluded)
    for offset in range(10, 100, 10):
      print(url.format(offset))
    

    If you cannot modify the URL (because you get it as input from some other part of your program), you can use a regular expression to replace occurrences of activeOffset=... with the required number:

    import re
    
    url = "https://www.dieversicherer.de/versicherer/auto---reise/typklassenabfrage#activeOffset=10&orderBy=kh&orderDirection=ASC"
    
    query = "activeOffset="
    pattern = re.compile(query + r"\d+")  # \d+ matches one or more digits
    
    for offset in range(10, 100, 10):
      # Replace occurrences of pattern with the modified query
      print(pattern.sub(query + str(offset), url))
    

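    If you want to use urlparse, note that everything after the # is the URL fragment, not the query string, so parts.query is empty here. You can apply the previous approach to the fragment returned by urlparse: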

    import re
    
    from urllib.parse import urlparse, urlunparse
    
    url = "https://www.dieversicherer.de/versicherer/auto---reise/typklassenabfrage#activeOffset=10&orderBy=kh&orderDirection=ASC"
    
    query = "activeOffset="
    pattern = re.compile(query + r"\d+")  # \d+ matches one or more digits
    
    parts = urlparse(url)
    
    for offset in range(10, 100, 10):
      fragment_modified = pattern.sub(query + str(offset), parts.fragment)
      parts_modified = parts._replace(fragment=fragment_modified)
      url_modified = urlunparse(parts_modified)
      print(url_modified)
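
The urlparse approach above can also be done without regular expressions. This is a minimal sketch, assuming the fragment is formatted like a query string (key=value pairs joined by &), as it is in the URL from the question. The helper name set_fragment_param is hypothetical. It uses a while loop, since that is what the question asked for. Note that urlencode re-encodes the values, so a fragment containing special characters may come back in a slightly different percent-encoded form.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def set_fragment_param(url, name, value):
    """Return url with the fragment parameter `name` set to `value`.

    Hypothetical helper: assumes the fragment looks like a query string.
    """
    parts = urlparse(url)
    # parse_qsl returns (key, value) pairs in their original order
    pairs = parse_qsl(parts.fragment, keep_blank_values=True)
    updated = [(k, str(value) if k == name else v) for k, v in pairs]
    # Append the parameter if it was not present in the fragment
    if all(k != name for k, _ in pairs):
        updated.append((name, str(value)))
    return urlunparse(parts._replace(fragment=urlencode(updated)))


url = "https://www.dieversicherer.de/versicherer/auto---reise/typklassenabfrage#activeOffset=10&orderBy=kh&orderDirection=ASC"

offset = 10
while offset < 100:
    print(set_fragment_param(url, "activeOffset", offset))
    offset += 10
```

Compared with the regular expression, this matches the parameter name exactly, so a different key that merely ends in activeOffset would be left untouched.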