python string python-3.x auto-generate

String generation based on another string in Python


I want to create a simple string generator, and here is how it should work:

  1. I declare pattern_string = "abcdefghijklmnopqrstuvwxyz".
  2. My starting string is, let's say, starting_string = "qywtx".
  3. Now I want to generate strings as follows:
  4. Check the last character of starting_string against pattern_string.
  5. The last character is x. We find this character in pattern_string:

    abcdefghijklmnopqrstuvw x yz

    and see that the next character is y, so I want the output qywty. ...

However, when I reach z, I want the string to increment its second-to-last character and reset the last character to the first character of pattern_string, so qywtz becomes qywua, and so on...

Now questions:

  • Can I use REGEX to achieve that?

  • Are there any libraries out there that already handle such generation?


Solution

  • The following will generate the next string according to your description.

    def next(s, pat):
      l = len(s)
      for i in range(len(s) - 1, -1, -1):  # find the first non-'z' from the back
        if s[i] != pat[-1]:  # if you find it
          # leave everything before i as is, increment at i, reset rest to all 'a's
          return s[:i] + pat[pat.index(s[i]) + 1] + (l - i - 1) * pat[0]
      else:  # this is only reached for s == 'zzzzz'
        return (l + 1) * pat[0]  # and generates 'aaaaaa'  (just my assumption)
    
    >>> import string
    >>> pattern = string.ascii_lowercase  # 'abcde...xyz'
    >>> s = 'qywtx'
    >>> s = next(s, pattern)  # 'qywty'
    >>> s = next(s, pattern)  # 'qywtz'
    >>> s = next(s, pattern)  # 'qywua'
    >>> s = next(s, pattern)  # 'qywub'
    

    For multiple 'z' at the end:

    >>> s = 'foozz'
    >>> s = next(s, pattern)  # 'fopaa'
    

    For all 'z', start over with 'a' of incremented length:

    >>> s = 'zzz'
    >>> s = next(s, pattern)  # 'aaaa'
    

    To my knowledge there is no library function to do that. One that comes close is itertools.product:

    >>> from itertools import product
    >>> list(map(''.join, product('abc', repeat=3)))
    ['aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa', 
     'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab',
     'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']
    

    But that does not work with an arbitrary start string. The behaviour can be mimicked by combining it with itertools.dropwhile, but that has the serious overhead of generating and discarding every combination before the start string. With an alphabet of 26 characters and a start string towards the end of the sequence, that makes the approach practically useless:

    >>> from itertools import dropwhile
    >>> list(dropwhile(lambda s: s != 'bba', map(''.join, product('abc', repeat=3))))
    ['bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab', 'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']
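
One way to avoid the skipping overhead is to wrap the step function from the beginning of this answer in a generator, so iteration begins directly at the start string. The following is only a minimal sketch: the names successor and strings_from are made up for this illustration (successor repeats the logic of next above under a name that does not shadow the built-in), and it assumes every character of the start string occurs in the pattern.

```python
from itertools import islice


def successor(s, pat):
    """Return the string that follows s, treating pat as the ordered alphabet."""
    # Scan from the back for the first character that is not the last of pat.
    for i in range(len(s) - 1, -1, -1):
        if s[i] != pat[-1]:
            # Keep the prefix, increment position i, reset the tail to pat[0].
            return s[:i] + pat[pat.index(s[i]) + 1] + pat[0] * (len(s) - i - 1)
    # Every character was the last of pat: start over, one character longer
    # (the same assumption as in the answer above).
    return pat[0] * (len(s) + 1)


def strings_from(start, pat):
    """Lazily yield start and then every following string, without end."""
    s = start
    while True:
        yield s
        s = successor(s, pat)


# Take the first five strings beginning at 'bba' over the alphabet 'abc'.
print(list(islice(strings_from('bba', 'abc'), 5)))
# ['bba', 'bbb', 'bbc', 'bca', 'bcb']
```

Because the generator never ends, it has to be bounded by the caller, for example with itertools.islice as shown. Unlike the product-based version, it also continues past the last string of a given length ('ccc' is followed by 'aaaa').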