Search code examples
regexmasking

Regex replace any occurence of any digit or any character with desired characters


I need help sorting out a solution for a pattern generator.

I have an application that uses exchange formats with different patterns like

00350-ABA-0NZ0:AXYA-11/11/2012 etc.,

that have numeric and alphanumeric data separated by '-','.',":" and '/'. Now what I want to do is convert this to a generic format like nnnnn-ccc-nccn:cccc-nn/nn/nnnn where n is a digit and c is a character.

Any help/suggestions/ideas . . . Thanks CSK.


Solution

  • You can't do conditional replaces in a single regex. You need to do it in two steps (here's a Python example):

    >>> s = "00350-ABA-0NZ0:AXYA-11/11/2012"
    >>> s = re.sub(r"[A-Za-z]", "c", s)
    >>> s
    '00350-ccc-0cc0:cccc-11/11/2012'
    >>> s = re.sub(r"\d", "n", s)
    >>> s
    'nnnnn-ccc-nccn:cccc-nn/nn/nnnn'
    

    And you need to do it this way around - I just saw your solution in your comment, and if you look at it again, you'll see it won't work. Hint: You'll get 'ccccc-ccc-cccc:cccc-cc/cc/cccc' as a result...

    Another solution would be to use a callback function that examines the match and chooses the replacement string accordingly. But that's not pure regex anymore:

    >>> def replace(m):
    ...     return "n" if m.group(0).isdigit() else "c"
    ...
    >>> s = re.sub(r"[A-Za-z0-9]", replace, s)
    >>> s
    'nnnnn-ccc-nccn:cccc-nn/nn/nnnn'