Search code examples
python-3.xregexregex-lookarounds

How to get demangled function name using regex


I have list of demangled-function names like _Z6__comp7StudentS_ _Z4SortiSt6vectorI7StudentSaIS0_EE. I read wiki and found out that it follows some sort of defined structure. _Z is mangled Symbol followed by a number and then the function name of that length.

So I wanted to retrieve that function name using regex. I only come close to _Z(?:\d)(?<function_name>[a-z_A-Z]){\1}. But referring \1 won't work because its string, right? Is there a single regex pattern solution to this.


Solution

  • You can use 2 capture groups, and get the part of the string using the position of capture group 2

    import re
    
    pattern = r"_Z(\d+)([a-z_A-Z]+)"
    s = "_Z4SortiSt6vectorI7StudentSaIS0_EE"
    m = re.search(pattern, s)
    
    if m:
        print(m.group(2)[0: int(m.group(1))])
    

    Output

    Sort
    

    Using _Z6__comp7StudentS_ will return __comp