regex, regex-lookarounds, regex-group, regex-greedy, python-textfsm

How to extract comma separated substrings from a string?


I need to parse the comma-separated algorithms from the output below, capturing each one in a group.

SSH Enabled - version 2.0
Authentication methods:publickey,keyboard-interactive,password
Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc
MAC Algorithms:hmac-sha1,hmac-sha1-96
Authentication timeout: 120 secs; Authentication retries: 3
Minimum expected Diffie Hellman key size : 1024 bits
IOS Keys in SECSH format(ssh-rsa, base64 encoded):

I tried splitting them on the commas with the following expression, but it does not give the expected results:

^Encryption Algorithms:(.*?)(?:,|$)

The expected result is each algorithm captured in group 1, with no empty groups:

aes128-ctr
aes192-ctr
aes256-ctr
aes128-cbc
3des-cbc
aes192-cbc
aes256-cbc

Solution

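  • This may not be the best way, but one option is to split the string into three parts, possibly even before passing it to the regex engine. If that is not an option and a single expression is wanted, the one below might be close. Group 1 consumes everything up to and including the Encryption Algorithms: label, group 2 captures each algorithm, and group 3 consumes everything from MAC onward: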

    (.+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC.+)
    



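    If the input also contains newlines, . will not match across them by default, so you may want to test expressions that match any character including newlines, such as one of these equivalent variants: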

    ([\s\S]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\s\S]+)
    

    ([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)
    

    ([\d\D]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\d\D]+)
    


    RegEx

    If this expression is not what you need, it can be modified and tested at regex101.com.


    Test

    # coding=utf8
    # the above tag defines encoding for this document and is for Python 2.x compatibility
    
    import re
    
    regex = r"([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)"
    
    test_str = ("SSH Enabled - version 2.0\n"
        "Authentication methods:publickey,keyboard-interactive,password\n"
        "Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc\n"
        "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
        "Authentication timeout: 120 secs; Authentication retries: 3\n"
        "Minimum expected Diffie Hellman key size : 1024 bits\n"
        "IOS Keys in SECSH format(ssh-rsa, base64 encoded):\n")
    
    subst = "\\2 "
    
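    # count=0 replaces every match; set count to a positive number to limit the replacements.
    # count and flags are passed by keyword because passing them positionally is deprecated in recent Python versions.
    result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
    
    if result:
        print(result)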
    
    # Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
    
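If a list of algorithms is wanted rather than a substituted string, the same alternation can be iterated with re.finditer, keeping only the matches where group 2 took part. This is a minimal sketch, not part of the original answer: the shortened sample text and the variable names are assumptions, and it relies on the MAC line following the Encryption Algorithms line directly, as in the question.

```python
import re

# Same alternation as in the test above; only group 2 captures an algorithm name.
pattern = re.compile(
    r"([\w\W]+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC[\w\W]+)"
)

# Shortened sample input (an assumption for illustration).
sample = (
    "Authentication methods:publickey,keyboard-interactive,password\n"
    "Encryption Algorithms:aes128-ctr,aes192-ctr,3des-cbc\n"
    "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
)

# Groups 1 and 3 only consume the surrounding text, so their matches
# leave group 2 as None and are filtered out here.
algorithms = [m.group(2) for m in pattern.finditer(sample) if m.group(2)]
print(algorithms)
```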

    Demo

    const regex = /(.+Encryption Algorithms:)|([a-z0-9-]+)(?:,|\s)|(MAC.+)/gm;
    const str = `SSH Enabled - version 2.0 Authentication methods:publickey,keyboard-interactive,password Encryption Algorithms:aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc MAC Algorithms:hmac-sha1,hmac-sha1-96 Authentication timeout: 120 secs; Authentication retries: 3 Minimum expected Diffie Hellman key size : 1024 bits IOS Keys in SECSH format(ssh-rsa, base64 encoded):`;
    const subst = `$2 `;
    
    // The substituted value will be contained in the result variable
    const result = str.replace(regex, subst);
    
    console.log('Substitution result: ', result);
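The first suggestion in the answer, splitting the string before leaning on a single expression, could look roughly like the Python sketch below: one regex isolates the Encryption Algorithms line, and str.split separates the values. The function name encryption_algorithms and the shortened sample text are hypothetical, chosen only for illustration.

```python
import re


def encryption_algorithms(text):
    """Return the comma-separated values on the 'Encryption Algorithms:' line.

    Hypothetical helper: returns an empty list when the line is absent.
    """
    # With re.MULTILINE, ^ and $ anchor at line boundaries, and . stops at
    # the newline, so group 1 holds only the rest of that one line.
    match = re.search(r"^Encryption Algorithms:(.*)$", text, flags=re.MULTILINE)
    if match is None:
        return []
    # Split on commas and drop empty entries so no empty item is returned.
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


# Shortened sample input (an assumption for illustration).
sample = (
    "SSH Enabled - version 2.0\n"
    "Encryption Algorithms:aes128-ctr,aes192-ctr,3des-cbc\n"
    "MAC Algorithms:hmac-sha1,hmac-sha1-96\n"
)

print(encryption_algorithms(sample))
```

Splitting in two steps keeps the pattern short and does not depend on which line follows the algorithm list.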