java regex regex-lookarounds regex-group

Regex to get everything but last chars of capture group


How do I use a regex to select everything before the last 4 characters in a capture group?

Example:

String str = "{Index1=StudentData(studentName=Sam, idNumber=321231312), Index2=StudentData(studentName=Adam, idNumber=5675), Index3=StudentData(studentName=Lisa, idNumber=67124)}";
String regex = "(?<=idNumber=)[a-zA-Z1-9]+(?=\\))";

System.out.println(str.replaceAll(regex, "*"));

Current output:

{Index1=StudentData(studentName=Sam, idNumber=*), Index2=StudentData(studentName=Adam, idNumber=*), Index3=StudentData(studentName=Lisa, idNumber=*)}

Desired output:

{Index1=StudentData(studentName=Sam, idNumber=*****1312), Index2=StudentData(studentName=Adam, idNumber=5675), Index3=StudentData(studentName=Lisa, idNumber=*7124)}

Solution

  • You can use this regex in Java:

    (\hidNumber=|(?!^)\G)[a-zA-Z1-9](?=[a-zA-Z1-9]{4,}\))
    

    And replace with $1*.

    Java Code:

    final String re = "(\\hidNumber=|(?!^)\\G)[a-zA-Z1-9](?=[a-zA-Z1-9]{4,}\\))";
    String r = str.replaceAll(re, "$1*");
    

    Breakdown:

    • (: Start capture group #1
      • \h: Match a horizontal whitespace character (such as a space or tab)
      • idNumber=: Match text idNumber=
      • |: OR
      • (?!^)\G: Continue from the end of the previous match; \G on its own would also match at the start of the input, which (?!^) rules out
    • ): Close capture group #1
    • [a-zA-Z1-9]: Match an ASCII letter or digit 1-9
    • (?=[a-zA-Z1-9]{4,}\)): Make sure that ahead of current position we have at least 4 ASCII letters or digits 1-9 followed by )
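
To see the pieces of the breakdown working together, here is a minimal, self-contained sketch that applies the regex to the sample string from the question. The class name MaskIdNumbers and the helper mask are hypothetical names chosen for illustration; the pattern and the replacement string are the ones given above.

```java
import java.util.regex.Pattern;

public class MaskIdNumbers {
    // Pattern from the answer, compiled once so it can be reused.
    private static final Pattern ID_CHARS = Pattern.compile(
            "(\\hidNumber=|(?!^)\\G)[a-zA-Z1-9](?=[a-zA-Z1-9]{4,}\\))");

    // Hypothetical helper: replaces each ID character that still has at least
    // 4 more ID characters between it and the closing ")" with "*".
    // "$1" puts back " idNumber=" on the first match of a run and is empty
    // on the \G-continued matches that follow it.
    static String mask(String input) {
        return ID_CHARS.matcher(input).replaceAll("$1*");
    }

    public static void main(String[] args) {
        String str = "{Index1=StudentData(studentName=Sam, idNumber=321231312), "
                + "Index2=StudentData(studentName=Adam, idNumber=5675), "
                + "Index3=StudentData(studentName=Lisa, idNumber=67124)}";
        System.out.println(mask(str));
    }
}
```

Run against the sample input, this should print the desired output from the question: the 9-character ID becomes *****1312, the 4-character ID 5675 is left alone, and 67124 becomes *7124. Note that the character class [a-zA-Z1-9] does not match the digit 0; if the IDs can contain zeros, [a-zA-Z0-9] is probably the class you want in both places.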