
How to use Pattern, Matcher in Java regex API to remove a specific line


I have a complicated string to split. I need to remove the comments and spaces, keep all the numbers, and break every other string into single characters. If a - sign is at the start and is followed by a number, it should be treated as a negative number rather than an operator.

The comment has the form ?<space>comments<space>? (where "comments" is a placeholder).

Input:

-122+2 ? comments ?sa b
-122+2 ? blabla ?sa b

Output:

 ["-122","+","2","?","s","a","b"]  

(every string is split into characters, with no spaces and no comments)


Solution

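    1. Replace the unwanted text matched by \s*\?\s*\w+\s*(?=\?) with "", then chain another String#replaceAll to remove any remaining whitespace. Here \s matches a whitespace character, \w matches a word character, and (?=\?) is a positive lookahead: the match must be followed by a ?, but that ? is not consumed. The closing ? of the comment therefore stays in the string, which is why it appears in the expected output. Note that \w+ does not match spaces, so this pattern assumes the comment is a single word.
    2. Then use the regex ^-\d+|\d+|\D, which matches either a negative integer at the start of the string (^-\d+), or a run of digits (\d+), or a single non-digit character (\D). Because of the ^ anchor, a - anywhere else in the string is matched by \D and treated as an operator.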

    Demo:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class Main {
        public static void main(String[] args) {
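            String str = "-122+2 ? comments ?sa b";
            // Drop the comment (its closing ? is kept), then strip all remaining whitespace.
            str = str.replaceAll("\\s*\\?\\s*\\w+\\s*(?=\\?)", "").replaceAll("\\s+", "");
    
            // A leading negative integer, or a run of digits, or any single non-digit.
            Pattern pattern = Pattern.compile("^-\\d+|\\d+|\\D");
            Matcher matcher = pattern.matcher(str);
            while (matcher.find()) {
                System.out.println(matcher.group());
            }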
        }
    }
    

    Output:

    -122
    +
    2
    ?
    s
    a
    b
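
The question asks for the tokens as a list rather than printed lines, so the same two steps could be wrapped in a helper along the following lines. This is only a sketch: the class name TokenizerSketch and the method tokenize are made up for illustration, and it keeps the assumption that the comment is a single word. The second call shows that a - which is not at the start of the string comes out as an operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizerSketch {
    // The same two regexes as in the demo, compiled once so they can be reused.
    private static final Pattern COMMENT = Pattern.compile("\\s*\\?\\s*\\w+\\s*(?=\\?)");
    private static final Pattern TOKEN = Pattern.compile("^-\\d+|\\d+|\\D");

    // Hypothetical helper: removes the comment and whitespace, then collects the tokens.
    static List<String> tokenize(String input) {
        String cleaned = COMMENT.matcher(input).replaceAll("").replaceAll("\\s+", "");
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(cleaned);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-122+2 ? comments ?sa b")); // [-122, +, 2, ?, s, a, b]
        System.out.println(tokenize("5-3"));                     // [5, -, 3]
    }
}
```

Compiling the patterns once as constants avoids recompiling them on every call, which String#replaceAll would otherwise do internally.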