java regex split tokenize

Regex in Java for the string


I am very new to Java regular expressions. I do not want to use Java's split with delimiters and then pick out the individual tokens; I don't feel it's a neat way. I have the following string:

"Some String lang:c,cpp,java file:build.java"

I want to break this up into three parts:

Part 1 containing "Some String"
Part 2 containing "c,cpp,java"
Part 3 containing "build.java"

The lang: and file: parts can be placed anywhere, and they are optional.


Solution

  • The lang: and file: parts can be placed anywhere, and they are optional.

    Try the following expressions to get the language list and the file:

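    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String input = "Some String lang:c,cpp,java file:build.java";
    String langExpression = "lang:([\\w,]*)";
    // Backslashes must be doubled inside a Java string literal.
    String fileExpression = "file:([\\w.]*)";

    Pattern langPattern = Pattern.compile(langExpression);
    Matcher langMatcher = langPattern.matcher(input);
    // find() searches anywhere in the input; matches() would require
    // the whole string to match the pattern.
    if (langMatcher.find()) {
      String languageList = langMatcher.group(1);  // "c,cpp,java"
    }

    Pattern filePattern = Pattern.compile(fileExpression);
    Matcher fileMatcher = filePattern.matcher(input);
    if (fileMatcher.find()) {
      String filename = fileMatcher.group(1);  // "build.java"
    }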
    

    This should work with lang:xxx file:xxx as well as file:xxx lang:xxx, as long as neither the language list nor the filename contains whitespace. It would also work if lang: and/or file: is missing.

    Would you also expect a string like this: file:build.java Some String lang:c,cpp,java?
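
    If the answer to that is yes, one possible way to also recover the free text (the first part asked for in the question) is to strip the lang: and file: tokens out of the input and keep what is left. The following is only a sketch under a few assumptions: the class name QueryParser and the helpers extract and freeText are made up for illustration, the \b word boundary is an addition so that a word such as "profile:" is not mistaken for file:, and the language list and filename are still assumed to contain no whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the two patterns mirror the ones in the answer.
class QueryParser {
    static final Pattern LANG = Pattern.compile("\\blang:([\\w,]*)");
    static final Pattern FILE = Pattern.compile("\\bfile:([\\w.]*)");

    /** Returns group 1 of the first match, or null if the key is absent. */
    static String extract(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    /** Removes the lang:/file: tokens and collapses the whitespace left behind. */
    static String freeText(String input) {
        String rest = LANG.matcher(input).replaceAll(" ");
        rest = FILE.matcher(rest).replaceAll(" ");
        return rest.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String input = "file:build.java Some String lang:c,cpp,java";
        System.out.println(freeText(input));       // Some String
        System.out.println(extract(LANG, input));  // c,cpp,java
        System.out.println(extract(FILE, input));  // build.java
    }
}
```

    Because each key is searched for independently, the order of the tokens does not matter, and a missing key simply yields null instead of a value.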