java string substring

Getting a substring of a string that has a repeating character in Java


I'm writing a parser that extracts the tag and value from each line it reads from a file, and I want to know how to get the value. For the first line below, I want key = "accountName" and value = "fname LName", and the same for each following line.

<accountName>fname LName</accountName>
<accountNumber>12345678912</accountNumber>
<accountOpenedDate>20200218</accountOpenedDate>

This is my code. It runs inside a while loop that reads each line using a BufferedReader. I managed to get the key properly, but when I try to get the value, I get "String index out of range - 12". I'm not sure how to get the value between the two angle brackets > <.

String line;
if(line.startsWith("<"){
    key = line.substring(line.indexOf("<"+1, line.indexOf(">"));
    value = line.substring(line.indexOf(">"+1, line.indexOf("<")+1);
}

Solution

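  • You can use a regular expression to extract both parts, assuming the line variable holds one line read from the file. Group 1 of the match is the tag name and group 2 is the text between the opening and closing tags.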

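        // Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
        // Group 1 captures the opening tag's contents, group 2 the text between the tags.
        String pattern = "<([a-zA-Z]+.*?)>([\\s\\S]*?)</[a-zA-Z]*?>";
        // Compile the pattern
        Pattern r = Pattern.compile(pattern);
        // Create a matcher for the current line
        Matcher m = r.matcher(line);
        // Look for the first match in the line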
        if (m.find()) {
            String key = m.group(1);
            String value = m.group(2);
            System.out.println("Key: " + key);
            System.out.println("Value: " + value);
        } else {
            System.out.println("Invalid");
        }
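
For comparison, here is a minimal sketch of the substring approach the question was attempting, assuming every line has the form <tag>value</tag>. In the question's snippet the +1 sits inside the indexOf call, where "<"+1 is string concatenation that produces the string "<1" rather than index arithmetic. The class name TagLineParser and the helper parseLine are hypothetical names chosen for illustration, and lastIndexOf("</") is one possible way to skip past the repeated < character.

```java
public class TagLineParser {

    // Hypothetical helper: returns {key, value}, or null when the line
    // does not look like <tag>value</tag>.
    static String[] parseLine(String line) {
        String trimmed = line.trim();
        int open = trimmed.indexOf('<');       // start of the opening tag
        int close = trimmed.indexOf('>');      // end of the opening tag
        int end = trimmed.lastIndexOf("</");   // start of the closing tag

        // Reject lines that do not start with a tag or lack a closing tag.
        if (open != 0 || close < 0 || end < close) {
            return null;
        }

        // The +1 is applied to the index, outside the indexOf call.
        String key = trimmed.substring(open + 1, close);
        String value = trimmed.substring(close + 1, end);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        String[] lines = {
            "<accountName>fname LName</accountName>",
            "<accountNumber>12345678912</accountNumber>",
            "<accountOpenedDate>20200218</accountOpenedDate>"
        };
        for (String line : lines) {
            String[] kv = parseLine(line);
            if (kv != null) {
                System.out.println("Key: " + kv[0] + ", Value: " + kv[1]);
            }
        }
    }
}
```

Inside the BufferedReader loop, parseLine(line) could be called on each line in place of the two substring statements. This sketch does not handle attributes, nested tags, or values that span several lines; an XML parser is usually a better fit for those cases.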