How to capture certain text within a StringBuffer in Java


I have a StringBuffer object with the following contents:

<ET>read input: 1.629ms</ET>
<ET>There were 3 errors:
<Error>
    <ErrorId>AllConditionsTrue</ErrorId>
    <MetaData>
        <Entry>
            <Key>Balance Due</Key>
            <Value>1500.99</Value>
        </Entry>
    </MetaData>
</Error>

<Error>
    <ErrorId>Opposite</ErrorId>
    <MetaData>
        <Entry>
            <Key>Node</Key>
        </Entry>
    </MetaData>
</Error>

<Error>
    <ErrorId>minInclusive</ErrorId>
    <MetaData>
        <Entry>
            <Key>Description</Key>
            <Value>Wages Amount</Value>
        </Entry>
    </MetaData>
</Error>

: 0.027ms</ET>
<ET>convert: 319.414ms</ET>
<FORM id="123"/>
<DATA size="11920"/>
<ERROR code="0"/>

How can I capture just the Error tags and the text within them (<Error> some text </Error>), so that my new String or StringBuffer object contains:

<Error>
    <ErrorId>AllConditionsTrue</ErrorId>
    <MetaData>
        <Entry>
            <Key>Balance Due</Key>
            <Value>1500.99</Value>
        </Entry>
    </MetaData>
</Error>

<Error>
    <ErrorId>Opposite</ErrorId>
    <MetaData>
        <Entry>
            <Key>Node</Key>
        </Entry>
    </MetaData>
</Error>

<Error>
    <ErrorId>minInclusive</ErrorId>
    <MetaData>
        <Entry>
            <Key>Description</Key>
            <Value>Wages Amount</Value>
        </Entry>
    </MetaData>
</Error>

How can I accomplish my goal using Java?

Edit

Trying both of your solutions:

Pattern p = Pattern.compile("<Error>.*?<\\/Error>", Pattern.DOTALL);
Matcher m = p.matcher(buf.toString());

String errorText = "";

while (m.find()) {
    errorText = m.group(1);
}

I seem to get only the 3rd Error element, not all 3.

Example:

<Error>
    <ErrorId>minInclusive</ErrorId>
    <MetaData>
        <Entry>
            <Key>Description</Key>
            <Value>Wages Amount</Value>
        </Entry>
    </MetaData>
</Error>

Solution

  • Note that your string contains newlines, and by default . does not match a line terminator, so the pattern has to account for \n explicitly. Try this out:

    <Error>((?:.*?\n?)+.*?)<\/Error>
    

    You can check it on Regex101.
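
The loop in the question's edit assigns to errorText on every find(), so only the last match is left when the loop ends. It also calls m.group(1), which needs a capturing group that the pattern in the edit does not have. Below is a minimal sketch of one way to keep every block. It is not the regex from the answer above: it assumes Pattern.DOTALL (as in the question's edit) with a reluctant .*? and reads the whole match with group(). The names ErrorExtractor and extractErrors are made up for illustration, and the sample input is a shortened stand-in for the real buffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class ErrorExtractor {
    // DOTALL lets '.' match line terminators; the reluctant '.*?' stops
    // at the nearest </Error>, so each block is matched separately.
    private static final Pattern ERROR_BLOCK =
            Pattern.compile("<Error>.*?</Error>", Pattern.DOTALL);

    // Returns every <Error>...</Error> block, tags included, in order.
    static List<String> extractErrors(CharSequence input) {
        List<String> blocks = new ArrayList<>();
        Matcher m = ERROR_BLOCK.matcher(input);
        while (m.find()) {
            // group() is the whole match; append instead of overwriting.
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened sample input, assumed for the demo.
        StringBuffer buf = new StringBuffer()
                .append("<ET>There were 2 errors:\n")
                .append("<Error>\n    <ErrorId>AllConditionsTrue</ErrorId>\n</Error>\n\n")
                .append("<Error>\n    <ErrorId>Opposite</ErrorId>\n</Error>\n\n")
                .append(": 0.027ms</ET>\n")
                .append("<ERROR code=\"0\"/>\n");

        // StringBuffer is a CharSequence, so no toString() is needed here.
        String errorText = String.join("\n\n", extractErrors(buf));
        System.out.println(errorText);
    }
}
```

The match is case-sensitive, so the trailing <ERROR code="0"/> element is not picked up. If the input were guaranteed to be well-formed XML, an XML parser would be the more robust choice, but the buffer shown in the question mixes free text with markup, which is why a regex is used here.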