dom4j: adding a PI in text content


I have the following element:

<text>
text and text and text
<stop/>
text and text and text
<stop/>
text and text and text
<stop/>
</text>

I want to add a processing instruction before and after every occurrence of the word 'and' in the text, like this:

<text>
text<?Pub _newline>and<?Pub _newline>text<?Pub _newline>and<?Pub _newline>text
<stop/>
text<?Pub _newline>and<?Pub _newline>text<?Pub _newline>and<?Pub _newline>text
<stop/>
text<?Pub _newline>and<?Pub _newline>text<?Pub _newline>and<?Pub _newline>text
<stop/>
</text>

I don't know how to add a PI inside text. If I set it as a string, it gets escaped: &lt;?Pub _newline&gt;


Solution

  • The text element in your example document contains six nodes as children:

    • three Text nodes, each of which contains the text "text and text and text", and
    • three Elements (each of which has the name stop).

    To achieve your desired result, we need to break up each of the Text nodes into Text nodes and Processing Instruction nodes.

    In dom4j we can do this by using the content() method of the parent element. This method returns a list of all of the child nodes of the element, and if we make changes to this list, the XML document gets updated as well.
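    As a minimal sketch of that behaviour, the snippet below appends a processing instruction through the list returned by content() and serializes the element afterwards. The class name LiveContentSketch and the helper appendPi are made up for illustration. It assumes dom4j 2.x, where content() returns a List<Node>; on dom4j 1.6.x the method returns a raw List, so the assignment compiles with an unchecked warning.

```java
import java.util.List;

import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

public class LiveContentSketch {
    // Hypothetical helper: parses the XML, appends a processing instruction
    // through the content list, and returns the serialized root element.
    static String appendPi(String xml) {
        try {
            Element root = DocumentHelper.parseText(xml).getRootElement();
            // The returned list is backed by the element, so adding to it
            // adds a child node to the document.
            List<Node> nodes = root.content();
            nodes.add(DocumentHelper.createProcessingInstruction("Pub", "_newline"));
            return root.asXML();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendPi("<text>text and text</text>"));
    }
}
```

    Because the processing instruction is added as a node rather than as a string, it should be written out as markup and not as escaped text.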

    So, we get the element's content list, loop through all of the child nodes, and when we find a Text node that contains " and ", split that Text node into pieces and insert the new pieces into the list.
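    The splitting step can be looked at on its own, without dom4j. The sketch below uses only the JDK; SplitSketch and splitAround are hypothetical names, not part of any library. It returns the pieces in document order: pieces at even positions stay plain text, and each piece at an odd position is the trimmed splitter word that would be surrounded by processing instructions.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Hypothetical helper: splits text around every occurrence of a
    // non-empty splitter string and returns the pieces in order.
    static List<String> splitAround(String text, String splitter) {
        List<String> pieces = new ArrayList<>();
        int start = 0;
        int pos;
        while ((pos = text.indexOf(splitter, start)) >= 0) {
            pieces.add(text.substring(start, pos)); // text before the match
            pieces.add(splitter.trim());            // the matched word itself
            start = pos + splitter.length();        // continue after the match
        }
        pieces.add(text.substring(start));          // remaining text
        return pieces;
    }

    public static void main(String[] args) {
        // prints [text, and, text, and, text]
        System.out.println(splitAround("text and text and text", " and "));
    }
}
```

    Note that matching on " and " with the surrounding spaces avoids hits inside longer words such as "sand", and it drops those spaces from the result, which matches the desired output in the question.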

    Here's a method that demonstrates this approach. Pass it an element and it will insert the processing instructions as requested:

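    import java.util.List;

    import org.dom4j.*;
    import org.dom4j.tree.*;

    // ...

    // Wraps every " and " in the element's direct Text children in
    // <?Pub _newline?> processing instructions.
    public void insertProcessingInstructions(Element element) {
        // content() returns a live list: changes to it update the document.
        // (On dom4j 1.6.x it returns a raw List, so this line compiles with
        // an unchecked warning there.)
        List<Node> nodes = element.content();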
        final String splitter = " and ";
        int index = 0;
        while (index < nodes.size()) {
            if (nodes.get(index) instanceof Text) {
                Text textNode = (Text)nodes.get(index);
                String text = textNode.getText();
                int andPos = text.indexOf(splitter);
                if (andPos >= 0) {
                    String beforeText = text.substring(0, andPos);
                    String afterText = text.substring(andPos + splitter.length());
                    textNode.setText(beforeText);
                    nodes.add(index + 1, new DefaultProcessingInstruction("Pub", "_newline"));
                    nodes.add(index + 2, new DefaultText(splitter.trim()));
                    nodes.add(index + 3, new DefaultProcessingInstruction("Pub", "_newline"));
                    nodes.add(index + 4, new DefaultText(afterText));
                    // Move to the last Text node created, in case it contains another
                    // occurrence of the splitter string.
                    index += 4; 
                } else {
                    // No more occurrences of the splitter string in this Text node.
                    ++index;
                }
            } else {
                // Not a Text node.
                ++index;
            }
        }
    }