
How can I get SimpleXML (Java) to return an ellipsis instead of &#8230; when parsing an Atom feed?


I have a line of XML in my Atom feed (UTF-8) whose title ends in an ellipsis written as a numeric character reference inside a CDATA section, like this:

<title type="html"><![CDATA[THIS WEEK IN HISTORY&#8230;]]></title>

To access the title, I call title.getText().

  • Actual result: THIS WEEK IN HISTORY&#8230;
  • Expected result: THIS WEEK IN HISTORY…

Here's my Title class. What am I doing wrong with SimpleXML?

    public static class Title {

        @Attribute(name = "type", required = false)
        String type;
        @Text
        String text;

        public String getText() {
            return this.text;
        }

        void setText(String text) {
            this.text = text;
        }

        public String getType() {
            return this.type;
        }

        public void setType(String _value) {
            this.type = _value;
        }
    }

Solution

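  • Call StringEscapeUtils.unescapeHtml4() (not escapeHtml4(), which does the opposite) from the Apache Commons Lang library on the value returned by title.getText(). SimpleXML is behaving correctly here: an XML parser leaves text inside a CDATA section untouched, so &#8230; arrives as literal characters. The type="html" attribute signals that the content is HTML and still has to be decoded as such.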
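
If adding a dependency is not an option, the same decoding step can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: the class name NumericEntityDecoder and its decode method are hypothetical, and it only handles numeric character references such as &#8230; or &#x2026;. Named entities such as &amp; or &hellip; are left as-is, so the library call is still the more complete option.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of SimpleXML or Commons Lang):
// decodes numeric character references only, e.g. &#8230; or &#x2026;.
public class NumericEntityDecoder {

    // Group 1 captures hexadecimal digits, group 2 captures decimal digits.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    public static String decode(String input) {
        if (input == null) {
            return null;
        }
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            try {
                int codePoint = (m.group(1) != null)
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                // Leave the reference untouched if it is not a valid code point.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : m.group();
            } catch (NumberFormatException e) {
                // Number too large to parse: keep the original text.
                replacement = m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Stands in for the value returned by title.getText().
        System.out.println(decode("THIS WEEK IN HISTORY&#8230;"));
    }
}
```

In the Title class above, one place to apply this would be getText(), returning the decoded value instead of the raw field, so callers never see the escaped form.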