I am building a chat application with the Smack API. When I send a message that contains the apostrophe character ('), the output is split across several lines:
message== ma'am
output==
ma
'
am
Here is the line I use to unescape the text:
StringEscapeUtils.unescapeHtml((new String(ch, start, length).replace("'", "`").replace("'", "'")));
Here is the SAX handler:
DefaultHandler handler = new DefaultHandler() {
    @Override
    public void startDocument() throws SAXException {
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // Remember the sender from the "from" attribute
        for (int i = 0; i < attributes.getLength(); i++) {
            if (attributes.getLocalName(i).equalsIgnoreCase("from")) {
                from = attributes.getValue(i);
                break;
            }
        }
        ....
    }

    @Override
    public void characters(char ch[], int start, int length) throws SAXException {
        // Called once per chunk of text, which may be only part of the element's text
        String str = StringEscapeUtils.unescapeHtml((new String(ch, start, length)));
        switch (elementType) {
            case 1:
                msg = str;
                break;
            ...
            default:
                ...
                break;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
    }

    @Override
    public void endDocument() throws SAXException {
    }
};
Very often, XML parsers deliver the text of a single element in several pieces, calling characters() more than once. This is perfectly valid from an XML point of view, so your handler needs to cope with it. The problem may therefore come from how each piece is handled or printed, not from the unescaping.
E.g. I can imagine the following XML
<n>A &amp; B</n>
producing the following events:
startElement: n
characters: "A "
characters: "&"
characters: " B"
endElement: n
Now if you println every characters event you see, you'll get three lines instead of one. Maybe your parser has an option to "normalize" the events so that successive text chunks are joined.
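If the parser offers no such option, the usual workaround is to buffer the chunks yourself and read the buffer only in endElement. Here is a minimal sketch of that idea, assuming the JDK's built-in SAX parser; the class name BufferedTextDemo and the helper textOf are made up for illustration and are not part of Smack, and the sketch only handles a single element without children.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferedTextDemo {

    // Hypothetical helper: parses one XML string and returns the text of its root element.
    static String textOf(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        final String[] result = new String[1];

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                text.setLength(0); // new element: discard any text collected earlier
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one element, so append instead of assigning
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                result[0] = text.toString(); // the element's text is complete only here
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf("<body>ma&apos;am</body>")); // prints ma'am
        System.out.println(textOf("<n>A &amp; B</n>"));        // prints A & B
    }
}
```

In your handler the same change would mean appending to a buffer in characters() and assigning msg in endElement(), so the result no longer depends on how the parser chooses to split the text.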
(Sorry if I'm not using all the appropriate XML terminology. My XML terminology has become a bit rusty, so feel free to edit this answer and put in the correct XML terms. Thank you)