I am using Camel to create a JAXB object, marshal it, and then write the result to a UTF-8 encoded XML file. Some of my XML content is fetched from a data source that uses ISO 8859-1 encoding.
Here is my Camel route:
import org.apache.camel.converter.jaxb.JaxbDataFormat;
JaxbDataFormat jaxbDataFormat = new JaxbDataFormat(Claz.class.getPackage().getName());
from("endpoint")
.process(exchange -> { /* createObjectBySettingTheDataFromSource */ })
.marshal(jaxbDataFormat)
.to("FILEENDPOINT?charset=utf-8&fileName=" + Filename);
The XML is generated successfully, but the content fetched from the source is still ISO 8859-1 encoded and is not converted to UTF-8.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Name>M��e Faࠥnder</Name> //Mürthe Faßender
After changing the file encoding to ISO 8859-1, the content is displayed correctly.
I tried converting the data before setting it on the JAXB object, but it is still not converted to UTF-8:
byte[] nameBytes = name.getBytes(StandardCharsets.ISO_8859_1);
return new String(nameBytes, StandardCharsets.UTF_8);
The problem only occurs under Linux. Does anyone have an idea how to handle the ISO_8859_1 data and set it in the XML without issues?
Well, UTF-8 is the default charset (at least for the file endpoint) and AFAIK Camel does not try to analyze the given charset of an input message.
So I guess that if you don't declare an input charset other than UTF-8 and then write the file as UTF-8, there is no need to convert anything from Camel's perspective.
from("file:inbox") // implicit UTF-8
.to("file:outbox?charset=utf-8"); // same charset, no conversion needed
You can, at least for files, declare the source encoding so that Camel knows it must convert the payload.
from("file:inbox?charset=iso-8859-1")
.to("file:outbox?charset=utf-8"); // conversion needed
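Conceptually, declaring a different charset on each side amounts to decoding the bytes with the source charset and encoding the resulting text with the target charset. The following is a minimal plain-Java sketch of that idea, not Camel's actual implementation; the class name `TranscodeFile` and the method `latin1ToUtf8` are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TranscodeFile {

    // Hypothetical helper: read 'source' as ISO-8859-1 text and
    // write the same text to 'target' encoded as UTF-8.
    public static void latin1ToUtf8(Path source, Path target) throws IOException {
        String text = new String(Files.readAllBytes(source), StandardCharsets.ISO_8859_1);
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("inbox", ".txt");
        Path out = Files.createTempFile("outbox", ".txt");

        // Simulate an ISO-8859-1 encoded input file containing "Mürthe".
        Files.write(in, "M\u00fcrthe".getBytes(StandardCharsets.ISO_8859_1));

        latin1ToUtf8(in, out);

        // Reading the output as UTF-8 yields the original text again.
        String result = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        System.out.println(result.equals("M\u00fcrthe"));
    }
}
```

The important point is that the source charset has to be known: the bytes alone do not say how they were encoded.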
If you cannot declare the input charset (I think this depends on the endpoint type), you have to explicitly convert the payload.
from("file:inbox")
.convertBodyTo(byte[].class, "utf-8")
// message body is now a byte array and written to file as is
.to("file:outbox");
See the section "Using charset" from the Camel File docs for more details.
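As a side note on the conversion attempted in the question: `name.getBytes(ISO_8859_1)` followed by `new String(bytes, UTF_8)` decodes ISO-8859-1 bytes as if they were UTF-8, which corrupts every non-ASCII character. The sketch below, in plain Java without Camel, contrasts that with decoding the raw bytes using the charset they were actually written in. The class and method names are illustrative only, and it assumes the data source hands over raw ISO-8859-1 bytes.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Decode raw bytes using the charset they were actually written in.
    public static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // The conversion from the question: encode as ISO-8859-1,
    // then decode those same bytes as UTF-8.
    public static String brokenConversion(String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.ISO_8859_1);
        return new String(nameBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "M\u00fcrthe Fa\u00dfender"; // "Mürthe Faßender"

        // Simulated raw bytes as they might arrive from the data source.
        byte[] latin1 = name.getBytes(StandardCharsets.ISO_8859_1);

        // Correct: decode with the source charset, encode with the target charset.
        String decoded = decodeLatin1(latin1);
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(decoded.equals(name));   // true
        System.out.println(utf8.length);            // 17 (ü and ß take two bytes each)

        // Broken: the umlaut and sharp s become U+FFFD replacement characters.
        System.out.println(brokenConversion(name).contains("\uFFFD")); // true
    }
}
```

A Java `String` has no charset of its own; a charset only matters at the point where bytes are turned into text or text into bytes. If the string is already correct in memory, nothing needs converting before it is set on the JAXB object, and the output charset is decided when the file is written.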