utf-8, apache-camel, unmarshalling, bean-io, fixed-length-record

Unmarshalling fixed-length UTF-8 strings with BeanIO and Camel


Unmarshalling a message works as long as it contains no diacritics that take two bytes in UTF-8; otherwise it fails, complaining about the record length. I tried converting the body to a String with the charset set to UTF-8

<convertBodyTo type="java.lang.String" charset="UTF-8" /> 

before unmarshalling using BeanIO in a Camel route, but it doesn't help. What is the right way to solve the problem?
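To make the length mismatch concrete, here is a minimal Java sketch. The class name, method names and the sample field `Zoë  ` are made up for illustration, and the idea that the bytes get decoded with a single-byte charset somewhere in the route is only an assumption about what might be happening, not something confirmed above.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of characters seen if the UTF-8 bytes are wrongly decoded
    // with a single-byte charset (ISO-8859-1 is used here as an example).
    public static int charsWhenMisdecoded(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1).length();
    }

    public static void main(String[] args) {
        // Hypothetical 5-character field: "Zoë" padded with two spaces.
        String field = "Zo\u00eb  ";
        System.out.println("characters:            " + field.length());
        System.out.println("UTF-8 bytes:           " + utf8Bytes(field));
        System.out.println("chars if mis-decoded:  " + charsWhenMisdecoded(field));
    }
}
```

The field is five characters long but six bytes in UTF-8, so anything that counts bytes, or that decodes the bytes with the wrong charset, sees a record that is one position too long.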

In fact, I suspect that the purpose of convertBodyTo is not to tell the class doing the unmarshalling that a string declared as fixed-length may vary in byte length, but to perform an actual conversion. That would require declaring somewhere first that the source is UTF-8, probably on the from endpoint. Could I then temporarily convert the body to a charset with a single-byte representation before unmarshalling, and back to UTF-8 afterwards?
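As a rough illustration of why a detour through a single-byte charset should not be needed: once the bytes are decoded as UTF-8 into a Java String, every field position is a character position, regardless of how many bytes each character took on the wire. The sketch below uses a hypothetical record layout (a 5-character name followed by a 4-character amount) and plain `substring` calls; it is not how BeanIO is implemented, and it assumes the parser counts characters rather than bytes.

```java
import java.nio.charset.StandardCharsets;

public class DecodeThenSlice {

    // Hypothetical layout for illustration: name = chars 0..4, amount = chars 5..8.
    public static String[] parse(byte[] utf8Record) {
        // Decode once, stating the source charset explicitly.
        String record = new String(utf8Record, StandardCharsets.UTF_8);
        // From here on, offsets are in characters, not bytes.
        return new String[] { record.substring(0, 5), record.substring(5, 9) };
    }

    public static void main(String[] args) {
        // 9 characters, but 10 bytes on the wire because of the two-byte "ë".
        byte[] wire = "Zo\u00eb  0042".getBytes(StandardCharsets.UTF_8);
        String[] fields = parse(wire);
        System.out.println("wire bytes: " + wire.length);
        System.out.println("name=[" + fields[0] + "] amount=[" + fields[1] + "]");
    }
}
```

Converting to a single-byte charset instead would silently lose any character that charset cannot represent, which is why decoding with the correct charset up front is usually the safer route.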

After a suggestion that the point is to tell BeanIO which charset to use, I came up with:

<dataFormats>
 <beanio id="parseTransactions464" mapping="mapping.xml" streamName="Transactions464" encoding="UTF-8"/>
</dataFormats>

but this gives me:

Exhausted after delivery attempt: 1 caught: java.lang.NullPointerException: charset
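For context on the message itself: JDK classes that take a `Charset` argument reject `null`, so an exception like this typically means that some component ended up with no charset object at all. The sketch below only demonstrates that JDK behaviour with `InputStreamReader`; it does not reproduce the Camel route, and whether this is the exact call that fails inside the data format is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class NullCharsetDemo {

    // Returns true if constructing a reader with the given charset
    // throws a NullPointerException.
    public static boolean failsWithNpe(Charset charset) {
        InputStream in = new ByteArrayInputStream(new byte[0]);
        try {
            new InputStreamReader(in, charset);
            return false;
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        // A null Charset stands in for an encoding that was never resolved.
        System.out.println("null charset fails: " + failsWithNpe(null));
        System.out.println("UTF-8 fails:        " + failsWithNpe(Charset.forName("UTF-8")));
    }
}
```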

I basically copied the usage of encoding with the beanio dataFormat from the following question, and I don't know whether it is correct:

Cannot find data format in registry - Camel


Solution

  • This is a defect in camel-beanio; see the following mailing-list threads and the JIRA issue:

    http://camel.465427.n5.nabble.com/Re-Exhausted-after-delivery-attempt-1-caught-java-lang-NullPointerException-charset-tc5817807.html

    http://camel.465427.n5.nabble.com/Exhausted-after-delivery-attempt-1-caught-java-lang-NullPointerException-charset-tc5817815.html

    https://issues.apache.org/jira/browse/CAMEL-12284