Tags: character-encoding, xml-parsing, findbugs, jca, deployment-descriptor

What should the JCA deployment descriptor (ra.xml) character encoding be?


Looking through the JCA 1.7 specification, the only hint I could find is in one of its Resource Adapter Deployment Descriptor examples (Chapter 13: Message Inflow, p. 13-50): (image: JCA DD example showing UTF-8 encoding). The example uses UTF-8 encoding, but nothing says whether that was an arbitrary choice for the illustration or a mandatory restriction on the file's character encoding.

I'm asking this because I'm writing a Java program to read one of these files and FindBugs™ is giving me this message:

DM_DEFAULT_ENCODING: Reliance on default encoding Found a call to a method which will perform a byte to String (or String to byte) conversion, and will assume that the default platform encoding is suitable. This will cause the application behaviour to vary between platforms. Use an alternative API and specify a charset name or Charset object explicitly.

Line 4 in this Java code snippet is where character encoding will be specified:

01.  byte[] contents = new byte[1024];
02.  int bytesRead = 0;
03.  while ((bytesRead = bin.read(contents)) != -1)
04.     result.append(new String(contents, 0, bytesRead));

So, is it possible to specify the expected encoding of this file in this case or not?


Solution

  • From what I have seen, most people use UTF-8 encoding for their ra.xml. However, there is no restriction on using other encodings. So if your parsing expects UTF-8 only, the result might not be what you expect.

    So you either need to account for this in your code when you read the file as plain text, or read it as an XML file and save yourself the headache, since an XML parser picks up the encoding from the file's XML declaration. I don't think the difference in performance will be an issue, because ra.xml files do not usually grow to gigabytes. At least the ones I've seen so far average a few megabytes.

    For the FindBugs issue, you just need to specify the encoding explicitly, for example UTF-8. Otherwise you will be using the JVM default, which is determined during virtual-machine startup and typically depends on the locale and charset of the underlying operating system. Using the default is not recommended here, but if that is what you want, specify the default encoding explicitly. That will get rid of the FindBugs warning.

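    So your code would look something like this:

    // requires: import java.nio.charset.Charset;
    byte[] contents = new byte[1024];
    int bytesRead = 0;
    while ((bytesRead = bin.read(contents)) != -1)
        // The charset is now named explicitly; use StandardCharsets.UTF_8 instead if the file is known to be UTF-8.
        result.append(new String(contents, 0, bytesRead, Charset.defaultCharset()));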
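
The two options described above (reading the file as text with an explicit charset, or handing it to an XML parser) could be sketched roughly as follows. This is only an illustration under assumptions: the class name `RaXmlReader` and both method names are hypothetical and appear in neither the question nor the JCA specification. The text variant uses an `InputStreamReader` rather than `new String(...)` per chunk, because decoding fixed-size byte chunks separately can split a multi-byte character across two reads. The XML variant relies on the standard JAXP `DocumentBuilder`, which determines the encoding from the document itself when it is given a byte stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper; the class and method names are illustrative only.
final class RaXmlReader {

    // Option 1: read the file as plain text, naming the charset explicitly.
    // The reader decodes across buffer boundaries, so a multi-byte character
    // that straddles two reads is still decoded correctly.
    static String readAsText(InputStream in, Charset charset) throws IOException {
        StringBuilder result = new StringBuilder();
        Reader reader = new InputStreamReader(in, charset);
        char[] buffer = new char[1024];
        int charsRead;
        while ((charsRead = reader.read(buffer)) != -1) {
            result.append(buffer, 0, charsRead);
        }
        return result.toString();
    }

    // Option 2: let the XML parser work out the encoding from the byte
    // stream (for example from the encoding attribute of the XML declaration).
    static Document parseAsXml(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(in);
    }
}
```

With the stream from the question, a call such as `RaXmlReader.readAsText(bin, StandardCharsets.UTF_8)` would replace the manual loop, while `RaXmlReader.parseAsXml(bin)` avoids guessing the encoding at all.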