I am using XmlBeans 2.6.0 to compile some XSD files that contain an enumeration of Greek words:
<xs:simpleType name="t_series_report">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Γενική"/>
    <xs:enumeration value="Ειδική"/>
  </xs:restriction>
</xs:simpleType>
The compilation is performed using the Ant task included in the xbean.jar of the ZIP binary distribution of XmlBeans. The XSD files are saved as UTF-8, and this is also correctly declared in the XML header of the files:
<?xml version="1.0" encoding="UTF-8"?>
The problem is that the Java files generated by XmlBeans seem to be saved in the ANSI character set, and during compilation I get errors like:
[xmlbean] C:\projects\myproject\workspace\prj\build\xmlbeans\test\src\com\company\project\schema\myschematype\cl\cle\ext\TMyType.java:61: illegal character: \8220
[xmlbean] static final int INT_ΓΕ�?ΙΚΉ = 1;
[xmlbean]
Is there any way to force XmlBeans to save the generated Java files as UTF-8 instead of ANSI?
We had a similar problem compiling a schema containing a Greek "Omega" using XMLBeans' Maven plugin.
The problem is that XMLBeans (at least as of version 2.5.0) always writes the generated files in Java's platform default encoding, which can only be changed by starting the JVM with -Dfile.encoding=UTF-8.
For our Maven project, the solution was to NOT use the XMLBeans plugin; instead we invoked XMLBeans through the exec plugin, which gave us control over the encoding. Here is a snippet of the pom.xml:
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>exec-2.1.0</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>java</executable>
        <arguments>
          <!-- force the default encoding of the forked JVM -->
          <argument>-Dfile.encoding=${project.build.sourceEncoding}</argument>
          <argument>-classpath</argument>
          <!-- automatically creates the classpath using all project dependencies,
               also adding the project build directory -->
          <classpath/>
          <argument>org.apache.xmlbeans.impl.tool.SchemaCompiler</argument>
          <argument>-src</argument>
          <argument>${project.build.directory}/generated-sources</argument>
          <argument>-srconly</argument>
          <argument>-d</argument>
          <argument>${project.build.directory}/classes</argument>
          <argument>-javasource</argument>
          <argument>1.6</argument>
          <argument>${basedir}/src/main/2.1.0/schema/</argument>
          <argument>src/main/2.1.0/config/FooBar_v2.1.0.xsdconfig</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
I suppose this approach would be adaptable to Ant as well.
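As a rough, untested sketch of what that adaptation might look like, the fragment below runs the same SchemaCompiler class in a forked JVM through Ant's java task and passes the encoding as a JVM argument. The target name, the directory layout, and the location of xbean.jar are assumptions made for illustration, not values from the original build.

```xml
<!-- Hypothetical target: the name and all paths below are placeholders -->
<target name="generate-xmlbeans-sources">
  <!-- fork="true" starts a separate JVM, so the jvmarg below takes effect -->
  <java classname="org.apache.xmlbeans.impl.tool.SchemaCompiler"
        fork="true" failonerror="true">
    <!-- make UTF-8 the default encoding of the forked JVM -->
    <jvmarg value="-Dfile.encoding=UTF-8"/>
    <classpath>
      <!-- assumed location of the XmlBeans jar -->
      <pathelement location="lib/xbean.jar"/>
    </classpath>
    <!-- same compiler options as in the pom.xml snippet -->
    <arg value="-src"/>
    <arg file="build/xmlbeans/src"/>
    <arg value="-srconly"/>
    <arg value="-d"/>
    <arg file="build/xmlbeans/classes"/>
    <!-- assumed schema directory -->
    <arg file="src/schema"/>
  </java>
</target>
```

Because -srconly only generates the sources, they would still have to be compiled afterwards, for example with a javac task whose encoding attribute is set to UTF-8.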
An easier solution would be to invoke Ant like this:
ant -Dfile.encoding=UTF-8 build-or-whatever
But this will obviously only work if all your source files are in UTF-8!