I have been struggling with this issue for more than a week. I have read probably more than 50 different pages about it, but I can't find a solution for my case.
My question would certainly look like a duplicate were it not for one specific point: my code works on Windows, and the same code causes the issue in this topic when it runs on Unix.
All my searches in forums led me to understand that it is a matter of BOM. I followed all the suggestions, and my code keeps working on Windows but still causes the same issue on the Unix mainframe.
Below are the most relevant steps in my code, with comments on the attempts I have tried. It is hard to imagine what else to do, since from the beginning my code has run on Windows and causes the "Content is not allowed in prolog" issue only on the Unix mainframe.
First Step: serialise a file to DOM object
Element txns = q.parseMHEFile(path to my file);
DOMImplementationLS lsImpl = (DOMImplementationLS) txns.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
LSSerializer serializer = lsImpl.createLSSerializer();
serializer.getDomConfig().setParameter("xml-declaration", false);
String result = serializer.writeToString(txns);
log.info(result); //I see the same result here on both Windows and Unix
Document d2 = convertStringToDocument(result);
q.addMessages( d2.getDocumentElement());
Second Step: a very complex flow changes existing fields and adds new ones. At the end, the result is saved to a temp file with this method:
synchronized protected void writeToFile(Node node, String file)
throws SAXException, IOException {
try {
StringWriter output = new StringWriter();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(node), new StreamResult(output));
String xml = output.toString();
Integer whereIs = xml.indexOf("<?xml");
/* On both Windows and Unix, <?xml is found at position 0, so there is no extra character before <?xml */
if (whereIs >= 0) {
log.info("<?xml is in " + whereIs + " position");
}
FileWriter filewriter = new FileWriter(file);
/* The replace below was a clue found in some forum for taking the BOM out in case it exists */
filewriter.write(((xml.replace("\uFEFF", "")).replace("\uFEFF", "")).replace("\uFFFE", ""));
filewriter.close();
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
Third Step: I get the error while parsing the temp file. Below are the two ways I have tried; both run on Windows but not on Unix.
//version before I read several forums pointing the BOM issue
public Node readFromFile(String file) throws ParserConfigurationException {
DocumentBuilderFactory docFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document d = null;
try {
d = docBuilder.parse(file);
} catch (Exception e) {
System.out.println(e.getMessage());
}
return d.getDocumentElement();
}
//version after some clues found in forums related to the BOM issue
public Node readFromFile(String file) {
try {
java.io.File f = new java.io.File(file);
java.io.InputStream inputStream = new java.io.FileInputStream(f);
// Checking if there is BOM
BOMInputStream bomIn = new BOMInputStream(inputStream,ByteOrderMark.UTF_8, ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE);
//it always shows that there is no BOM, on both Windows and Unix
if (bomIn.hasBOM() == false) {
log.info("No BOM found");
}
java.io.Reader reader = new java.io.InputStreamReader(inputStream,"UTF-8");
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document d = null;
log.info("Before parsing file"); //this is the last log line on Unix before the error below
/*Next line will cause issue only in Unix
[Fatal Error] myFile.xml:1:39: Content is not allowed in prolog.
Content is not allowed in prolog.*/
d = docBuilder.parse(is);
log.info("After parsing file"); //this is only logged on Windows
return d.getDocumentElement();
} catch (Exception e) {
log.info(e.getMessage());
return null;
}
}
POM:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.mycomp.batchs</groupId>
<artifactId>AuthorizationFileToICTTQueue</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>AuthorizationFileToICTTQueue</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<spring.framework.version>4.2.4.RELEASE</spring.framework.version>
<spring.batch.version>3.0.6.RELEASE</spring.batch.version>
<log4j.version>1.2.7</log4j.version>
<java.version>1.7</java.version>
<maven.compiler.plugin.version>2.1</maven.compiler.plugin.version>
<hsqldb.version>1.8.0.10</hsqldb.version>
<logback-classic.version>1.1.5</logback-classic.version>
</properties>
<dependencies>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.4</version>
</dependency>
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-core</artifactId>
<version>${spring.batch.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-infrastructure</artifactId>
<version>${spring.batch.version}</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>${logback-classic.version}</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-tx</artifactId>
<version>${spring.framework.version}</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-jdbc</artifactId>
<version>${spring.framework.version}</version>
</dependency>
<dependency>
<groupId>hsqldb</groupId>
<artifactId>hsqldb</artifactId>
<version>${hsqldb.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven.compiler.plugin.version}</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
**** Edited on 18/Feb/2016 01:00 PM (Brasilia time zone)
The file was transferred from zOS/390 using OpenText Connectivity - Connection Central for x64. The first image shows the file transferred as ASCII. The second image shows the file transferred as binary.
This sounds like a character-set issue; the XML prolog might be
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
and if your *nix install, for whatever reason, doesn't use UTF-8 by default, then the file is not going to be encoded the way the parser expects. Could it be that when you created or copied the document on *nix, the character set got changed and the file isn't the UTF-8 you expected? It might make sense to examine the file with a hex editor on both platforms.
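If no hex editor is handy on the mainframe, a few lines of Java can stand in for one. The sketch below is only an illustration (the class name `HexPeek` and its methods are made up for this example): it prints the first bytes of a file in hex. A UTF-8 or ASCII file that starts with `<?xml` begins with `3C 3F 78 6D 6C`, a UTF-8 BOM shows up as `EF BB BF`, and an EBCDIC-encoded `<?xml` would typically begin with `4C 6F A7 94 93`.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the original code: dumps the first bytes of a file as hex.
class HexPeek {

    // Formats the first `length` bytes as two-digit upper-case hex values separated by spaces.
    static String toHex(byte[] bytes, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Reads up to `max` raw bytes from the start of the file, with no character decoding involved.
    static String firstBytes(String path, int max) throws IOException {
        byte[] buf = new byte[max];
        try (InputStream in = new FileInputStream(path)) {
            int total = 0;
            int n;
            while (total < max && (n = in.read(buf, total, max - total)) > 0) {
                total += n;
            }
            return toHex(buf, total);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HexPeek <path to the temp file>
        System.out.println(firstBytes(args[0], 16));
    }
}
```

Running it against the same temp file on both platforms and comparing the two lines should show quickly whether the bytes on disk differ.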
I know I've run into this before, though usually the other way around. I don't have a current example where it fails; I just know it was a character-set issue.
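As a rough sketch of how to take the platform default character set out of the picture: the question writes the temp file with `FileWriter`, which uses the JVM's default encoding, and on z/OS that default is usually an EBCDIC code page rather than UTF-8. One option is to let the `Transformer` write bytes directly to an `OutputStream` and let the parser read bytes from an `InputStream`, so both sides follow the encoding named in the XML declaration. The class name `Utf8XmlFile` and the tiny `txns` document in `main` are made up for this example, and I have not tested it on a mainframe.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical class name; a sketch of an encoding-safe round trip, not the asker's actual code.
class Utf8XmlFile {

    // Serialises the node straight to bytes, so the transformer (not the
    // platform default charset) decides how characters are encoded.
    static void writeToFile(Node node, String file) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (OutputStream out = new FileOutputStream(file)) {
            transformer.transform(new DOMSource(node), new StreamResult(out));
        }
    }

    // Hands raw bytes to the parser, which reads the encoding from the XML declaration.
    static Node readFromFile(String file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new FileInputStream(file)) {
            return builder.parse(in).getDocumentElement();
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a small made-up document, write it, and read it back.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("txns");
        root.setTextContent("ok");
        doc.appendChild(root);

        File tmp = File.createTempFile("txns", ".xml");
        tmp.deleteOnExit();
        writeToFile(doc, tmp.getPath());
        System.out.println(readFromFile(tmp.getPath()).getNodeName());
    }
}
```

If the declaration in the temp file is `<?xml version="1.0" encoding="UTF-8"?>`, that is 38 characters, so column 39 in the error message is the first character after it. That might fit a parser that gets through the declaration and then decodes the rest of the file with a charset the bytes were not written in, but that is a guess until the bytes on disk have been checked.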