I have an XML file that contains unescaped double quotes inside attribute values, as follows:
<feast key="NAME" value="NAME TEST 'xxxxx"yyyy' $"/>
I need to replace xxxxx"yyyy with xxxxx&quot;yyyy in all occurrences.
Note: xxxxx and yyyy are defined by the user, so they can take any form.
I have included the sample XML and the parsing code below.
TestSaxParse.xml
<?xml version="1.0" encoding="US-ASCII" ?>
<TEST Office="TEST Office">
<LINE key="112313133320">
<TESTNO value="0"/>
<FEATURE>
<feast key="001" value="001"/>
<feast key="NAME" value="NAME TEST 'xxxxx_&_yyyy' $"/>
</FEATURE>
</LINE>
<LINE key="112313133321">
<TESTNO value="0"/>
<FEATURE>
<feast key="002" value="002"/>
<feast key="NAME" value="NAME TEST 'xxxxx"yyyy' $"/>
</FEATURE>
</LINE>
</TEST>
SaxParseEx.java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SaxParseEx extends DefaultHandler {
    private static String xmlFilePath = "/home/system/TestSAXParse.xml";
    public static void main(String[] args) {
        SaxParseEx handler = new SaxParseEx();
        SAXParserFactory fact = SAXParserFactory.newInstance();
        SAXParser parser;
        try {
            Path path = Paths.get(xmlFilePath);
            Charset charset = StandardCharsets.UTF_8;
            String content = new String(Files.readAllBytes(path), charset);
            // Replace every & that is not already the start of &amp; with &amp;
            content = content.replaceAll("(&(?!amp;))", "&amp;");
            // Needed: a regex that replaces " with &quot; only in the specific place mentioned above.
            // content = content.replaceAll("(\"(?!quot;))", "&quot;");
            // Write updated content to XML file
            Files.write(path, content.getBytes(charset));
            // XML Parsing
            parser = fact.newSAXParser();
            parser.parse(new File(xmlFilePath), handler);
            System.out.println("PARSE SUCCESS");
            return;
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("PARSE FAILED");
    }
}
Output:
org.xml.sax.SAXParseException; systemId: file:/home/system/TestSAXParse.xml; lineNumber: 14; columnNumber: 46; Element type "feast" must be followed by either attribute specifications, ">" or "/>".
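I have replaced every & with &amp; to fix the SAXParseException on line 7. I cannot simply replace every " with &quot; in the same way, because that would also escape the quotes that delimit the attribute values.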
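For reference, here is a minimal sketch of the ampersand step on its own. It assumes the input may already contain other entity references such as &lt; or &#39;, which the (?!amp;) lookahead would turn into &amp;lt; and &amp;#39;. The class name AmpersandEscaper and the method escapeBareAmpersands are made up for this illustration.

```java
import java.util.regex.Pattern;

final class AmpersandEscaper {
    // An '&' that does NOT start one of the five predefined XML entities,
    // a decimal character reference (&#39;) or a hex one (&#x27;).
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Hypothetical helper: escapes only the ampersands matched above.
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escapeBareAmpersands("'xxxxx_&_yyyy' &amp; &lt; &#39;"));
    }
}
```

Entities declared in a DTD are not covered by this pattern; it only knows the predefined ones, so a reference such as &x; would be escaped as well.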
EDIT:
I cannot use this answer. I'm looking for a different solution because of this line:
content = content.replaceAll( "(&(?!amp;))", "&amp;");
Is it possible to write a similar regex for the quotes?
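One possible shape for such a regex, as a sketch only: treat a " as the real end of an attribute value only when it is followed by />, >, ?> or the start of another attribute (name="), and escape every other " found inside the value. This is a heuristic, not a parser. It assumes each attribute sits on one line and that the user text never itself looks like `" name="` or `">`. The names AttributeQuoteEscaper and escapeInnerQuotes are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AttributeQuoteEscaper {
    // group(1): name="   group(2): the value, ending at the first '"' that is
    // followed by "/>", ">", "?>" or whitespace plus another name=".
    private static final Pattern ATTRIBUTE = Pattern.compile(
            "([\\w:.-]+\\s*=\\s*\")(.*?)\"(?=\\s*(?:/?>|\\?>)|\\s+[\\w:.-]+\\s*=\\s*\")");

    static String escapeInnerQuotes(String xml) {
        Matcher m = ATTRIBUTE.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(2).replace("\"", "&quot;");
            // quoteReplacement keeps '$' and '\' in the user text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + value + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "<feast key=\"NAME\" value=\"NAME TEST 'xxxxx\"yyyy' $\"/>";
        System.out.println(escapeInnerQuotes(line));
    }
}
```

Matcher.quoteReplacement matters here because the sample value contains a $, which appendReplacement would otherwise read as a group reference.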
I replaced every " with &quot; wherever it is enclosed in single quotes ('). To do that, I added the lines below just before Files.write:
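// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("'(.*[\"].*)'");
Matcher matcher = pattern.matcher(content);
while (matcher.find()) {
    // String.replace is literal; replaceAll would treat the matched user text as a regex
    content = content.replace(matcher.group(1), matcher.group(1).replace("\"", "&quot;"));
}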
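A single-pass variant of the same idea, together with an in-memory SAX check, might look like the sketch below. It swaps the greedy .* for [^'\r\n]* and uses Matcher.appendReplacement instead of a per-match replace on the whole content. It assumes apostrophes occur only as the pair around the user text; two separate quoted segments on one line with a " between them would be matched incorrectly. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SingleQuoteRepairCheck {
    // Text between two apostrophes on one line that contains at least one '"'.
    private static final Pattern QUOTED =
            Pattern.compile("'([^'\\r\\n]*\"[^'\\r\\n]*)'");

    static String escapeQuotesInsideApostrophes(String xml) {
        Matcher m = QUOTED.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fixed = "'" + m.group(1).replace("\"", "&quot;") + "'";
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Parses the string in memory, so the file on disk is left untouched.
    static boolean parses(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String broken =
                "<FEATURE><feast key=\"NAME\" value=\"NAME TEST 'xxxxx\"yyyy' $\"/></FEATURE>";
        String repaired = escapeQuotesInsideApostrophes(broken);
        System.out.println(repaired);
        System.out.println(parses(broken) + " -> " + parses(repaired));
    }
}
```

Checking the repaired string before calling Files.write makes it possible to skip the write when the content still does not parse.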