Is there any way to parse XML comments in Groovy?
Neither XmlParser nor XmlSlurper seems to support comment nodes.
Suppose the following file (example.html):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<table cellpadding="1" cellspacing="1" border="1">
<tr><td rowspan="1" colspan="3">title</td></tr>
<!--I cannot be seen-->
Here is my code:
def parser = new XmlSlurper(false, false)
parser.setFeature("", false)
parser.setFeature("", false)
def response = parser.parse('example.html')
And when I use
println XmlUtil.serialize(response)
to output the file, no comment can be seen.
Since your file is HTML, you can parse it with jsoup, which keeps comments as nodes:
@Grab(group='org.jsoup', module='jsoup', version='1.11.3')
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
def html = '''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<table cellpadding="1" cellspacing="1" border="1">
<tr><td rowspan="1" colspan="3">title</td></tr>
<!--I cannot be seen-->
'''
Document doc = Jsoup.parse(html)
// jsoup reports comment nodes under the node name '#comment'
println doc.select('html body table tbody').first()?.childNodes()?.find{it.nodeName()=='#comment'}?.getData()
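If the input is well-formed XML and you would rather not pull in a dependency, the JDK's own DOM parser is another option, since it keeps comment nodes unless `setIgnoringComments(true)` is set on the factory. This is a different technique from the jsoup one above, and only a minimal sketch: the `xml` sample string and the `collectComments` helper are made up for illustration, and a real file with a DOCTYPE may make the parser try to load the DTD.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Node

// Hypothetical sample input: a small well-formed stand-in for the file in the question.
def xml = '''<table>
<tr><td>title</td></tr>
<!--I cannot be seen-->
</table>'''

// The JDK DOM parser keeps comments by default (ignoringComments is false).
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// Hypothetical helper: recursively collect the text of every comment below a node.
List<String> collectComments(Node node, List<String> found = []) {
    def children = node.childNodes
    for (int i = 0; i < children.length; i++) {
        Node child = children.item(i)
        if (child.nodeType == Node.COMMENT_NODE) {
            found << child.nodeValue
        } else {
            collectComments(child, found)
        }
    }
    found
}

def comments = collectComments(doc)
println comments
```

Unlike the `select(...)` call above, this walks the whole tree, so it does not depend on where in the document the comment sits.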