I have a problem with line breaks, spaces, and tabs in XML text content, like this:
<value xs:type="DV_TEXT"><value>1111\this is what it is used for, this could be a
really long line or even
multiple lines, just like
what you are reading now
</value></value>
The setTextContent and getTextContent in Java from org.w3c.dom deal just fine with it. No problem.
But now I am generating Schematron for validation, to check whether this string really appears in the value. The Schematron is generated from a definition file in which the test strings are configured.
In the generated Schematron, the assert test looks like this:
test="(matches(.,'1111\this is what it is used for, this could be a really long line or even
multiple lines, just like
what you are reading now'))"
And then, when I validate, more problems come up.
First, the line breaks. It seems that the definition file from which the Schematron is generated contains \r\n instead of only \n.
But well, I have to reckon with that. If I replace all \r\n with only \n, some of the errors disappear. And how can I be sure that the XML file also has only \n as its line break?
I think I need to alter the string that goes into the test asserts and, for example, replace all \r\n with only \n.
I have done that, and it partly solves my problem. What else should I think about?
All tips are welcome.
If you want the node text to be valid regardless of its whitespace, use the normalize-space function:
The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space. [...]
So, this should work:
test="(matches(normalize-space(.),'1111\this is what it is used for, this could be a really long line or even multiple lines, just like what you are reading now'))"
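For this to work, the string written into the generated test has to be normalized the same way, otherwise the \r\n from the definition file still ends up inside the pattern. Since the generator is in Java, here is a minimal sketch of a helper that mimics normalize-space on the Java side. The class name WhitespaceNormalizer and the method name normalizeSpace are made up for illustration; they are not part of org.w3c.dom or any Schematron library.

```java
import java.util.regex.Pattern;

public class WhitespaceNormalizer {

    // XPath treats only space, tab, CR and LF as whitespace,
    // so we match exactly those rather than Java's broader \s.
    private static final Pattern XML_WHITESPACE = Pattern.compile("[ \\t\\r\\n]+");

    /**
     * Hypothetical helper mimicking XPath normalize-space():
     * collapses each run of whitespace to one space and strips
     * leading and trailing whitespace.
     */
    static String normalizeSpace(String s) {
        String collapsed = XML_WHITESPACE.matcher(s).replaceAll(" ");
        // After collapsing, at most one space can remain at each end.
        int start = collapsed.startsWith(" ") ? 1 : 0;
        int end = collapsed.length();
        if (end > start && collapsed.endsWith(" ")) {
            end--;
        }
        return collapsed.substring(start, end);
    }

    public static void main(String[] args) {
        // Example input as it might come from the definition file (made-up sample).
        String fromDefinitionFile =
                "1111 this could be a\r\nreally long line or even\r\n\tmultiple lines\r\n";
        // prints: 1111 this could be a really long line or even multiple lines
        System.out.println(normalizeSpace(fromDefinitionFile));
    }
}
```

With both sides normalized, it no longer matters whether the XML file or the definition file uses \r\n or \n. One more thing to think about: the second argument of matches is a regular expression, so a backslash as in 1111\this, or characters such as . ( ) [ ] * + ?, will probably not be taken literally. If you only need a literal check, contains(normalize-space(.), '...') or a plain = comparison may be the safer choice.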