Multiple XML files were concatenated into one file; see the demo example below. How can it be validated using either the xmlstarlet or the xmllint command?
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes" ?>
<BookHeaderMsg xmlns:xsi="THE URL" xsi:noNamespaceSchemaLocation="NAME.xsd">
<BookHdr>
<tag>value</tag>
<tag2>value</tag2>
</BookHdr>
<Payload>
<payloadTag>value</payloadTag>
<payloadTag2>value</payloadTag2>
</Payload>
</BookHeaderMsg>
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes" ?>
<BookTransfer xmlns:xsi="THE URL" xsi:noNamespaceSchemaLocation="NAME.xsd">
<BookHdr>
<tag>value</tag>
<tag2>value</tag2>
</BookHdr>
<Payload>
<payloadTag>value</payloadTag>
<payloadTag2>value</payloadTag2>
</Payload>
</BookTransfer>
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes" ?>
<BookTransfer xmlns:xsi="THE URL" xsi:noNamespaceSchemaLocation="NAME.xsd">
<BookHdr>
<tag>value 1</tag>
<tag2>value 2</tag2>
</BookHdr>
<Payload>
<payloadTag>value 1</payloadTag>
<payloadTag2>value 2</payloadTag2>
</Payload>
</BookTransfer>
I tried xmlstarlet val Filename and also xmllint --valid Filename; both reported the file as invalid. However, if I split each XML document into a separate file, they are all valid. (Unfortunately, splitting is not feasible.)
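The combined file is rejected as a whole because an XML document may have only one root element, and the XML declaration may appear only at the very start of a document. As a quick check of how many documents the file holds, the declarations can be counted. This is a minimal sketch; the function name is hypothetical, and it assumes every embedded document starts with its own declaration on a separate line, as in the example above.

```shell
# Hypothetical helper: count the lines that contain an XML declaration.
# Assumes one declaration per line and one declaration per document.
count_xml_documents() {
    grep -c '<?xml' "$1"
}
```

For the demo file above, `count_xml_documents Filename` would report three documents.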
I managed to validate XML files that combine multiple XML documents with the following steps:
1. Use the csplit command to split the XML documents out of the combined file.
2. Validate the pieces with the xmlstarlet command and redirect its output to a log file.
3. Delete the temporary pieces with the rm command.
The script:
#!/bin/bash
# Validate every XML document inside each combined file in SOURCE_DIR.
SOURCE_DIR="./src"
LOG_DIR="./log"
mkdir -p "$LOG_DIR"

# Collect the combined files (NUL-delimited, so unusual file names survive).
files=()
while IFS='' read -r -d ''
do
    files+=("$REPLY")
done < <(find "$SOURCE_DIR" -maxdepth 1 -type f -iname "*.xml" -printf '%p\0' | sort -zn)

total="${#files[@]}"
echo "start validating $total files" > "$LOG_DIR/summary.log"
counter=0
for file in "${files[@]}"
do
    ((counter++))
    # extract: start a new piece at every line that contains an XML declaration
    csplit "$file" --prefix="$file" --suffix-format='_%03d.xml.txt' --keep-files --elide-empty-files '/<?xml/' '{*}' &>/dev/null
    echo "$counter of $total working on $file"
    echo "$counter of $total working on $file" >> "$LOG_DIR/summary.log"
    # validate the pieces of the current file
    xmlstarlet val "$SOURCE_DIR"/*.xml.txt >> "$LOG_DIR/summary.log"
    # clean up (the original "{$SOURCE_DIR}" was a typo that matched no files)
    rm "$SOURCE_DIR"/*.xml.txt
done
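
If writing temporary files is a problem, the same split-then-validate idea can be sketched as a stream: awk starts a new validator process at every XML declaration and pipes the document to it on standard input. This is only a sketch under stated assumptions: the function name and the VALIDATOR variable are hypothetical, each document is assumed to begin with its own declaration line, and the default validator (xmllint reading from standard input via "-") checks well-formedness only.

```shell
#!/bin/bash
# Assumption: VALIDATOR is any command that reads one XML document on stdin
# and exits non-zero when the document is rejected.
VALIDATOR="${VALIDATOR:-xmllint --noout -}"

# Hypothetical helper: validate each document of a concatenated file in turn.
validate_concatenated() {
    awk -v cmd="$VALIDATOR" '
        function report() {
            n++
            printf "document %d: %s\n", n, (status == 0 ? "ok" : "FAILED")
            if (status != 0) bad = 1
        }
        # A new declaration ends the previous document: close its validator.
        /<\?xml/ && started { status = close(cmd); report() }
        # Every line goes to the validator of the current document.
        { started = 1; print | cmd }
        END {
            if (started) { status = close(cmd); report() }
            exit bad + 0
        }
    ' "$1"
}
```

Calling `validate_concatenated Filename` prints one line per embedded document and returns a non-zero status if any document failed. To check against the schema rather than well-formedness alone, VALIDATOR could be set to something like `xmllint --noout --schema NAME.xsd -`.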