How do I take off the XML version tag in the XOM library for Java?


I'm writing a small application in Java that uses XOM to output XHTML.

The problem is that XOM places the following tag before all the html:

<?xml version="1.0" encoding="UTF-8"?>

I've read their documentation, but I can't seem to find how to remove this tag. Thanks guys.

Edit: I'm outputting to a file using XOM's Serializer class

Follow up: If it is good practice to use the XML tag before the DOCTYPE, why don't any websites use it? Also, why does the W3C validator give me an error when it sees the XML tag? Here is the error:

Illegal processing instruction target (found xml)

Finally, if I were to put the XML tag before my DOCTYPE, does this mean I don't have to specify <meta charset="UTF-8" /> in my html header?


Solution

  • That line is the XML declaration. It is valid in both XML and XHTML, and including it is good practice, so there should be no reason to remove it.

    Just leave it there ... or fix whatever it is that expects it not to be there.


    If you don't believe me, take a look at this excerpt from the XHTML 1.1 spec.

    "Example of an XHTML 1.1 document

     <?xml version="1.0" encoding="UTF-8"?>
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
         "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
     <html version="-//W3C//DTD XHTML 1.1//EN"
           xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.w3.org/1999/xhtml
                          http://www.w3.org/MarkUp/SCHEMA/xhtml11.xsd"
     >
       <head>
         <title>Virtual Library</title>
       </head>
       <body>
         <p>Moved to <a href="http://example.org/">example.org</a>.</p>
       </body>
     </html>
    

    Note that in this example, the XML declaration is included. An XML declaration like the one above is not required in all XML documents. XHTML document authors SHOULD use XML declarations in all their documents. XHTML document authors MUST use an XML declaration when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding is specified by a higher-level protocol."


    By the way, the W3C validation service accepts this example ... but it complains if there is any whitespace before the <?xml ...?> declaration.
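
If whatever consumes the file really cannot accept the declaration, one library-independent workaround is to serialize to a string first and drop the declaration before writing the file. The following is only a minimal sketch under that assumption: the class name `XmlDeclarationStripper` and its `strip` method are made up for illustration and are not part of XOM, and the sketch does not handle a byte order mark in front of the declaration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of XOM: removes a leading XML declaration
// from text that has already been serialized to a String.
final class XmlDeclarationStripper {

    // Matches an XML declaration only at the very start of the text,
    // together with any whitespace (such as the line break) that follows it.
    // The required whitespace after "xml" keeps other processing
    // instructions, e.g. <?xml-stylesheet ...?>, from matching.
    private static final Pattern XML_DECLARATION =
            Pattern.compile("\\A<\\?xml\\s[^?]*\\?>\\s*");

    private XmlDeclarationStripper() {
        // static helper, no instances
    }

    // Returns the input without its leading XML declaration, or the input
    // unchanged if it does not start with one.
    static String strip(String serialized) {
        return XML_DECLARATION.matcher(serialized).replaceFirst("");
    }
}
```

To use it, point the Serializer at a ByteArrayOutputStream instead of the file, decode the bytes with the same encoding you gave the Serializer, pass the result through `strip`, and write what comes back. XOM's Serializer also appears to expose a protected `writeXMLDeclaration()` method that a subclass could override to write nothing; check the Javadoc for your XOM version before relying on that.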