xml, xml-validation, well-formed, infoset

What is an XML infoset and in what ways is it different to an XML document?


I've tried to read http://www.w3.org/TR/xml-infoset/ and the Wikipedia entry, but frankly I'm still not sure what the difference is.

This quote from the Wikipedia entry:

An XML document has an information set if it is well-formed and satisfies the namespace constraints. There is no requirement for an XML document to be valid in order to have an information set.

does not seem to make sense. How can a non-valid document have any semantics, and thus how can it have an 'information' set?

What is this 'infoset' that XML which is "well-formed and satisfies the namespace constraints" has? And in what way is it useful in itself? In other words, why is it, semantically speaking, necessary to define the XML Infoset? Is there any information that cannot be represented in XML? If so, I can see the limiting set of the XML Infoset, but if not, surely the XML Infoset is as meaningless as the term 'information'?

Thank you for the interesting answers. I still cannot grasp why the XML Infoset has any purpose as opposed to the term infoset, but you guys have given me a direct answer to the question.


Solution

  • A useful way of thinking of the distinction between XML text and the XML infoset is to consider the Fast Infoset. This is a binary representation of the XML infoset.

    So you have an abstract "infoset", which is a conceptual model representing XML data (nodes, elements, attributes, etc.). This can be physically represented as a text XML document or as a Fast Infoset stream. Both represent the same data, but in radically different ways.
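
The same point can be seen without Fast Infoset at all. Below is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` as a rough stand-in for the infoset (it is only an approximation: by default it drops comments and processing instructions, for example). The names `TEXT_A`, `TEXT_B`, and `summarize` are made up for this illustration. Two documents that differ as text parse to the same elements and attributes, and no DTD or schema is involved, so being well-formed is enough.

```python
import xml.etree.ElementTree as ET

# Two different character sequences: quoting style, attribute order,
# whitespace inside the start tag, and empty-element syntax all differ.
TEXT_A = '<book id="42" lang="en"><title>XML</title><note/></book>'
TEXT_B = "<book  lang='en'\n      id='42'><title>XML</title><note></note></book>"


def summarize(element):
    """Reduce an element to a comparable structure (hypothetical helper)."""
    return (
        element.tag,
        sorted(element.attrib.items()),  # attribute order carries no information
        element.text or "",              # treat missing text and empty text alike
        [summarize(child) for child in element],
    )


tree_a = summarize(ET.fromstring(TEXT_A))
tree_b = summarize(ET.fromstring(TEXT_B))

print(TEXT_A == TEXT_B)  # are the two documents identical as text?
print(tree_a == tree_b)  # do they carry the same parsed information?
```

The details that differ between the two strings (quote characters, attribute order, `<note/>` versus `<note></note>`) are exactly the kind of thing the infoset leaves out, which is why a binary encoding can carry the same information.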