
What exactly does `XMLStreamReader.getEncoding()` do?


XMLStreamReader's getCharacterEncodingScheme() returns the encoding attribute of the XML declaration (the <?xml version="1.0" encoding="utf-8"?> line). But if getCharacterEncodingScheme() is enough, why is there also getEncoding()? What exactly does XMLStreamReader.getEncoding() do?

The javadocs say hardly anything.

javax\xml\stream\XMLStreamReader.java

  /**
   * Return input encoding if known or null if unknown.
   * @return the encoding of this instance or null
   */
  public String getEncoding();

What's the purpose of getEncoding(), how does it differ from getCharacterEncodingScheme(), and how is the return value of getEncoding() determined with respect to the input XML?


Solution

  • There are several ways to create an XMLStreamReader. One of them, XMLInputFactory.createXMLStreamReader(InputStream, String), takes the name of the encoding as a parameter rather than taking it from the encoding attribute in the input XML.

    So the purpose of getEncoding() is to return the actual encoding being used (if it is known), as distinct from the encoding that was declared in the XML declaration.

    Note: it is possible that the XMLStreamReader doesn't know which encoding was used when reading the XML. For example, when an XMLStreamReader is created from a Reader, the bytes have already been decoded into characters, and the charset the Reader used cannot be determined via the Reader API.
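
To see the distinction in practice, here is a minimal sketch that creates a reader two ways and prints both values. The class name EncodingDemo and the helpers fromStream and fromReader are made up for illustration; only the XMLInputFactory and XMLStreamReader calls come from the StAX API. The first input declares no encoding but is opened with an explicit one; the second declares UTF-8 but is opened from a Reader.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; the names here are not part of any library.
final class EncodingDemo {
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    // Returns { getCharacterEncodingScheme(), getEncoding() } for a reader
    // created from raw bytes plus an explicitly supplied encoding name.
    static String[] fromStream(byte[] xml, String encoding) {
        try {
            XMLStreamReader r =
                FACTORY.createXMLStreamReader(new ByteArrayInputStream(xml), encoding);
            try {
                return new String[] { r.getCharacterEncodingScheme(), r.getEncoding() };
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same pair of values, but for a reader created from already-decoded characters.
    static String[] fromReader(String xml) {
        try {
            XMLStreamReader r = FACTORY.createXMLStreamReader(new StringReader(xml));
            try {
                return new String[] { r.getCharacterEncodingScheme(), r.getEncoding() };
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No encoding attribute in the declaration; encoding passed as a parameter.
        byte[] latin1 = "<?xml version=\"1.0\"?><root>caf\u00e9</root>"
            .getBytes(StandardCharsets.ISO_8859_1);
        String[] a = fromStream(latin1, "ISO-8859-1");
        System.out.println("stream: declared=" + a[0] + ", actual=" + a[1]);

        // Encoding attribute present, but the input is a Reader (no bytes involved).
        String[] b = fromReader("<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>");
        System.out.println("reader: declared=" + b[0] + ", actual=" + b[1]);
    }
}
```

In the first case the declared value should be null, since nothing was declared. In the second it should be the declared UTF-8. What getEncoding() reports in each case depends on the StAX implementation, so treat the printed "actual" values as something to observe rather than rely on.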