What is the difference between encoding and entity references in XML?
Encoding refers to the way a character is represented by a sequence of bytes. It happens at a pretty low level in the processing chain: you read in the bytes and use the encoding to convert to a stream of characters. ASCII, Latin-1, and UTF-8 are all examples of encodings.
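As a rough illustration of the byte-level step, here is a minimal Python sketch. The sample character and variable names are my own choices for demonstration, not anything from the question. It shows one character mapping to different byte sequences under different encodings, and what happens when bytes are decoded with the wrong one.

```python
# One character, two encodings, two different byte sequences.
text = "é"  # U+00E9, chosen only as a sample non-ASCII character

utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # one byte: 0xE9

# Decoding with the matching encoding recovers the original character.
round_trip = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes as Latin-1 yields different characters (mojibake),
# because each byte is interpreted as its own character.
mis_decoded = utf8_bytes.decode("latin-1")
```

None of this involves XML syntax yet; it is purely the bytes-to-characters conversion that has to happen before a parser can look for tags.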
Entity references are handled by the XML parser itself. A sequence of characters, starting with & and ending with ;, is used to represent a different sequence of characters (usually just one). This happens at a fairly high level, conceptually "after" the XML parser has determined where tags are. This is why &lt; turns into a plain old less-than sign, not the beginning of a tag.
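To make that concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. The sample document and variable names are made up for illustration. It suggests that the predefined entity references are expanded into text content and never reinterpreted as markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the text contains entity references
# for characters that would otherwise be read as markup.
doc = "<msg>1 &lt; 2 &amp;&amp; 3 &gt; 2</msg>"

root = ET.fromstring(doc)

# The parser has already replaced each reference with its character.
text = root.text

# The expanded "<" did not start a new element: msg has no children.
children = list(root)
```

Here text ends up holding the literal characters, and children is empty, which is consistent with the references being resolved only after the tag structure has been worked out.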