
Difference between Name Attribute and Name Token? (HTML)


I was checking whether I could safely use spaces in a name attribute, and the overwhelming consensus (apart from one answer that the community said was incorrect) was that it is just fine, because the name attribute uses a CDATA token instead of a NAME token, or something along those lines. The problem is that I can't seem to find information on what the heck that actually means.

What is a name token? If the name attribute doesn't use one, what is the point? What exactly is a token, for that matter? I'm finding some information on the subject, but it all seems to be over my head.

Thanks!


Solution

  • Tokens can be roughly defined as sequences of characters that represent something and follow a specific pattern. You might find this overview of Programming Language Syntax a good starting point for basic definitions. The supported token types are described on the W3C site as follows:

    CDATA is a sequence of characters from the document character set and may include character entities. User agents should interpret attribute values as follows: Replace character entities with characters, Ignore line feeds, Replace each carriage return or tab with a single space. User agents may ignore leading and trailing white space in CDATA attribute values (e.g., " myval " may be interpreted as "myval"). Authors should not declare attribute values with leading or trailing white space.

    For some HTML 4 attributes with CDATA attribute values, the specification imposes further constraints on the set of legal values for the attribute that may not be expressed by the DTD.

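    Although the STYLE and SCRIPT elements use CDATA for their data model, for these elements, CDATA must be handled differently by user agents. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as terminating the end of the element's content. In valid documents, this would be the end tag for the element.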

    ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods ("."). IDREF and IDREFS are references to ID tokens defined by other attributes. IDREF is a single token and IDREFS is a space-separated list of tokens. NUMBER tokens must contain at least one digit ([0-9]).

    Just because the token is called NAME doesn't mean it was meant for the name attribute. It's just a coincidence of similar terms: the name attribute is a separate concept from the SGML NAME token. If you look at the Index of Attributes table, you can see the type of token each attribute is expected to use.
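
To make the difference concrete, here is a rough Python sketch of the two rules quoted above. It is only an illustration, not how any browser actually implements them: the function names `is_name_token` and `normalize_cdata` are made up for this example, `html.unescape` is used as an approximation of "replace character entities with characters", and the order of the normalization steps is a choice made for this sketch.

```python
import html
import re

# ID/NAME token rule as quoted above: a letter, followed by any number of
# letters, digits, hyphens, underscores, colons, and periods.
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.-]*")


def is_name_token(value: str) -> bool:
    """Return True if the whole value matches the quoted ID/NAME rule."""
    return NAME_TOKEN.fullmatch(value) is not None


def normalize_cdata(value: str) -> str:
    """Apply the quoted CDATA handling rules to a raw attribute value.

    Hypothetical helper: whitespace is handled first and entities last,
    which is an assumption of this sketch rather than something the
    quoted text spells out.
    """
    value = value.replace("\n", "")         # ignore line feeds
    value = re.sub(r"[\r\t]", " ", value)   # carriage return or tab -> one space
    value = value.strip(" ")                # user agents MAY trim outer spaces
    return html.unescape(value)             # replace character entities


raw = " first\tname\n"
value = normalize_cdata(raw)
print(repr(value))                  # 'first name'
print(is_name_token(value))         # False: a space is not allowed in a NAME token
print(is_name_token("first_name"))  # True
```

So a value such as "first name" is acceptable wherever an attribute is declared as CDATA, while the same value would be rejected by an attribute declared as ID or NAME.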