
Treatment of superfluous closing tags depends on tag name


Unlike XHTML, HTML does not allow separate closing tags for void (empty-content) elements such as br and hr. The HTML validator reports the error

end tag for element "..." which is not open

in such cases.

The following HTML markup is therefore invalid (although, as an X(HT)ML fragment, it is valid and equivalent to one containing the self-closing tags <br/> and <hr/>, which are valid even in HTML):

Before break
<br></br>
Between break and horizontal rule
<hr></hr>
Below horizontal rule

Browsers try to render such invalid HTML as well as they can. Chrome and Firefox turn the <br></br> into two br elements, which produces an empty line, but they turn the <hr></hr> into a single hr element, which produces just one horizontal rule.

The DevTools show the resulting DOM:

"Before break"
<br>
<br>
"Between break and horizontal rule"
<hr>
"Below horizontal rule"

Perhaps this is merely a fun fact, but I wonder why these two cases are treated differently.


Solution

  • The HTML specification's parsing rules for the "in body" insertion mode treat the two cases differently:

    </br> is covered by the subsection An end tag whose tag name is "br":

    Parse error. Drop the attributes from the token, and act as described in the next entry; i.e. act as if this was a "br" start tag token with no attributes, rather than the end tag token that it actually is.

    </hr> is covered by the last subsection, Any other end tag:

    Run these steps: [some steps are omitted here]

    • Initialize node to be the current node (the bottommost node of the stack) [This is not the hr node created by the preceding <hr> tag, because that was immediately popped off the stack (see footnote 1). In the example above, node is the body element.]
    • If node is in the special category [to which body belongs], then this is a parse error; ignore the token, and return.

    This leaves the hr node in the document, without creating anything from the </hr> tag.

    The special rule for </br> was first introduced with this commit:

    Make </p> and </br> introduce corresponding empty elements for compatibility with IE

    (More precisely, that commit un-commented the rule, which was already there as a comment "XXX quirk".)


    1 See the subsection A start tag whose tag name is "hr":

    Insert an HTML element for the token. Immediately pop the current node off the stack of open elements.
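
To make the difference between the two rules concrete, here is a minimal Python sketch of a toy tree builder that follows only the steps quoted above. It is not a real HTML parser: the token format, the `Node` class, the `build_body` function and the tiny `SPECIAL` set are all assumptions made for illustration, and the specification's actual "special" category is much longer.

```python
from dataclasses import dataclass, field

# Assumption: only the names needed for this example are listed here.
VOID_LIKE = {"br", "hr"}                     # start tags that are popped at once
SPECIAL = {"body", "br", "hr", "p", "div"}   # tiny stand-in for the "special" category


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def build_body(tokens):
    """Toy model of the quoted "in body" rules.

    tokens is a list of ("start", name), ("end", name) or ("text", data).
    """
    body = Node("body")
    stack = [body]  # stack of open elements; the last item is the current node
    for kind, value in tokens:
        current = stack[-1]
        if kind == "text":
            current.children.append(value)
        elif kind == "start":
            node = Node(value)
            current.children.append(node)
            stack.append(node)
            if value in VOID_LIKE:
                stack.pop()  # "Immediately pop the current node off the stack"
        elif kind == "end" and value == "br":
            # Parse error: act as if this was a <br> start tag.
            current.children.append(Node("br"))
        elif kind == "end":
            # "Any other end tag": walk up the stack from the current node.
            for i in range(len(stack) - 1, -1, -1):
                node = stack[i]
                if node.name == value:
                    del stack[i:]  # pop up to and including the matching node
                    break
                if node.name in SPECIAL:
                    break  # parse error: ignore the token


    return body


tokens = [
    ("text", "Before break"),
    ("start", "br"), ("end", "br"),
    ("text", "Between break and horizontal rule"),
    ("start", "hr"), ("end", "hr"),
    ("text", "Below horizontal rule"),
]
body = build_body(tokens)
names = [child.name for child in body.children if isinstance(child, Node)]
print(names)  # ['br', 'br', 'hr']
```

The `</br>` token takes the dedicated branch and adds a second br element, while the `</hr>` token falls into the generic branch, finds body as the current node, and is dropped. That matches the DOM shown by the DevTools in the question.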