I found out that browsers (I only tested Chrome) break lines at certain characters inside words to prevent text overflow, even under the default behaviour (word-wrap: normal). I'm not asking about breakable whitespace, but about these specific Unicode characters:
So the questions...
Try it yourself:
<div style="width: 50px">
veryvery-veryvery-veryvery-veryvery
veryvery–veryvery–veryvery–veryvery
veryvery—veryvery—veryvery—veryvery
veryveryveryveryveryveryveryvery
veryvery−veryvery−veryvery−veryvery
veryvery+veryvery+veryvery+veryvery
long
</div>
Breaks in HTML/CSS text generally occur at "soft wrap opportunities", but the specific behaviour around which characters present such an opportunity is not standardised. Rather, the CSS specification defers to other text formatting specifications (e.g. language-specific guidelines).
However, a popular generic implementation is the Unicode Line Breaking Algorithm (UAX #14). The algorithm assigns each character a line-breaking class and then applies a set of rules to pairs of neighbouring characters to either allow, force, or inhibit a break point. It is therefore not possible to come up with a complete list of individual characters that create a break, because the context in which a character appears is also a relevant factor.
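To illustrate why context matters, here is a deliberately tiny Python sketch of the idea. It is not the real algorithm: the four classes and three rules are a hand-picked simplification, whitespace and every other class are ignored, and the names `line_break_class` and `break_opportunities` are made up for this example. A real implementation reads the classes from the Unicode data files and applies the full rule set.

```python
# Toy subset of Unicode line-break classes (hand-written, NOT the full data).
HY, BA, NU, AL = "HY", "BA", "NU", "AL"


def line_break_class(ch: str) -> str:
    """Map a character to one of four simplified line-break classes."""
    if ch == "-":                 # U+002D HYPHEN-MINUS
        return HY
    if ch in "\u2013\u00ad":      # U+2013 EN DASH, U+00AD SOFT HYPHEN
        return BA
    if ch.isdigit():
        return NU
    return AL                     # simplification: everything else is "alphabetic"


def break_opportunities(text: str) -> list[int]:
    """Return every index i where a break is allowed between text[i-1] and text[i]."""
    result = []
    for i in range(1, len(text)):
        before = line_break_class(text[i - 1])
        after = line_break_class(text[i])
        if before not in (HY, BA):
            continue  # in this toy model only hyphen-like characters open a break
        if after in (HY, BA):
            continue  # never break directly before another hyphen-like character
        if before == HY and after == NU:
            continue  # hyphen + digit is treated like a minus sign: keep together
        result.append(i)
    return result


print(break_opportunities("very-long"))   # break allowed after the hyphen
print(break_opportunities("page-10"))     # same hyphen, but no break before a digit
```

The same HYPHEN-MINUS yields a break opportunity in "very-long" but none in "page-10", which is why the question "which characters break?" has no answer in the form of a flat list.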