Antidatalossconfigurationism

I’m in the middle of deep thought about what a text component looks like. There’s the easy stuff, like having the ability to choose the appropriate semantic element. There’s the less obvious stuff, like adding a flag for screen-reader-only content. But then there’s something that we tend to overlook: how to manage overflow content.

Data loss

It’s not uncommon for someone to clip the content, often with an ellipsis. We even have more options coming for controlling when those ellipses appear. The problem with approaches that include overflow: hidden; is that the clipped content is no longer readily accessible. This goes against a core principle of mine: if you place it on the page, that means it is important. No condition should obstruct that element in any way; otherwise, you introduce data loss. Adding an ellipsis is not an inclusive solution for large blocks of text. Instead, we’ll need to ensure the text remains readable while balancing the expectations of the layout.
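To make the trade-off concrete, here is a minimal sketch contrasting the common truncation pattern with a wrapping alternative. The class names .title--clipped and .title--readable are hypothetical and only exist for illustration.

```css
/* Common truncation pattern: the text stays on one line and
   anything that does not fit is replaced by an ellipsis. */
.title--clipped {
    overflow: hidden;
    text-overflow: ellipsis;
    white-space: nowrap;
}

/* Alternative: let the text wrap and never clip it. Overlong
   strings (such as URLs) may break only when they would overflow. */
.title--readable {
    overflow: visible;
    overflow-wrap: break-word;
}
```

The second rule costs some vertical space, but every character placed on the page stays readable.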

Properties

There are several CSS properties that directly relate to the way text flows on a webpage. We’ll go over each of them below, but first a few definitions:

Whitespace refers to characters that provide space between other characters.
Segment breaks are characters that cause text to break onto new lines (e.g., line feed).
Hard breaks (also known as forced line breaks) will always cause the text to break onto a new line, even if it is not necessary to do so.
Soft breaks will cause text to break onto a new line only if necessary. The places where these can occur are called soft wrap opportunities.
Line box describes the inline formatted box, typically used to contain text. The size of this box depends on the content and the available size of its ancestors.
CJK is short for Chinese/Japanese/Korean, used to describe text that has a different ruleset for when strings are allowed to break.

`white-space`

This property has been around for a while, but it has recently been turned into a shorthand for separate properties. Overall, this property and its related new properties handle how whitespace characters affect the surrounding text.

First, the white-space-collapse property has the following possible values:

collapse says that all whitespace should be collapsed. How whitespace is collapsed follows several algorithmic steps.
preserve says whitespace and segment break characters are preserved. This is commonly seen as the default style of the <pre> HTML element.
preserve-breaks says to preserve only segment breaks; other whitespace characters are collapsed.
preserve-spaces does the opposite of preserve-breaks by allowing segment breaks to be collapsed while preserving whitespace as written in the source.
break-spaces is the most unusual of the values, with a more specific set of rules for how whitespace is preserved.

Remember that white-space-collapse is only one part of the white-space property. The other part is text-wrap-mode, which only has two values: wrap and nowrap. As you might expect, you’ll primarily want text-wrap-mode: wrap, and this is the default.
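Here is a minimal sketch of the two longhands used on their own. The class names .chat-message and .tag are hypothetical, and the longhands are only understood by fairly recent browsers, so treat this as an illustration rather than production CSS.

```css
/* Hypothetical chat bubble: keep the line breaks the author typed,
   but still collapse runs of spaces and wrap as usual. */
.chat-message {
    white-space-collapse: preserve-breaks;
}

/* Hypothetical tag or pill: never wrap onto a second line.
   Long text will overflow instead of wrapping. */
.tag {
    text-wrap-mode: nowrap;
}
```

The second rule is a reminder of why wrap is the better default: with nowrap, the only remaining options for long text are overflow or clipping.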

When considering the keyword options provided by the original white-space property, it’s not immediately clear how they align to these values:

normal is white-space-collapse: collapse; text-wrap-mode: wrap.
pre is white-space-collapse: preserve; text-wrap-mode: nowrap.
pre-line is white-space-collapse: preserve-breaks; text-wrap-mode: wrap.
pre-wrap is white-space-collapse: preserve; text-wrap-mode: wrap.
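To see the mapping in practice, here is a sketch of two rule sets that should behave the same way in browsers that support the longhands. The class names are hypothetical.

```css
/* Original keyword on the white-space shorthand. */
.verse {
    white-space: pre-line;
}

/* The same behavior spelled out with the two longhands:
   keep segment breaks, collapse other whitespace, allow wrapping. */
.verse-longhand {
    white-space-collapse: preserve-breaks;
    text-wrap-mode: wrap;
}
```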

`text-wrap`

To make things more confusing, text-wrap is also a shorthand. We already saw one of its properties, text-wrap-mode, which determines whether the text is allowed to wrap. The newer part is text-wrap-style, which adds several of the fancier algorithmic ways that text can wrap based on the number of characters. Here are the values for this property:

auto wraps the text in the most performant way and ignores the number of characters. This is the default.
balance attempts to wrap in a way that keeps the number of characters roughly equal among the lines of text. This is best used for headlines.
pretty is similar to balance but favors layout over speed. This is best used for paragraphs of text.
stable is meant for areas where the user is editing content: lines before the one being edited stay put instead of the whole block re-wrapping.
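As a quick illustration of these values, the following sketch applies each one through the text-wrap-style longhand. The choice of elements is my own assumption for the example; browsers that do not recognize a value simply ignore the declaration and fall back to auto.

```css
/* Short headline text: even out the line lengths. */
h1 {
    text-wrap-style: balance;
}

/* Body copy: spend more effort on a nicer layout. */
p {
    text-wrap-style: pretty;
}

/* Editable text: avoid re-wrapping earlier lines while typing. */
textarea {
    text-wrap-style: stable;
}
```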

Harry Roberts posted some benchmarks on using the newer text-wrap values. The setup was a page with 10,000 <p> elements which, by default, have orphans. Here’s what he found:

Baseline: 1,186ms
text-wrap: balance: 1,224ms (38ms/3.2% longer)
text-wrap: pretty: 1,310ms (124ms/10.5% longer)

He said that the impact of balance was almost zero, while pretty is more noticeable. The reality is that you most likely won’t have 10,000 paragraphs on your page.

The properties so far only configure how whitespace characters behave in a block of text. None of them affect the strings of characters that typically behave as words. The following properties control when those strings may break, to avoid poor layout artifacts.

`overflow-wrap`

The overflow-wrap property tells the browser when it may insert line breaks in otherwise unbreakable strings of text to prevent overflow. The older word-wrap property is a legacy alias for it. Originally, it allowed only two values: normal, the default, which keeps strings intact, and break-word, which breaks a word when it would otherwise overflow. Whether a word breaks depends on the width of the container and where soft breaks can exist. This is usually very helpful where a long URL needs to be broken to maintain a reasonable text layout.

The newer value is anywhere, which is typically less desirable because the soft wrap opportunities it introduces are also considered when calculating how small the box can be (its min-content size). This allows the box to shrink much smaller instead of staying wide enough for its longest word, as is normally expected.
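The difference between the two values mostly shows up when a box is sized by its content. Here is a small sketch, assuming a hypothetical .link-list container that shrinks to its min-content width:

```css
/* Hypothetical container sized to its min-content width. */
.link-list {
    width: min-content;
}

/* break-word: the box stays as wide as its longest unbreakable
   string, and breaks are only used if that string still overflows. */
.link-list--break-word {
    overflow-wrap: break-word;
}

/* anywhere: the extra break opportunities count toward min-content,
   so the box can collapse to roughly one character wide. */
.link-list--anywhere {
    overflow-wrap: anywhere;
}
```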

`word-break`

Continuing with the theme of confusingly similar CSS properties, word-break determines what happens when a string would overflow its container. To make things worse, one of the values here is an override to overflow-wrap. We’ll start with that one:

break-word is a deprecated value that behaves like overflow-wrap: anywhere and takes priority over any value provided to overflow-wrap.
normal says to use normal line break rules, which may also break between CJK characters.
break-all will cause a break to occur at the place where overflow would occur.
keep-all is the same as normal, except CJK characters will also not break.
auto-phrase is the same as normal but analyzes the language to avoid breaking natural phrases.
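A short sketch of the two values you are most likely to reach for. The class names are hypothetical, and where each one is appropriate is my own assumption:

```css
/* Hypothetical narrow table cell holding IDs or hashes:
   allow a break between any two characters to avoid overflow. */
.cell--dense {
    word-break: break-all;
}

/* Hypothetical label containing CJK text: do not break between
   CJK characters; break only at spaces and similar opportunities. */
.label--keep {
    word-break: keep-all;
}
```

Note that keep-all can reintroduce overflow for long strings, so it pairs best with a container that has room to grow.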

A question you might have: what’s the difference between all of these properties? They all look like they handle the same sort of consideration for breaking strings of text. The CSSWG has an excellent breakdown of the differences:

The line-break property allows choosing various levels of “strictness” for line breaking restrictions.

The word-break property controls what types of letters are glommed together to form unbreakable “words”, causing CJK characters to behave like non-CJK text or vice versa.

The hyphens property controls whether automatic hyphenation is allowed to break words in scripts that hyphenate.

The overflow-wrap property allows the user-agent to take a break anywhere in otherwise-unbreakable strings that would otherwise overflow.

On top of the other properties that we didn’t cover, there’s a lot more detail within that page about how line breaks are expected to work. I’ve made a little playground to see how the properties would affect text samples combined from MDN examples:

Recommendations

Based on all of this, here’s how I’d try making an API for a text component that avoids data loss. Here are my considerations:

Create a wrap flag with the following values:
- (default) if the tag name used expects a certain text-wrap-style value, use that style. For example, balance for shorter text and pretty for longer text.
- true is the same as text-wrap-mode: wrap.
- false is the same as text-wrap-mode: nowrap.
Create a break flag with the following values:
- (default) sets overflow-wrap: break-word on text. This allows a long URL to break nicely within the text. Josh Comeau includes this in his CSS reset.
- true is the same as word-break: break-all.
- false is the same as word-break: keep-all.

Here’s what the CSS might look like:

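.text {
    /* Never clip text; the !important is deliberate. */
    overflow: visible !important;
    /* Break overlong strings (such as URLs) only when they would overflow. */
    overflow-wrap: break-word;

    /* Elements that expect short content: even out line lengths. */
    &:where(h1, h2, h3, h4, h5, h6, dt, legend, label, button, th) {
        text-wrap: balance;
    }

    /* Elements that expect longer content: favor a nicer layout. */
    &:where(p, li, figcaption, dd, summary, caption) {
        text-wrap: pretty;
    }

    /* Editable content: keep earlier lines from re-wrapping while typing. */
    &:where(textarea, [contenteditable]) {
        text-wrap: stable;
    }

    /* pre-wrap is a white-space keyword, not a text-wrap value. */
    &:where([data-wrap="true"]):where(pre, kbd) {
        white-space: pre-wrap;
    }

    &:where([data-wrap="false"]) {
        text-wrap: nowrap;
    }

    &:where([data-break="true"]) {
        word-break: break-all;
    }

    &:where([data-break="false"]) {
        word-break: keep-all;
    }
}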

I’m setting the text-wrap shorthand with a single keyword rather than the individual longhand properties because it’s easier to update this one value than to unset and reapply it in some cases. This allows us to leverage the default of text-wrap-mode: wrap for most elements.

The elements I’ve chosen to receive text-wrap-style are based on their expected content. Elements that expect shorter amounts of content receive balance while elements that expect longer amounts use pretty. Note that these are targeting elements that would normally have content as their direct children. While you could put pretty on sectioning content for it to cascade, it’s more performant to be more specific.

If you were paying attention, you’ll notice that there are a lot of property values we went over which are missing from the API. The reason is that they’re probably not going to be useful in day-to-day web development. This isn’t meant to be a CSS reset; it is an opinionated setup for what I believe would be a text component that allows for some customizability while keeping the options limited.

This includes breaking my own rule against using !important, in this case to call out that hiding overflow on text is a bad practice. Makes me want to set z-index: auto !important on all my elements too.