Lim Chee Aun

Whitespace and generated content

CSS is fun and I'm still learning it because in my opinion, it's a never-ending process. Like web pages, CSS is a growing entity. There are always more things to learn and understand. This time, I focus on CSS whitespace and generated content.

I first got interested in CSS whitespace handling about two years ago, when I saw these lines of code in Ian Hickson's site stylesheet:

pre.irc{
white-space: pre; /* CSS2 */
white-space: -moz-pre-wrap; /* Mozilla */
white-space: -hp-pre-wrap; /* HP printers */
white-space: -o-pre-wrap; /* Opera 7 */
white-space: -pre-wrap; /* Opera 4-6 */
white-space: pre-wrap; /* CSS 2.1 */
white-space: pre-line; /* CSS 3 (and 2.1 as well, actually) */
word-wrap: break-word; /* IE */
}

This looked confusing, so I asked Ian about it. He explained that in CSS2 there is no way to say that spaces and newlines should be preserved but that the text may still wrap when it reaches the end of the containing block. The closest is white-space: pre, but that doesn't wrap. Until CSS2.1 reaches Candidate Recommendation, user agents are not allowed to implement the new values as such, so they have implemented proprietary extensions instead. Ian lists all these possibilities, and because of CSS's forward-compatible parsing rules, a user agent ignores the declarations it doesn't understand and ends up using the last one it supports.

As for the last property, word-wrap: break-word is a proprietary Internet Explorer extension. It is not part of any standard and will not be described here.

What is the difference between pre, pre-wrap and pre-line? How about normal and nowrap? The more I read about the white-space property, the more confused I got about the meanings of the values and their processing model. Again, Ian helped me simplify things with a quick, basic table that shows the naming convention and the relationships among the values:

Relationships for values of the white-space property
name       spaces     wrapping   newlines
normal     collapse   wrap       ignore
pre-line   collapse   wrap       preserve
nowrap     collapse   don't      ignore
(none)     collapse   don't      preserve
(none)     preserve   wrap       ignore
pre-wrap   preserve   wrap       preserve
(none)     preserve   don't      ignore
pre        preserve   don't      preserve

To make things clear, pre-wrap acts like pre but wraps if necessary, while pre-line acts like normal but preserves newlines. I'm sure pre-wrap would be very useful for displaying long lines of computer code that might otherwise overlap other elements or run off the screen. Some combinations in the table don't have names yet, because repeated attempts to come up with good names have failed. In CSS3, however, each facet of these values can be controlled individually, as documented in the CSS3 Text Module under line breaking and text wrapping.
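As a minimal sketch of those two values in use (the class names listing and address are hypothetical, not taken from any stylesheet mentioned here):

```css
/* Long lines of code: keep indentation and line breaks, but wrap at the
   edge of the containing block instead of overflowing. */
pre.listing {
  white-space: pre;      /* fallback for user agents without pre-wrap */
  white-space: pre-wrap; /* used only where it is recognised */
}

/* A postal address typed one part per line: runs of spaces collapse as
   usual, but each newline in the source still starts a new line. */
p.address {
  white-space: pre-line;
}
```

The first rule reuses the fallback idea from Ian's stylesheet: a user agent that doesn't know pre-wrap simply keeps pre.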

Okay, how about browser support? Most modern browsers now correctly support pre, normal and nowrap. Firefox supports -moz-pre-wrap but not pre-wrap or pre-line yet, as reported in Bug 261081 and Bug 230555. Opera 8 supports pre-wrap as well as its earlier extensions, -pre-wrap and -o-pre-wrap, but not pre-line. I guess pre-line is much harder to implement?

Now, let's take this one step further. I started to fiddle with CSS content generation, specifically the content property with the :before and :after pseudo-elements. Here are some of my experiments, starting with some basic HTML:

<div title="some
title
text">text inside container</div>

Note that there are two line breaks, typed on purpose, in the value of the title attribute. This is accompanied by a little CSS:

div[title]:after{
content: attr(title);
}

So, should the line breaks in the HTML source show up in the content generated via CSS? No. According to the HTML 4.01 specification, line breaks in the source count as white space rather than as rendered line breaks, and user agents are told to ignore line feeds inside attribute values such as this one.

Second experiment. The HTML:

<p>paragraph text</p>

This time, I include line breaks in the CSS source instead:

p:after{
content: "generated text
after the
paragraph text";
}

With these two line breaks, nothing is generated at all, because the specification states that a string cannot directly contain a newline. The declaration only becomes valid if the code is changed to:

p:after{
content: "generated text\
after the\
paragraph text";
}

Of course, this is not the intended effect, because a newline escaped with a backslash is dropped from the string and nothing is rendered for it. I read the specification again and found the way to include line breaks in strings:

To include a newline in a string, use an escape representing the line feed character in Unicode (U+000A), such as "\A" or "\00000a". This character represents the generic notion of "newline" in CSS.
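Here is a small sketch of the two spellings from that passage (the class names are made up for illustration). It only concerns how the string is parsed, not how the result is rendered. One whitespace character right after a hexadecimal escape is swallowed as the end of the escape, which is why the space after \A is not part of the string.

```css
/* Short form: the space after \A ends the escape and is not part of the
   string. Without it, "\Aafter" would be read as the escape \Aaf
   followed by "ter", because a and f are hexadecimal digits too. */
p.short:after {
  content: "generated text\A after the paragraph";
}

/* Six-digit form: the escape is already at its maximum length, so the
   next character can follow immediately, even if it is a hex digit. */
p.long:after {
  content: "generated text\00000aafter the paragraph";
}
```

Both strings should come out as "generated text", a line feed, then "after the paragraph".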

Another try, with escaped line feed characters:

p:after{
content: "generated text\A after the\A paragraph text";
}

Opera 8 renders the line feed characters, but Firefox does not. At first, I thought Firefox couldn't handle this character yet, but later I found this in the specification:

Authors may include newlines in the generated content by writing the "\A" escape sequence in one of the strings after the 'content' property. This inserted line break is still subject to the 'white-space' property.

with this example:

h1:before {
display: block;
text-align: center;
white-space: pre;
content: "chapter\A hoofdstuk\A chapitre"
}

From my understanding, the pre value preserves spaces and newlines, and doesn't wrap. Does this mean that it preserves the escaped line feed character the same way as newlines?

Yet another try:

p:after{
white-space: pre;
content: "generated text\A after the\A paragraph text";
}

It works in Firefox. I also tried another possibility:

p{
white-space: pre;
}

p:after{
content: "generated text\A after the\A paragraph text";
}

It works too, just as mentioned in the specification:

The :before and :after pseudo-elements inherit any inheritable properties from the element in the document tree to which they are attached.

Initially, before white-space: pre is applied, Firefox ignores the escaped line feed character because the :after pseudo-element inherits white-space: normal from the p element. Opera 8, on the other hand, renders the escaped line feed character even without white-space: pre, which suggests that it is actually applying white-space: pre-line to the :after pseudo-element. So does Opera 8 support pre-line after all, but only for the generated content of pseudo-elements? Is this a wrong implementation?

Maybe. Maybe not, following an example from section 16.6 on whitespace:

The following examples show what whitespace behavior is expected from the PRE and P elements, the "nowrap" attribute in HTML, and in generated content.

pre { white-space: pre }
p { white-space: normal }
td[nowrap] { white-space: nowrap }
:before,:after { white-space: pre-line }

Interesting. So the pre-line behaviour is expected in generated content, rather than being inherited? If so, Opera 8 is correct. Only partially correct, though, because it still renders the escaped newline even under white-space: normal:

p:after{
white-space: normal;
content: "generated text\A after the\A paragraph text";
}
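Since the two browsers disagree on the default, one defensive option is to state the value on the pseudo-element explicitly rather than rely on inheritance or on a user agent default. The following is only a sketch, not a tested recipe. It assumes the same fallback behaviour as in Ian's stylesheet at the top: a user agent that doesn't recognise pre-line drops that declaration and keeps pre.

```css
p:after {
  white-space: pre;      /* understood by both browsers tested here */
  white-space: pre-line; /* replaces the line above only where it is recognised */
  content: "generated text\A after the\A paragraph text";
}
```

Either way, the line breaks no longer depend on what the p element happens to pass down.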

To make things even more exciting, I take the very first HTML example above again and apply these lines of CSS:

div[title]:after{
white-space: pre;
content: attr(title);
}

Opera 8 ignores the line breaks in the HTML source. Firefox renders them! Whoa. If I'm not mistaken, the returned string should not be parsed by the CSS processor. Yet another wrong implementation?

My experiments were done with only two browsers, Mozilla Firefox 1.0+ and Opera 8, on Windows XP. I suppose anything rendered in Firefox should look the same in any other Gecko-powered browser, such as Mozilla Suite and Netscape. Obviously, every current version of Internet Explorer (hopefully not 7 and above) is useless here and fails all the test cases. I am curious, though, how my experiments fare in Safari and other standards-compliant browsers.

I'm not sure whether this has been discussed elsewhere, and I may have missed some points, so please correct me if I'm wrong. Overall, CSS is fun, right?

To make things more visually stimulating, I've prepared a testcase page that includes the code above.