Since I got the hang of separating structure, presentation and behaviour in web pages, I have been interested in the semantics of HTML elements. It may be because of my academic background (I studied literature at university), but I just love marking up my web pages. Even when writing blog posts on my iPod Touch, I don’t mind squeezing in < and > signs and constantly overruling auto-correction of HTML tags. This may be the reason for writing this amount of text about an obscure, rarely used HTML element like cite.
The cite element in HTML 4
The HTML 4 specs are rather concise about the semantics of most elements. When discussing the semantically correct element to use, people tend to look at the HTML 5 draft, which I think is generally a good idea. However, in some cases the differences between HTML 4 and HTML 5 cause confusion.
This is definitely the case for the cite element. Some people think you should use the cite element to mark up quotations. This is clearly not the case, as the HTML 4 specs specify the q element for quotations.
The HTML 4 specs define the cite element as follows:
- CITE:
- Contains a citation or a reference to other sources.
I don’t think it was a wise choice to use the word citation, as it has several meanings with subtle differences that are not always easy to translate into, for instance, Dutch. As a result, it’s no wonder that some Dutch developers use cite for marking up quotations.
The HTML 4-specs have two examples to illustrate the use of the cite element.
As <CITE>Harry S. Truman</CITE> said, <Q lang="en-us">The buck stops here.</Q>
More information can be found in <CITE>[ISO-0000]</CITE>.
Both these examples suggest that the cite element should be used for marking up the source of a nearby quote. Opera Developer Community has a good description of cite:
The cite element is used to indicate where the nearby content comes from: when quoting a person, a book or other publication, or generally referring people to another source, that source should be wrapped in a cite element.
In my opinion, this use of the cite element is semantically relevant, although I would have added a for attribute (cf. label and input/select), to connect the cite element to the right q element. Like so:
As <cite for="quote1">Harry S. Truman</cite> said, <q lang="en-us" id="quote1">The buck stops here.</q>
Or something similar.
The cite element in HTML 5
The HTML 5 draft defines the cite element completely differently:
The cite element represents the title of a work (e.g. a book, a paper, an essay, a poem, a score, a song, a script, a film, a TV show, a game, a sculpture, a painting, a theatre production, a play, an opera, a musical, an exhibition, etc). This can be a work that is being quoted or referenced in detail (i.e. a citation), or it can just be a work that is mentioned in passing. A person’s name is not the title of a work [...] and the element must therefore not be used to mark up people’s names. [...]
Well, something definitely changed. Nonetheless, the document describing the differences between HTML 5 and HTML 4 doesn’t even mention the cite element.
Let’s have a look at the differences.
- In HTML 4, the cite element
  - indicates where nearby content comes from;
  - can mark up people.
- In HTML 5, the cite element
  - does not need to have a relation with nearby content;
  - cannot be used to mark up people.
In short, HTML 5 is not backward compatible with HTML 4 as far as the cite element is concerned.
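To make the incompatibility concrete, here is a small sketch based on the two definitions quoted above. The first paragraph is the Truman example from the HTML 4 specs; the second sentence is an invented illustration and does not come from either spec.

```html
<!-- Fine under HTML 4: cite names the source of the nearby quotation.
     Not allowed under the HTML 5 draft quoted above:
     a person's name is not the title of a work. -->
<p>As <cite>Harry S. Truman</cite> said, <q lang="en-us">The buck stops here.</q></p>

<!-- Fine under the HTML 5 draft: a title that is mentioned in passing.
     Under HTML 4, as read in this article, cite has no nearby
     content to point to here. -->
<p>Last summer I finally read <cite>Thus Spoke Zarathustra</cite>.</p>
```

Neither paragraph satisfies both definitions at once, which is exactly the problem.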
Why these differences?
In fact, the cite element in HTML 5 is a semantic wrapper for the titles of books etc. Frankly, I can understand this, as the default styling of cite is italic in most browsers (I checked Firefox 3 and Safari), which probably caused the restriction from any source to just titles.[1]
The browser makers are wrong here. If they had defined the default styles in accordance with the HTML 4 specs, they would have let cite look like plain text. I mean, why would Harry Truman’s name have to be italic in the example above? In offline texts it wouldn’t, right? I guess the initial decision to make cite italic by default was made because someone just assumed that it was intended for book titles etc.
In HTML 4, we should use the i element for titles
Although I understand that HTML 5 is following this practice, I would suggest using the i element for marking up titles. Anyway, that’s what I use in HTML 4:
<p><q>And if you want to give them something, give no more than alms, and let them beg for that!</q> (<cite>Friedrich Nietzsche, <i class="book">Thus spoke Zarathustra</i></cite>)</p>
Of course it’s recommended to add a semantically relevant class to the i element, book in this case. For marking up the title of a poem, I would use <i class="poem">, because in Dutch literary studies the title of an individual poem is not set in italics, but placed within single quotation marks. So, in my CSS I could use:
i.poem {
  font-style: normal;
}
i.poem:before {
  content: '‘';
}
i.poem:after {
  content: '’';
}
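As a quick illustration of how these rules would be applied, here is a minimal sketch of a page fragment that uses them. The poem title and the sentence around it are only placeholders picked for the example.

```html
<!-- The style element belongs in the head of the page;
     it is shown next to the markup here for brevity. -->
<style type="text/css">
  /* Poem titles: upright text wrapped in single quotation marks */
  i.poem { font-style: normal; }
  i.poem:before { content: '‘'; }
  i.poem:after { content: '’'; }
</style>

<p>We discussed <i class="poem">The Raven</i> in class today.</p>
```

In a browser that supports generated content, the title should show up upright and between single quotation marks, while an i with class book keeps its default italics.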
Advice
Where does this leave us, if we want to use the cite element in a forward compatible way? ‘Forward compatible’ as in: easy to convert from HTML 4.01 to HTML 5. I mean, by the time HTML 5 is ready I want to be able to upgrade my entire site by changing my templates and not by changing the markup of the individual posts.
I would advise simply avoiding any use of cite that is not in accordance with both the HTML 4 and the HTML 5 specs. So, do not use it (any more) for marking up people or book titles without a nearby quotation. This means you cannot use cite as a wrapper for making titles italic. Use the i tag with semantic class names for that.[2]
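One possible reading of this advice, sketched in markup: cite only wraps a title that is also the source of a nearby quotation, the author’s name stays outside cite, and a title without a quotation gets an i with a semantic class. The second sentence is an invented example, and this is an interpretation of the advice rather than a rule taken from either spec.

```html
<!-- cite wraps a title that is also the source of the nearby quotation,
     so it fits both the HTML 4 and the HTML 5 definition -->
<p><q>And if you want to give them something, give no more than alms,
and let them beg for that!</q>
(Friedrich Nietzsche, <cite>Thus Spoke Zarathustra</cite>)</p>

<!-- a title mentioned without a quotation: i with a semantic class -->
<p>I am rereading <i class="book">Thus Spoke Zarathustra</i>.</p>
```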
The best solution, however, would be for HTML 5 to define cite differently. If there is a need for an element to mark up titles, think of a new element, so there is backward compatibility with HTML 4.01.
[1] After publishing this article I searched the whatwg@whatwg.org archives. Apparently, the default italic styling of cite did have something to do with the changing semantics of the element.
[2] I am working on implementing this consistently in this website.

2 comments
Interesting article here. It poses one remaining question though: how *do* you mark up people’s names semantically in HTML5? This applies to related content, but also more generally to any name in a document.
Bodaniel,
If cite were the same in HTML 5 as in HTML 4.01, there would be no related content problem. But you’re right: in HTML 4.01 there’s no way of distinguishing between a person, a book or whatever. We’d need a more general way of marking up people’s names for that.
<cite><person>Bodaniel</person></cite> said: <q>...</q>
One trackback
[...] The Cite Element in HTML 4.01 and HTML5 [...]