The article element

19th August 2024

It was a sunny day (in the Southern Hemisphere, possibly?) in October 2014 when HTML5 was officially released. I remember it like it was yesterday. The crowds, the bunting, the endless fireworks and firework-related maimings. And who could forget “Hypertext Chicken”: the spicy, chicken mayo-based sandwich filler, sweet with raisins, created by the world’s top chefs to mark the occasion.

Then, the following morning: the comedown. And with it, the realization that we now had a bunch of new HTML elements for everyone to use incorrectly. Or to not use at all.

First of these, alphabetically, was <article>, one of a new class of HTML elements designed for something called sectioning. And just like the cast members of Saved by the Bell: The New Class, this incoming cohort was greeted with a mixture of suspicion and disgust.

Sectioning is an awkward verb but means defining a section within a document. Astonishingly, this is an alien concept to most everyone who creates stuff for the web. Which is a thing composed predominantly of documents.

You see, UI (User Interface) Designers concern themselves with visual approximation and proximity, while developers are focused on function and state. Even front-end developers (for whom HTML is a pretty important responsibility) have a habit of scoffing at the DOC part in the declaration at the top of every web page they build.

<!DOCTYPE html>

“Oh, ignore that!” they intimate, coercively placing their arm across your shoulders. “Documents are from the old web. We make web applications now.”

We?

Well-formed HTML documents are critical for accessibility, since they communicate structural relationships that can otherwise only be seen. A soundly structured document makes the interface it represents easy to understand, and to traverse, using screen reader software. If using the term “document” makes you feel less like a software engineer and more like an accountant, fine, don’t utter it. The structural rules still apply, though.

But what does the <article> element offer, to this end? The answer is: not as much as one who cares about accessibility and who identifies as a responsible HTML author would hope.

Since the <article> is a type of sectioning element, I’ll cover what that means first. Then we can look, through our trembling fingers, at what the term/name article, specifically, might mean in the context of an HTML page.

First of all, what is a section? To a UI Designer, it might look like a box. To a developer, it is the code that renders that box. To a writer, a section is thematic; it encompasses a theme or topic. A subsection, accordingly, defines a constituent and more specific topic, belonging to that larger section.

It is this writerly understanding of sections (and subsections) that is pertinent to HTML document structure. It’s a bitter irony, then, that we are so bad at equipping writers with the tools to properly author documents and their structure for the web. The best developers can offer is a small white box, secreted inside an existing document, where writers can put a few words (and color them red, for some reason).

But I digress. The <article> element explicitly defines a thematic section, or subsection, within the document. As such, the contents between the opening <article> and closing </article> tags should be thematically related. It should make sense, visually, to render a perimeter border around the element, should that aesthetic treatment be the UI Designer’s preference. (It won’t be; they invariably color the element’s background light blue instead.)

Is that it, then? Can we talk about something else now? JavaScript Array reducers maybe? No we cannot.

The thing is, just wrapping content in a sectioning element like an <article> doesn’t do very much on its own. True, most screen readers will tell you when you enter an <article> and when you leave it again. But only the content of the <article> can tell you why you might want to pass through.

Whether you’re a screen reader user or otherwise, having to read a whole section of a document just to find out whether it’s something you might actually want to read is, idiomatically speaking, a bit of a tall order. That’s why sections of HTML documents, like the sections of document formats that preceded HTML, should each be introduced with a heading. There’s even a success criterion in the Web Content Accessibility Guidelines (WCAG) called Section Headings (2.4.10, Level AAA). Using headings for your sections? “GREAT SUCCESS” (Borat).

As a user of federated social media, you may be familiar with the concept of content warnings. A content warning (CW) tells you what some content (a “toot”, in the Mastodon vernacular) is about before you read it. This is helpful for labelling content you might be interested in (like “JavaScript Array reducers”) or content you may find deeply upsetting and wish to avoid (like “JavaScript Array reducers”).

I won’t go into detail about heading elements (<h1>, <h2>, and crew) here, since they deserve their own explainers. For now, think of HTML headings as content warnings for sections of HTML documents.
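
Here is a rough sketch of that idea in markup. The subject matter and heading text are invented for illustration, and the heading levels assume the page already has an <h1> somewhere above. The only point is that the <article>, and the subsection inside it, each open with a heading that says what you are about to get into.

```html
<article>
  <!-- The heading introduces the whole article, like a content warning -->
  <h2>JavaScript Array reducers</h2>
  <p>An overview of what reducers are for.</p>

  <section>
    <!-- A subsection: a more specific topic within the article's theme -->
    <h3>Summing a list of numbers</h3>
    <p>The example everybody reaches for first.</p>
  </section>
</article>
```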

So what kind of a section is an <article>, as opposed to, say, a generic <section>? This is where you say, “well, one for articles, surely.” To which I twiddle my moustache and retort, “articles of clothing, you mean?” To which you reply, “no, obviously not, you’re being deliberately obtuse.” But it’s too late, I’ve vanished. You open your hand to reveal a moustache, its plastic bristles tormenting your skin. It was a disguise!

When you’re in doubt about an element’s purpose, it’s always a good idea to consult the specification. Or, perhaps, I should say a specification, since there are various public drafts and recommendations by two groups: WHATWG and the W3C. For <article>, this text comes up a lot:

The article element represents a complete, or self-contained, composition in a document, page, application, or site and that is, in principle, independently distributable or reusable, e.g. in syndication.

The “in principle” part is doing a lot of heavy lifting here. Indeed, you would not want to literally go distributing <article> elements by themselves, because that’s just some HTML with important parts missing (such as a requisite document <head>).

Instead, think of an <article> as defining a “forum post, a magazine or newspaper article, [or] a blog entry” (to quote the specification’s first few examples) while inside another HTML document.
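
As a minimal sketch, assuming a hypothetical blog index page (the site name, titles, and text are all made up), several <article> elements might sit inside one complete HTML document like so:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Latest posts | Example Blog</title>
  </head>
  <body>
    <main>
      <h1>Latest posts</h1>

      <!-- Each post is a self-contained composition within the page -->
      <article>
        <h2>Hypertext Chicken: a retrospective</h2>
        <p>Ten years on, the raisins remain divisive.</p>
      </article>

      <article>
        <h2>Bunting: a buyer's guide</h2>
        <p>What to hang when the next specification ships.</p>
      </article>
    </main>
  </body>
</html>
```

The <head> belongs to the surrounding document, not to either post, which is why an <article> is only distributable "in principle".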

I would stipulate that, if the main purpose of the document is to present the contents of an <article>, then the <article> element itself is not needed. A blog post cannot be a part of itself; that’s not how anything works.
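
Under that stipulation (which is mine, not a rule from the specification), a page whose whole purpose is one blog post could look something like this sketch. The site name is hypothetical. The document's <h1> does the introducing, and no <article> wrapper is involved.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>The article element | Example Blog</title>
  </head>
  <body>
    <main>
      <!-- The post is the document, so the h1 introduces it directly -->
      <h1>The article element</h1>
      <p>It was a sunny day in October 2014.</p>
    </main>
  </body>
</html>
```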

It’s with the ensuing few examples that the specification loses a lot of people: “a user-submitted comment, an interactive widget or gadget, or any other independent item of content.”

Arguably, this defines <article> so loosely it removes all coherence. In fact, someone even published a book titled The Truth About HTML5, in which they bemoan such confusing definitions, writing, “in HTML4 a paragraph is a paragraph is a paragraph.”

All I will say is the following two things:

1. In terms of accessibility, writing an introductory heading for your <article> is a lot more important than using the <article> element itself. Since headings already implicitly mark sections of a document, so-called sectioning elements are often redundant or add very little.
2. The idea of independently distributable, user-submitted comments brings me out in a cold sweat. In my mind, a comment of any kind can only hope to make sense alongside the thing it’s commenting on. Although, user-submitted comments rarely make sense even then.
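
For what it's worth, the specification itself suggests nesting: a comment can be marked up as an <article> inside the <article> for the post it belongs to. A sketch of that arrangement follows, with invented names and text. Each comment gets its own heading and stays attached to the thing it is commenting on.

```html
<article>
  <h2>Hypertext Chicken: a retrospective</h2>
  <p>Ten years on, the raisins remain divisive.</p>

  <section>
    <h3>Comments</h3>

    <!-- A nested article: related, in principle, to the outer article -->
    <article>
      <h4>Comment from Sam</h4>
      <p>Raisins were a mistake.</p>
    </article>

    <article>
      <h4>Comment from Alex</h4>
      <p>Sam is wrong about the raisins.</p>
    </article>
  </section>
</article>
```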

Not everyone is a fan of my writing. But if you found this article at all entertaining or edifying, I do accept tips. I also have a clothing line.
