Semantic HTML for Usable Web Sites
Although the title of this blog post might seem a tad ironic in that there is not much semantics to HTML or XHTML, I do feel like writing a little piece about how to correctly use (X)HTML.
Hopefully, it might help you understand that even valid (X)HTML can be bad (X)HTML, and that — and how — semantic HTML may help improve your site’s usability, accessibility and search engine ranking. Whether you’re developing for a commercial site or not, these factors should be of interest. They certainly are to your end-users.
For this blog post, HTML means any variation of HTML or XHTML.
The Problem of Invalid and Bad HTML
Every so often, I come across web sites that look pretty on the outside but make my Firefox HTML validator add-on go nuts. This is a good indication that something needs fixing. The reason is simple: HTML code that does not validate against the DTD used (be it HTML 2.0 through 4.01, XHTML 1.0 Transitional or XHTML 1.0 Strict) is:
- Not “real” (X)HTML (browsers will render a page without a DOCTYPE declaration, but a validator needs one to know which DTD to check the page against).
- Likely to break your design in various browsers you didn’t think of or couldn’t test against.
- More likely to raise accessibility concerns.
The first point is probably the least important one, as most web browsers will silently accept invalid HTML code (until it has to be re-designed due to a new browser release or changes in the browser market) and screen readers will remain backwards compatible for some time to come. Search engine ranking does not seem to be much affected by invalid code either.
However, invalid HTML is likely to cause you more pain through cross-browser compatibility issues than valid HTML would. Just as you don’t want to be the one stuck with maintaining badly written Java or PHP code, you don’t want to be the one re-doing your work due to poorly written HTML. In other words, you should code like a girl. Besides, for your web site to be accessible, the source code needs to be valid HTML. Therefore, you should always make sure your HTML validates.
You don’t want to stop at valid HTML, though. Valid (X)HTML is not guaranteed to be “good” HTML.
How To Create Valid and Semantic HTML
Just as I might write a syntactically correct English sentence (capitalized first letter, period at the end) that makes no sense at all, so too may I write valid but “meaningless” HTML.
Syntactic correctness is not the same as semantic correctness.
Table-Less Layout
By now (2007) you’ve probably heard that table-less layout is the way to go. The effect of multi-column layouts should be achieved through styling (using CSS), not HTML tables.
The reason is that a web page is not a set of tabular data (at least most aren’t) and hence should not be marked as such.
Screen readers, search engines, etc. should be able to make basic “inferences” about the contents of the page based on the HTML elements used … therefore, you should make an effort to let them interpret the data as intended.
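As a rough illustration of the idea, here is a minimal sketch of a two-column page laid out with CSS floats rather than a table. The ids (content, sidebar, footer), the widths and the link targets are made up for this example; the point is only that the markup describes what each block is, while the stylesheet decides where it goes.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Two-column layout without tables</title>
  <style type="text/css">
    /* Hypothetical ids: the columns are positioned by CSS, not by table cells */
    #content { float: left;  width: 70%; }
    #sidebar { float: right; width: 28%; }
    #footer  { clear: both; } /* start below both floated columns */
  </style>
</head>
<body>
  <div id="content">
    <h1>Article title</h1>
    <p>Main text of the page.</p>
  </div>
  <div id="sidebar">
    <h2>Related links</h2>
    <ul>
      <li><a href="/archive">Archive</a></li>
      <li><a href="/about">About</a></li>
    </ul>
  </div>
  <div id="footer">
    <p>Footer text.</p>
  </div>
</body>
</html>
```

With the stylesheet switched off, the page still reads in a sensible order (content, then related links, then footer), which is roughly what a screen reader or a search engine gets to work with.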
Lists
Apart from table-less layout, you might’ve heard that anything that is logically a list of items should be marked up as a list; don’t rely on CSS and meaningless classes and IDs alone.
Lists are not only easy to read and scan, which is good for usability; they also “tell” clients (e.g. web browsers, search engines, screen readers, etc.) that “this is a list of information items”.
Even seemingly self-descriptive CSS class names bear no meaning except for grouping items into classes. A software application couldn’t care less whether a paragraph belongs to a class by the human-friendly name of “list”, or the not-so-human-friendly name “x-32fva98”.
When it comes to lists:
- Use unordered / bulleted lists (<ul>) where the list items have no numerical order of importance.
- Use ordered lists (<ol>) where the numerical order of the list items is of importance.
- Use definition lists (<dl>) where you want to convey information about a term, its definition and the relationship between the two (which is the term, and which is the definition?); for instance glossaries.
Don’t use list items for meaningless information, e.g. styling or making your design look better. You’ve got CSS for that … feel free to use dummy DIVs in order to make the web site’s design comply with the design requirements, if needed (which should be close to never, since you can style both the list as well as the individual list items using CSS).
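The three kinds of lists could look something like the fragment below, meant to sit inside a page’s body. The list contents are invented for illustration only.

```html
<!-- Unordered list: the order of the items carries no meaning -->
<ul>
  <li>Usability</li>
  <li>Accessibility</li>
  <li>Search engine ranking</li>
</ul>

<!-- Ordered list: the sequence matters, so the numbering is part of the content -->
<ol>
  <li>Write the markup.</li>
  <li>Run it through a validator.</li>
  <li>Fix the reported errors.</li>
</ol>

<!-- Definition list: each dt is a term, each dd is its definition -->
<dl>
  <dt>DTD</dt>
  <dd>Document Type Definition; the formal grammar a page is validated against.</dd>
  <dt>CSS</dt>
  <dd>Cascading Style Sheets; the language used to style the markup.</dd>
</dl>
```

Bullets, numbers and indentation can all be changed or removed through CSS without touching this markup.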
Headings
Whenever you use headings to indicate logical hierarchies in text, don’t rely on CSS and different font sizes.
If you want your site to be as accessible as possible: use HTML’s pre-defined headings to indicate different levels of headings!
Proper use and nesting of headings (only one H1, no H2 without an H1, no H3 without an H2, etc.) will:
- Enable clients such as screen readers to insert proper pauses.
- Enable “alternative” browsers to display the information in a usable way.
- Help optimize your site for search engine inclusion.
Any kind of web browser, as well as search engines, can infer that text contained within H1 tags is of high importance and that text between H2 tags is important, yet not as important as the H1 heading, etc. Any company should be interested in ranking as high as possible on popular search engines (for relevant search terms of course).
Do not throw away this opportunity by indicating levels of headings through visual clues alone (you can of course always use CSS to style your headings, though).
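To make the difference concrete, the fragment below contrasts a purely visual “heading” with a properly nested outline. The class name big-title and the section titles are placeholders chosen for this sketch.

```html
<!-- Visual only: may look like a heading, but clients see an ordinary paragraph -->
<p class="big-title">Semantic HTML</p>

<!-- Semantic: one h1, and no level is skipped on the way down -->
<h1>Semantic HTML for Usable Web Sites</h1>
  <h2>Lists</h2>
    <h3>Definition lists</h3>
  <h2>Headings</h2>
```

The indentation is only there to show the hierarchy to human readers; the nesting itself is expressed by the heading levels.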
Other Mark Up
Besides using table-less layout, lists for list items and proper use of headings, you should also use HTML’s <em> and <strong> elements whenever there is a logical emphasis or strong emphasis (as opposed to the meaningless <b> and <i>). Also try to use elements such as <abbr> for abbreviations, <acronym> for acronyms, <cite> for citations, <code> for code, etc.
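A short fragment using some of these elements might look as follows. The sentences and the title attribute value are only examples.

```html
<!-- em and strong mark logical emphasis; how they are rendered is up to CSS -->
<p>You <em>should</em> validate your markup, and you <strong>must not</strong>
   use tables for layout.</p>

<!-- abbr carries the expansion in its title attribute -->
<p>Leave the presentation to
   <abbr title="Cascading Style Sheets">CSS</abbr>.</p>

<!-- cite marks the title of a source, code marks a piece of source code -->
<p>According to the <cite>HTML 4.01 Specification</cite>, an unordered list is
   written with the <code>&lt;ul&gt;</code> element.</p>
```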
Use <label> elements to associate each text label with its form input element(s), and of course follow basic usability & accessibility guidelines such as meaningful alt texts for images.
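A minimal sketch of both points, with a made-up form action, field name and image file:

```html
<form action="/subscribe" method="post">
  <p>
    <!-- The for attribute points at the id of the input this label describes -->
    <label for="email">E-mail address</label>
    <input type="text" id="email" name="email">
  </p>
  <p><input type="submit" value="Subscribe"></p>
</form>

<!-- The alt text says what the image shows, not just that it is an image -->
<p><img src="validator.png" alt="Screenshot of the validator reporting zero errors"></p>
```

Because the label is tied to the input through for and id, a screen reader can announce “E-mail address” when the field receives focus, and clicking the label text puts the cursor in the field.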
Again: you can use CSS for visual effects, but visual effects are only good for visualization - not for alternate views.
Conclusion
With little or no extra effort, you can make invalid and/or “un-semantic” web pages valid, more accessible and more usable:
- Stick to table-less layouts.
- Use HTML headings — with proper nesting.
- Where called for: use the various kinds of lists.
- Use meaningful HTML elements (not just CSS) where possible.
- Use CSS for visual effects.