HTML Semantics

When you write HTML, you should always try to choose the correct tags based on the situation. The word "semantics" refers to meaning in language. In HTML, "semantics" means we can give our content more meaning with the correct semantic tags.

Keep in mind the roles of HTML vs CSS:

HTML tags organize and describe the meaning of our content, while CSS "paints" the look of it.

To better understand the relationship between HTML and CSS, you should read about CSS Zen Garden. This website changed an entire generation of web developers for the better with regard to how we approached CSS in the early 2000s.

Tags Describe Content

So why do we need to describe our content from an HTML standpoint? Who cares what tag we use as long as the user sees the content correctly?

The point of describing your content with the correct semantic tags is so computers can understand the meaning of your content for the benefit of users.

Writing well-formed, semantic HTML helps the computer understand our content better. For example, when Google or any other search engine "crawls" your website, its computers go to your server and get the HTML for your pages. Notice I say HTML and not DOM. When a computer makes a request to your server, it is the "client" and your server responds with HTML. In the case of the client being a browser, the HTML is parsed and turned into DOM. In the case of a search engine, the crawler gets the HTML to consume the content for its search index. When the crawler is looking at your HTML code, it's trying to figure out the "meaning" of it. Using semantic HTML tags can really help it out, which in turn can help your search rankings.

It's interesting when someone says:

Who cares what tag we use as long as the user sees the content correctly?

Does that type of statement consider what visually impaired people can or can't see? Another type of computer (or software) that depends on good semantic markup is the screen reader. The web is for everyone, and it's our job as web developers to make it accessible to everyone. When we use the correct semantic tags, we're giving screen readers the tools they need to read the content out loud to the visually impaired. Accessibility is a big sub-field within web development. It's sometimes abbreviated as "a11y" because there are 11 characters between the "a" and "y". While knowing how to make your sites more accessible is of huge importance, this course has a lot to teach you regarding other fundamentals first. But always keep a11y in the back of your mind.

Also keep in mind that there have been situations where lawsuits were filed against companies with inaccessible websites. When the plaintiff wins, it's because it's the company's responsibility to be fair to everyone. This is a rare thing though. You're not going to get sued because your blog is inaccessible, but you might get criticism from other web developers. If you're working for a giant company, I suppose you could get sued, but this is still rare. The point isn't to avoid the lawsuit though. The point is to keep the web open and accessible to everyone because it's the right thing to do.


Now let's see how HTML tags can give our content meaning.

As people, we can look at or read content to derive meaning based on its position and other clues. Just the fact that we understand language semantics much better than computers do plays a big role in this. For example, can you tell me which part of this article is the title and which part(s) are paragraphs:

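[Rendered output: a bold line reading "A brief history of the World Wide Web." followed by two blocks of plain text]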

It's pretty easy for us to see the first sentence, "A brief history of the World Wide Web", is probably the title. It's at the top, it seems to describe something that's about to come next, and it's bold. While this is easy for us, computers can't really figure this out unless we help them.

The HTML for the above content looks like this, with a b (bold) tag and some br (line break) tags for returns:

<b>A brief history of the World Wide Web.</b>
<br /><br />
In 1989, Tim Berners-Lee was working at a science research facility called CERN and wanted to create a digital medium for documentation. He eventually went on to create HTML, HTTP, and with some help from another researcher, the first Browser.
<br /><br />
Later, CSS and JavaScript were added to browsers for styles and programmatic functionality.

Even though the visual output might look good for those who are not visually impaired, the tag choices were bad in this case because these tags don't do a good job of describing the content. There are correct tags to use for headings and paragraphs which would have been better choices.

Headings use tags like h1 and h2 depending on the "level" you want. h1 is the highest-level heading (and the biggest by default), h2 is the second level (and second biggest), etc. Headings are actually the only tags that have a numbering system like that. Paragraphs use the p tag. This next example makes use of these tags, which are a better choice in our case:

<article>
  <h1>A brief history of the World Wide Web.</h1>
  <p>
    In 1989, Tim Berners-Lee was working at a science research facility called CERN and wanted to create a digital medium for documentation. He eventually went on to create HTML, HTTP, and with some help from another researcher, the first Browser.
  </p>
  <p>
    Later, CSS and JavaScript were added to browsers for styles and programmatic functionality.
  </p>
</article>

The computer now knows that we have a heading followed by two paragraphs. The heading and paragraphs are all part of the same article since the <article> tag is the container for everything.

In HTML, tags create boundaries with their opening and closing tags. Sometimes tags just wrap around text. Other times tags create boundaries around other tags to group things together.

We can see how the article tag can create such groupings. This is called "nesting" HTML when we see tags inside tags inside tags like this:

<article>
  <h1>A brief history of the World Wide Web.</h1>
  <p>In 1989...</p>
</article>
<article>
  <h1>All about CSS</h1>
  <p>CSS was created to...</p>
</article>

Even though we don't visually see a box around each article, the use of two article tags allows us to describe to the computer that there are in fact two of them. If we tried to make this same output using only <br /> tags, we would have no way to mark where one article ends and the next begins. We would have no way to "describe" that some stuff is a paragraph and some stuff is not.

Whitespace

The above example indents the nested HTML inside of <article> so we can easily see what tags are inside other tags.

The term whitespace refers to characters that would not show up if you tried to print them, like spaces, tabs, and hard returns (line breaks).

The main purpose of adding this extra whitespace to "decorate" our HTML is to make it easier for us developers to read the code and see what's going on with the nesting.

For the most part, the browser ignores this extra whitespace when rendering: runs of spaces, tabs, and line breaks collapse into a single space.

When we say ignored, we mean that we could write our HTML like this, with big spaces between tags, and it doesn't affect what the rendered output will look like:

<p>

        In 1989, Tim Berners-Lee was working at a science research facility called CERN and wanted to create a digital medium for documentation. He eventually went on to create HTML, HTTP, and with some help from another researcher, the first Browser.

</p>



<p>
        Later, CSS and JavaScript were added to browsers for styles and programmatic functionality.
</p>

The fact that there is a visual space between the paragraphs in the output has to do with margin, a topic we'll cover in detail later on, and has nothing to do with the excessive whitespace in the HTML.
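The same collapsing happens to whitespace inside text. As a small sketch, assuming the browser's default whitespace handling (no CSS that changes it), these two paragraphs would render identically:

```html
<!-- Written compactly -->
<p>Tim Berners-Lee worked at CERN.</p>

<!-- Same text with extra spaces, indentation, and a hard return -->
<p>
  Tim      Berners-Lee
  worked   at   CERN.
</p>
```

In both cases the words end up separated by a single space in the rendered output.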

Why do we add whitespace only for ourselves (the developers)?

Whitespace is optional, but if we didn't add it, HTML would be very difficult to read. Just to emphasize the point, look at all these examples where the same HTML is written with different whitespace and yet the rendered output would be exactly the same:

<p>This is the first paragraph</p><p>This is the second paragraph</p>

<p>This is the first paragraph</p> <p>This is the second paragraph</p>

<p>This is the first paragraph</p>
<p>This is the second paragraph</p>

<p>This is the first paragraph</p>
        <p>This is the second paragraph</p>

There are recommended ways to nest and organize your whitespace which we'll cover later, but just know that how we do it is subjective, and it doesn't affect the rendered output.

Using the wrong tags

Let's go back to a previous example and use the wrong tags on purpose:

<p>
  A brief history of the World Wide Web.
</p>
<footer>
  In 1989, Tim Berners-Lee was working at a science research facility called CERN and wanted to create a digital medium for documentation. He eventually went on to create HTML, HTTP, and with some help from another researcher, the first Browser.
</footer>
<div>
  Later, CSS and JavaScript were added to browsers for styles and programmatic functionality.
</div>

p {
  font-weight: bold;
  font-size: 2em;
  margin: 0;
}
footer {
  margin-top: 15px;
  margin-bottom: 15px;
}

I'm using three tags to make my output and they're all bad choices for what I'm using them for. Then I wrote CSS to make the rendered output look good. You might not understand the CSS yet, but the main point here is to remember that HTML is our content and CSS is our design. CSS can stylize the "wrong tags" for our situation to look right.

Even though we used the wrong tags on purpose, some developers might choose the wrong tags by accident and then rely on CSS to make things look right. We just want you to be aware of this now -- that just because it looks right doesn't mean it is right.

What if there is no correct semantic tag?

Even though there are a variety of semantic tags available, sometimes you just need a DOM element box for styling and there's no suitable semantic one.

There are two tags that were designed to have no semantic meaning in case there is no obvious semantic tag to use. They are div and span.

When writing HTML, try to use the right tag, and if there is no tag but you need a "box" or "container" for what you're making, you can just use a div or a span.

There are two fundamental types of boxes that you'll learn more about later. They are "block" and "inline". All you need to know for now is that div is "block" by default and span is "inline" by default. The W3C just wanted to give us a non-semantic option that was either block or inline which is why we have two choices.
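As a minimal sketch of that difference (the class names and text are made up for illustration), a div starts on its own line, while a span stays in the flow of the surrounding text:

```html
<!-- div is "block" by default: each one starts on a new line -->
<div class="card">First box</div>
<div class="card">Second box, rendered below the first</div>

<!-- span is "inline" by default: it sits inside the line of text -->
<p>
  The total is <span class="price">$20</span> before tax.
</p>
```

Neither tag says anything about what the content means. They only give you a box to style.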

Sometimes you might see developers being critical of other web developers for over-using div tags. The criticism is usually because developers are not using the correct semantic tags and they're using div as their go-to for everything. Just know that for most websites, you will in fact need div more than any other single tag, so it's normal to have a lot of divs. Experienced developers know where the threshold is between using div appropriately and using it excessively. Over time, you'll get enough experience to know that threshold. In the meantime, ask a developer friend to review your HTML for these types of mistakes.
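One way to picture that threshold is the sketch below. The content and class names are hypothetical. The first version uses div where a semantic tag exists, and the second keeps div only for a box that has no meaning of its own:

```html
<!-- Overusing div: nothing here tells the computer what the content is -->
<div class="nav">
  <a href="/">Home</a>
</div>
<div class="main">
  <div class="title">All about CSS</div>
  <div class="text">CSS was created to style HTML.</div>
</div>

<!-- Semantic tags where they fit, div only as a styling container -->
<nav>
  <a href="/">Home</a>
</nav>
<main>
  <div class="layout-wrapper">
    <h1>All about CSS</h1>
    <p>CSS was created to style HTML.</p>
  </div>
</main>
```

With the right CSS both versions could look identical, which is exactly why this kind of mistake is easy to miss.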

It's also important to know that there's pretty much never a single "right" way to do things in any web development topic. If you asked two or three professionals to build the same UI, I can almost guarantee that they will make different decisions in how they build it, and it's also likely that they all did it in a good way. That's because there is no one true right way to do it.

What are the semantic tags?

There are roughly 100 semantic tags available. Over the course of your web development career, you'll probably discover and re-discover tags years down the road. Don't try to memorize all of them. I don't have them all memorized, and I can tell you with certainty that basically nobody has every single HTML and CSS feature memorized.

My father is pretty good with construction and I've actually lived in a few houses that he built. He knows a lot but he doesn't know every detail about every construction tool ever made. He knows the main tools that you'd use 99% of the time for construction, and when he needs to, he learns that one tool that gets used 1% of the time. That's exactly what it's like being a developer. The main idea is to learn the tags that you use most often. There might be 100 semantic tags, but you'll find that you end up using about 15 of them for 95% of your work. Good web developers don't have it all memorized. They just know where to look things up for that last 5% of their work.

Here are some of the most common semantic HTML tags that you'll use regularly, so if you're looking to memorize things, this actually is a nice condensed list to start from. We'll also assume for the rest of all our courses and lessons that you know these tags.

<b>
For bold content.
<strong>
For content that should be presented in a strong way. See Bold vs Strong below.
<i>
For italic content.
<em>
For content that should be presented in an emphasized way. See Italic vs Emphasis below.
<u>
For underlined content.
<ul>
For unordered lists (bullet points).
<ol>
Similar to <ul>, but for ordered lists (lists with numbers, roman numerals, etc.).
<li>
For list-items of unordered lists or ordered lists. These tags should be directly within <ul> and <ol> tags.
<p>
For paragraph content.
<h1> to <h6>
For content headings. The numbers indicate the level of the heading. For example, <h1> is the highest level for main headings, <h2> is for sub-headings of the main heading, and so on. This is also the only tag that uses a numbering sequence like this. See How to use Heading Tags (h1-h6) below.
<main>
For the main content of the page. This is one of the few tags that's meant to be used just once per page.
<nav>
For navigation content.
<aside>
For content that is indirectly related to the main content. Often used for sidebars.
<article>
For article content.
<section>
For a section of content. See Section Tags below.
<form>
For creating a container for a form, with various input fields that can be submitted.
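To see how several of these tags fit together, here is a rough page sketch. The URLs, headings, and form fields are made up for illustration, and this is just one reasonable arrangement, not the only correct one:

```html
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/articles">Articles</a></li>
  </ul>
</nav>

<main>
  <article>
    <h1>A brief history of the World Wide Web</h1>
    <p>In 1989, <strong>Tim Berners-Lee</strong> was working at CERN.</p>
    <section>
      <h2>Later additions</h2>
      <p>CSS and JavaScript were added to browsers.</p>
    </section>
  </article>

  <aside>
    <h2>Related reading</h2>
    <ol>
      <li>All about CSS</li>
    </ol>
  </aside>
</main>

<!-- Hypothetical newsletter form -->
<form action="/subscribe" method="post">
  <label for="email">Email</label>
  <input type="email" id="email" name="email">
  <button type="submit">Subscribe</button>
</form>
```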

Bold vs Strong

The <b> tag is for making content bold, but without stressing its importance too much. The <strong> tag makes content bold and does stress its importance highly. As an example, if you really wanted to show the importance of some text and used <b> and <u> together, you might want to switch to <strong>, since the goal is not only for the text to be bold but also for it to be understood as important.
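A quick sketch of the difference, using made-up sentences. Both render bold by default, but only the second one tells the computer the words are important:

```html
<!-- b: draws the eye, but the words are not more important -->
<p>Add the <b>flour</b>, then the <b>eggs</b>.</p>

<!-- strong: the content itself is important -->
<p><strong>Warning:</strong> the pan will be very hot.</p>
```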

Italic vs Emphasis

The <i> tag is for making content italic, but without extra emphasis. For example, in "The movie Jurassic Park was released in the 90s", the movie title would be italic. The italics show us the movie name without adding any stress to it. The <em> tag is for when we do want to add stress. For example, in "We really need to see that movie", the word "really" is emphasized, as if someone speaking the sentence stressed that word out loud.
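Written as HTML, those two example sentences might look like this. Both render italic by default:

```html
<!-- i: italic for a title, with no extra stress -->
<p>The movie <i>Jurassic Park</i> was released in the 90s.</p>

<!-- em: the word is stressed, as it would be when spoken -->
<p>We <em>really</em> need to see that movie.</p>
```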

Why are links called anchors

Links are called anchors for historical reasons because they would "anchor" you around to different places on the same page, kind of like how a link near the top of a page can jump you down to a section further below. In general, the term anchor means to scroll you up or down to different content on the same page.

If you'd like to know more, we have a short version of the history in our glossary on the anchor term.
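Here is a small sketch of a same-page anchor. The id value is made up for this example. The link's href starts with # and matches the id of the element to jump to:

```html
<!-- A link whose href starts with # points at an id on the same page -->
<a href="#bold-vs-strong">Jump to Bold vs Strong</a>

<!-- Further down the same page: the element being anchored to -->
<h2 id="bold-vs-strong">Bold vs Strong</h2>
```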

How to use Heading Tags (h1-h6)

Be sure to understand what an HTML Outline is in order to understand how to use the h1-h6 heading tags. It can be especially confusing to know how to use h1. In the original HTML specs, h1 was meant to be used just once, since there should only be one main heading for a given page. Search engines like Google came later and put a lot of weight on what you put in the h1 tag when ranking the content on the page. Then, about a decade after Google came into existence, HTML5 arrived around 2010 and gave us the <section> tag. In the HTML Specification, there are references to how <section> creates its own outline, so it is okay to use one h1 tag per section, but MDN does not recommend it.

Now you're just as confused about section and how to use h1 as any of us. Welcome to web development 😎
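If it helps, here is a conservative sketch that sidesteps the debate: one h1 for the page, with lower levels nested in order and no levels skipped. The heading text is made up for illustration:

```html
<h1>A brief history of the World Wide Web</h1>

<h2>The early years</h2>
<h3>CERN</h3>
<h3>The first browser</h3>

<h2>Later additions</h2>
<h3>CSS</h3>
<h3>JavaScript</h3>
```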

Header and Footer Tags

You'll probably want to use <header> and <footer> for the primary header and footer of your website. But you can also use each of these multiple times per page, on a per-outline basis. MDN has some content on this ability to use them multiple times.

<header> tags should not have other <header> or <footer> tags as their descendants. The same is true for <footer> tags: they should not have <header> or <footer> tags as their descendants.
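As a sketch with made-up content, here is a page that has a site-wide header and footer, plus an article with its own header and footer. Notice that no header or footer is nested inside another one:

```html
<!-- Site-wide header -->
<header>
  <nav>
    <a href="/">Home</a>
  </nav>
</header>

<main>
  <article>
    <!-- Header for just this article -->
    <header>
      <h1>All about CSS</h1>
      <p>Posted by a hypothetical author</p>
    </header>
    <p>CSS was created to style HTML.</p>
    <!-- Footer for just this article -->
    <footer>
      <p>Filed under: CSS</p>
    </footer>
  </article>
</main>

<!-- Site-wide footer -->
<footer>
  <p>Site-wide footer content</p>
</footer>
```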

Section Tags

This might be one of the more confusing semantic HTML tags. It creates something called an outline in case you need that. The name might suggest it makes any "section" of your site, so some devs think it's a good alternative to using div, but the spec explicitly says not to think of it that way.

Understand HTML Outlines and read more from the official docs at MDN about sections and outlines.

See also How to use Heading tags (h1-h6) above.
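As a small sketch of how section is commonly used (the content is made up), each section below is a thematic grouping with its own heading, which is what sets it apart from a plain div:

```html
<article>
  <h1>All about CSS</h1>
  <section>
    <h2>Selectors</h2>
    <p>Selectors choose which elements a rule applies to.</p>
  </section>
  <section>
    <h2>Properties</h2>
    <p>Properties describe what to change about those elements.</p>
  </section>
</article>
```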

Tag Pairings

Some tags are meant to be paired with others. Here are a few examples:

When you want to make a bullet-point list, you can use <ul> and <li> like this:

<ul>
  <li>This makes a bullet point list</li>
  <li>Where each li tag is a point (list item)</li>
  <li>ul stands for "unordered list"</li>
</ul>

<li> tags are not meant to be used in HTML without being in a <ul> or <ol>. This is what we mean by "pairings".

You can also make an ordered list with the <ol> tag wrapped around <li> tags, similar to the above. By default, ordered lists use numbers, but they can also use other orderings, like roman numerals, if you adjust the <ol> tag's type attribute.
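Here is a short sketch of a few orderings using the type attribute. The list text is made up for illustration:

```html
<!-- No type attribute: numbered 1, 2, 3 -->
<ol>
  <li>First item</li>
  <li>Second item</li>
</ol>

<!-- type="I": uppercase roman numerals I, II, III -->
<ol type="I">
  <li>First item</li>
  <li>Second item</li>
</ol>

<!-- type="a": lowercase letters a, b, c -->
<ol type="a">
  <li>First item</li>
  <li>Second item</li>
</ol>
```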


The <select> tag makes a dropdown menu. As its direct descendants, you use <option> tags for the options of the dropdown, like this:

<select name="us-states">
  <option value="al">Alabama</option>
  <option value="ak">Alaska</option>
  <option value="az">Arizona</option>
</select>

You can also use <details> and <summary>, which create expandable content:

<details>
  <summary>What does CSS stand for?</summary>
  <p>It stands for Cascading Style Sheets</p>
</details>

If you want to "describe" an image, you can do so with the <img> tag's alt attribute. But this is not normally displayed visually to the user (browsers typically show it only if the image fails to load). The alt attribute on images is mostly so screen readers can know what the image is.

If you want to "caption" the content visually to users, you might want to use <figcaption> in addition to the image's alt attribute like this:

<figure>
  <img src="me.jpg" alt="Me at the ocean">
  <figcaption>A picture of me at the ocean</figcaption>
</figure>

You'll want to pair <figcaption> with <figure> though. Notice how the nesting of the image and caption inside <figure> associates the image with the caption. Without <figure> wrapped around the image and caption, how would we associate the correct image and caption if there were many?
