What’s Best: Microformats, RDFa, or Micro Data?

In a recent post by Mike Blumenthal about Google’s announcement of supporting Microformats for local search, Andy Kuiper asked in the comments whether it would be best to go with Microdata versus RDFa or Microformat for marking up local business information. As the number of flavors of semantic markup has grown, I think Andy’s not the only one to wonder which markup protocol might be ideal. Here’s my opinion.

When you’re asking “which is better?”, it’s important to know what we’re speaking of, since there are a number of different goals that people could be pursuing. For some, this is a question of which is better from an elegance-of-coding perspective (if you’re interested in this, you might read Evan Prodromou’s great article, RDFa vs microformats). For others, the question is what’s best for their site: which solution is the simplest, most cost-effective to apply, and least likely to cause problems. Finally, the question could be asked from the perspective of what’s going to work best for search marketing.

It’s this last angle that I’m focusing on: which semantic protocol is going to work best for Search Engine Optimization (“SEO”)?

Now, you might think that since I was probably the earliest marketer to recommend using Microformats for SEO that I’d feel so “invested” in the protocol that I might push it exclusively. While I do enjoy a nostalgic thrill at having accurately predicted where things might go, I’m mostly dedicated to pushing approaches that will be most effective for my clients, and this requires frequent revisiting of techniques.

Microformats have been established the longest of the three protocols, and used by the search engines the longest. Google and Yahoo! both introduced hCard microformat on their own webpages by marking up local listings with it. This was later followed by Yahoo demonstrating that their crawling and code interpretation services were set up to parse both Microformat and RDFa data. So, the best-established and longest-running semantic protocol is Microformat, followed by RDFa.

Microformat’s initial advantage was that it worked seamlessly in existing HTML code, so using it within a page didn’t require any special tags that might overly restrict one’s version of HTML nor cause a page to be invalid code. The downside is that it primarily required using particular naming conventions of class attributes — retrofitting sites to have hCard required renaming of CSS classes or addition of more DIV/SPAN tags to add in the specially-named classes. Also, the marked-up parts of an address or whatever had to be nested properly for it to work.
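
To make the class-naming and nesting requirement concrete, here is a minimal sketch of what an hCard might look like and how a machine could read it. The class names (vcard, fn, org, adr, street-address, locality, region, postal-code) come from the hCard microformat; the business, the address, and the little parser are all made up for illustration, and this is not how any particular search engine actually does it.

```python
from html.parser import HTMLParser

# Hypothetical listing marked up as an hCard. Note that the address
# parts are nested inside "adr", which is nested inside "vcard".
HCARD_HTML = """
<div class="vcard">
  <span class="fn org">Example Cafe</span>
  <div class="adr">
    <span class="street-address">123 Main Street</span>,
    <span class="locality">Springfield</span>,
    <span class="region">IL</span>
    <span class="postal-code">62701</span>
  </div>
</div>
"""

HCARD_FIELDS = {"fn", "org", "street-address", "locality", "region", "postal-code"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is a known hCard field.

    Simplification: assumes every opened tag is closed (no void tags
    such as <br>), which holds for the sample above.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open tag: the hCard fields it carries

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append([c for c in classes if c in HCARD_FIELDS])

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        # Only the innermost open element claims the text.
        for name in self._stack[-1]:
            self.fields[name] = self.fields.get(name, "") + text


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(parse_hcard(HCARD_HTML))
```

Because everything hangs off the class attribute, the markup stays valid HTML, but an existing stylesheet that already uses other class names has to be adjusted or extra SPANs have to be added.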

Now, RDFa was built a bit more flexibly, since it was set up as more purely XHTML, and one key characteristic of XHTML is the “X” part: it’s “extensible”, meaning you can easily add namespaces for your own purposes without breaking a doc. So, RDFa added attributes such as “property”, “typeof”, and “about” for labeling information for machines, and an advantage is that you could introduce them without necessarily changing your CSS. The downside is that you technically need your document to be well-formed in order to validate under XHTML, a more rigid document coding model than found under earlier versions of HTML. (Naturally, you could add the RDFa portions to your content without making the entire page XHTML-valid, but doing so is risky since it assumes that search engines will be able to properly parse the page when it’s not properly formed. While browsers and search engines successfully interpret invalid page code all the time, the risk of botching RDFa markup interpretation may be greater if the page is invalid, since the markup depends upon the machine properly recognizing the XHTML markup elements and interpreting them correctly.)
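
Here is a rough sketch of the well-formedness risk, using Python’s strict XML parser as a stand-in for a consumer that insists on well-formed XHTML. The sample markup follows the data-vocabulary.org style of RDFa as I understand it, the business is invented, and the function name is my own. Real crawlers may well be more forgiving than this, so treat it as the worst case rather than a description of any search engine.

```python
import xml.etree.ElementTree as ET

# Well-formed fragment: every element that is opened is also closed.
WELL_FORMED = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Organization">
  <span property="v:name">Example Cafe</span>
  <div rel="v:address">
    <div typeof="v:Address">
      <span property="v:street-address">123 Main Street</span>
      <span property="v:locality">Springfield</span>
      <span property="v:region">IL</span>
    </div>
  </div>
</div>
"""

# The span is never closed, so this is not well-formed XML.
MALFORMED = """
<div typeof="v:Organization">
  <span property="v:name">Example Cafe
</div>
"""


def extract_rdfa_properties(xhtml):
    """Return {property: text} for a well-formed fragment, or None when
    a strict XML parser rejects the markup."""
    try:
        root = ET.fromstring(xhtml)
    except ET.ParseError:
        return None
    props = {}
    for element in root.iter():
        name = element.get("property")
        if name:
            props[name] = (element.text or "").strip()
    return props


if __name__ == "__main__":
    print(extract_rdfa_properties(WELL_FORMED))
    print(extract_rdfa_properties(MALFORMED))  # None: the parser gave up
```

A single unclosed tag is enough for the strict parser to return nothing at all, which is the kind of all-or-nothing failure I’d worry about on a sloppy page.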

Microdata is the newest kid on the block, and is a format proposed by the W3C as a part of HTML5, and it appears to be heavily influenced by Microformats, adopting a number of the same label names found in hCard, hCalendar and other Microformats.
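
As a quick illustration of that family resemblance, the sketch below lists the itemprop names in a small Microdata snippet and checks which of them also exist as hCard class names. The snippet follows the data-vocabulary.org style of Microdata as I understand it; the business and the helper names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical listing marked up with Microdata attributes
# (itemscope, itemtype, itemprop) instead of class names.
MICRODATA_HTML = """
<div itemscope itemtype="http://data-vocabulary.org/Organization">
  <span itemprop="name">Example Cafe</span>
  <div itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    <span itemprop="street-address">123 Main Street</span>
    <span itemprop="locality">Springfield</span>
    <span itemprop="region">IL</span>
  </div>
</div>
"""

# A few class names defined by the hCard microformat.
HCARD_CLASS_NAMES = {
    "fn", "org", "adr", "street-address", "locality", "region", "postal-code",
}


class ItemPropCollector(HTMLParser):
    """Record every itemprop name in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key == "itemprop" and value:
                self.names.extend(value.split())


def itemprop_names(html):
    collector = ItemPropCollector()
    collector.feed(html)
    return collector.names


def shared_with_hcard(html):
    """Return the itemprop names that are also hCard class names."""
    return [name for name in itemprop_names(html) if name in HCARD_CLASS_NAMES]


if __name__ == "__main__":
    print(itemprop_names(MICRODATA_HTML))
    print(shared_with_hcard(MICRODATA_HTML))
```

In this sample the address parts carry the same labels in both formats, while the business name does not ("name" here versus "fn" in hCard).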

So, which is best for SEO purposes?

I should note that the use of semantic markup or structured data has never been solely about rankings. For local SEO, the primary advantage was in helping to ensure that your data could be best interpreted by the search engines. Semantic markup helped inform search engines as to whether a data element was a street, a city, a business name, and so on. For instance, if you were “Houston’s Restaurant”, located on “Tennessee Avenue” in the town of “Dallas, Florida”, it could be challenging for a local search engine to correctly associate your business with the right business listing and location. I made that example up, but there are plenty of real-world examples which result in some degree of difficulty in interpreting web content for local search.

For Local SEO, helping to ensure your data is properly interpreted does provide ranking benefits, mainly in the cases where data is prone to misinterpretation.

In more recent years, the benefit of structured data has been more in the enhanced presentation of data to attract consumers, and less in rankings. Yahoo’s creation of enhanced search results listings with their SearchMonkey platform, followed later by Google’s introduction of Rich Snippets, allowed webmasters who used structured data to create slightly more attractive listings. The theory (supported by statistics reported by Yahoo) is that slightly jazzier search listings may increase click-through rates.

Google SERP listing for Yelp with Rich Snippets

Although it’s controversial in SEO circles, I believe that there are indirect ranking benefits associated with increased CTR, too. So, it’s possible that improving the visibility of your pages’ listings through adoption of structured data may eventually result in better rankings as well.

While Google’s Webmaster Tools documentation states that they support all three major structured data formats (Microdata, Microformats and RDFa), there may be some advantage to sticking with the oldest format, Microformats, since it likely has the best support and widest adoption for the moment. There are likely more systems out there beyond the search engines which recognize the format. From the search engine perspective, it’s also safest to use the protocol that likely has the most usage, since it may have the best support.

If you have a site that’s already well-formed for XHTML, and if you would have some significant difficulty in changing your CSS to use Microformatting, I’d suggest using RDFa.

Microdata is sexy because it’s embedded with the new HTML5, but I suggest waiting a bit to go full-fledged into HTML5 in general, from an SEO perspective. There are still relatively few sites using HTML5, so I think there’s a greater chance of search engines misinterpreting pages coded in it. Using HTML5 at this point is a bit like being one of the first adopters of a 1.0 version of Windows: there could be unforeseen negative consequences, so it’s less risky to stick with the older protocol a bit longer until HTML5 is better established.

* One and only one caution about using any of the formats: Matt Cutts recently provided a Webmaster Help video addressing “How long does it take for rich snippets to appear?”. In that video he provides the caveat that Google looks dimly upon hidden content, so don’t make your structured data hidden or else they won’t make use of it.

Now Google’s Webmaster Tools states that they won’t make use of non-visible content in most cases, but there is some invisible content that they would use — most specifically, Longitude and Latitude values for geotagging. I think he’s giving the warning because I bet some webmasters have been lazy about trying to retrofit microformats/RDFa to a page without integrating it, and then rendering it invisible/hidden via CSS so they don’t have to change the page’s code much. This would be a no-no. But, it’s clear from the Microformat documentation and from Google’s Webmaster Tools instructions that it’s perfectly okay to provide longitude/latitude values or alternate formatting for dates via the nonvisible parameters already available in normal microformat code.
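
For the curious, here is a minimal sketch of the kind of nonvisible value I mean. It uses the abbr pattern from the geo microformat, where the visible text is human-friendly and the machine-readable number sits in the title attribute. The coordinates are sample values and the parser is my own illustration, not Google’s.

```python
from html.parser import HTMLParser

# The visitor sees degrees and minutes; the decimal values a machine
# needs are carried in the title attributes.
GEO_HTML = """
<div class="geo">
  <abbr class="latitude" title="37.408183">N 37° 24.491</abbr>
  <abbr class="longitude" title="-122.13855">W 122° 08.313</abbr>
</div>
"""


class GeoParser(HTMLParser):
    """Read latitude/longitude from the title attribute of geo elements."""

    def __init__(self):
        super().__init__()
        self.coords = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        for name in ("latitude", "longitude"):
            if name in classes and attr.get("title"):
                self.coords[name] = float(attr["title"])


def parse_geo(html):
    parser = GeoParser()
    parser.feed(html)
    return parser.coords


if __name__ == "__main__":
    print(parse_geo(GEO_HTML))
```

The point is that the content itself stays visible on the page; only the alternate machine-friendly form of the same value is tucked into an attribute, which is quite different from hiding a whole block with CSS.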

Postscript: Yesterday I published an article on using Facebook’s Open Graph Protocol for SEO. I didn’t compare that in this post, because it’s a variant of RDFa. Also, it appears to me that one can use it in combination with the other semantic protocols for local search optimization, so I’d generally recommend using it in combination with one of the protocols Google supports. In this way you can optimize for Google and Facebook simultaneously.


8 thoughts on “What’s Best: Microformats, RDFa, or Micro Data?”

  1. Good catch, Mike!

    I’d also note that they’ve only recently announced that they were using Microformats in some way within Maps, and they’re a little vague as to how they’re using it. It’s likely for the purpose of more accurately identifying local data, and perhaps for purposes of composing indices of citations around business locations.

  2. Very helpful article – answered several questions I had. Thanks very much for posting!
    I’m curious what your opinion is on the future of search engine support for semantic markup. The potential is there to clarify all sorts of relationships, e.g. “avalanche” the hockey team or “rose” the color, whereas right now SE’s appear to be interpreting a small number of explicitly recognized tags. If we are headed towards much broader support, it seems like RDFa is the best option for platform development, from a future-compatibility standpoint. Do you have an opinion on that? I know I’m asking for prognostication, but your track record so far is pretty good ;)

  3. I’d like to emphasise that these formats are not mutually exclusive. At least RDFa and microformats aren’t.

    You can mark up the same content using both formats and keep everyone happy.

  4. A missing element from microformatting could, or should, be the cid number from a business’s Google Places page. Wouldn’t it make sense to attach that element somewhere?

  5. Tony, you may be able to use both simultaneously, but I’m betting that most parsers looking for the semantic markup are going to try to identify one and use it. I think that trying to use multiple ones simultaneously could result in some undesirable effects.

    The new Schema.org protocol sounds very good, although I think it’s still a bit new to depend upon. I’ll probably do a new post commenting around old RDFa/Microformats versus the new Schema.org protocol. Stay tuned!

  6. Mark, that’s a good idea, but the CID or Category ID is not necessarily the same across all online directory services. There is an element for adding business category info within Microformats — you can see it described in my old original post on local SEO:

    http://www.naturalsearchblog.com/archives/2006/09/28/tips-for-local-search-engine-optimization-for-your-site/

    Again, the new Schema.org stands to change this, since they’re now doing special schema formats for different industry types. I would hold off and when Schema.org support hits critical mass, go with that.
