1. Nothing about Dublin Core? RDF? Atom? I understand this is a “primer”, but it would have been neat to work a bit of that in. Also, have you checked out the Calais system that Reuters is putting together? Advanced automatic m/data generation, basically.

  2. James Robertson

    Yes, should’ve made reference to Dublin Core, although at the basic level it boils down to just title, keywords and description (with perhaps a “DC.” at the beginning).

    RDF is certainly a powerful framework, but definitely falls into the “advanced metadata” category. I’m not sure how Atom/RSS fits in the topic of metadata.

    My aim in writing the article was to get web and intranet teams up to speed on the key topics, on the assumption that a lot of the behind-the-scenes details would be handled by the publishing tool (CMS, etc). As I indicated in the article, there’s lots of value to be gained in researching more advanced topics…

    In terms of automated classification, that’s a big topic! Like most things, where a significant investment can be put into the tools, good value is gained. But these are certainly not “install out of the box, and voila metadata!” solutions.

    Cheers, James
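As a rough illustration of the Dublin Core point in the reply above (title, keywords and description, with a "DC." prefix), here is a minimal Python sketch of the kind of markup a publishing tool might emit. The function name `dublin_core_meta` and the sample values are hypothetical, and joining keywords with "; " is an assumption. Dublin Core calls the keywords element "subject", so that is the name used here.

```python
from html import escape


def dublin_core_meta(title, keywords, description):
    """Return HTML lines carrying the three basic fields, Dublin Core style."""
    fields = [
        ("DC.title", title),
        # Dublin Core names the keywords element "subject".
        ("DC.subject", "; ".join(keywords)),
        ("DC.description", description),
    ]
    # The schema link declares what the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for name, value in fields:
        # Escape quotes and angle brackets so the attribute stays valid HTML.
        lines.append(f'<meta name="{name}" content="{escape(value, quote=True)}">')
    return lines


if __name__ == "__main__":
    # Hypothetical sample page, for illustration only.
    for line in dublin_core_meta(
        "Leave policy", ["HR", "leave"], "How to apply for annual leave"
    ):
        print(line)
```

In practice the CMS would usually fill these fields from its own page properties, which is the "behind-the-scenes" handling the reply refers to.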

  3. Russ Weakley


    You mention two key aims of metadata:

    1. helping end users find what they are looking for, via search or navigation
    2. helping authors and administrators manage the site

    While it may seem a subset of your first point, I believe “helping applications/tools/bots find data” is important enough to stand on its own.

    This is becoming more important as online tools increase in number and complexity, especially when used for mashing data (God, I hate that word) from different sources.

    • James Robertson

      @Russ, a late reply on your comment about providing metadata to help applications/bots to find data. Definitely an important aspect, but I’d highlight that concrete needs must be identified in advance.

      For example, in Australian government, a lot of metadata was collected against future plans to automatically create “portals” on specific topics. But these never eventuated, in part due to the patchy quality of the metadata itself. So that left gov agencies with the mandated requirement to collect masses of data, but for no clear purpose. (This has subsequently been made optional in most cases.)

      So automatic use of metadata is incredibly powerful, but only when done well (or at all!).

  4. Andrew Remely

    Nice article and I think you have got the tone & level spot on! Just a quick comment on controlled vocabularies and taxonomies…

    I agree with your point about ‘generic’ taxonomies tending to defeat the purpose, hence the discussion of the merits of developing your own. However, you didn’t mention that there are a vast number of existing specialist controlled vocabularies already in the public domain. For example, sectors including health, cultural heritage, and government all make extensive use of domain-specific, standards-based controlled vocabularies. Adopting one of these may give an organisation the benefits of a highly refined controlled keyword list without the pain of developing and maintaining their own. For those people working in large organisations, have a chat to your librarian about what may already exist in your industry. Otherwise, have a trawl of the web!

  5. James Robertson

    Hi Andrew, agree on the value of pre-developed industry standards, these can often do 80% or more of the heavy lifting…

    One word of caution: there is a big difference between a taxonomy designed for classification, versus one designed for navigation.

    The most extreme example: the Library of Congress thesaurus is extraordinary at classifying pretty much anything, but totally hopeless for navigation.

    So double-check that the industry taxonomies will fit how you want to present information on the website. If so, you’re in business!

    Thanks, James

  6. Mike Chesser

    Nice job. Stumbled on your insightful writeup via the Wikipedia Infodesign link.
    Follow-on discussion is thoughtful too.

    I can apply much directly to my current business problem of cleaning up a messy operations doc repository.

    Just what the doctor ordered!

  7. Kate Needham

    Great article, James. Another major reason we place a heavy emphasis on metadata on our intranet and websites is reuse (i.e. information entered into metadata fields can be reused in multiple places on the site).

  8. Martin Bechtel

    Hi James,
    a very good introduction to the subject! A few years ago we developed a taxonomy for the keyword field of our intranet. As a starting point we used existing industry classifications, supplemented with terms from standard textbooks, and finally incorporated our company-specific vocabulary. This mixed approach produced quite a sound taxonomy. Later on it also provided the core content for our company glossary on the intranet.

  9. James Robertson

    Hi Kate, metadata can be used in a variety of powerful ways, including driving content reuse and automated related links.

    Of course, a high level of discipline is required to make this successful. If the metadata isn’t consistently high-quality, then automated uses of it can break down, or generate some strange results.

    Well done on making this work, no wonder you have an award-winning intranet! :-)
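To make the "automated related links" idea in the reply above a little more concrete, here is a minimal Python sketch that ranks pages by how many keywords they share. The function name `related_pages`, the sample pages and the scoring rule are all assumptions for illustration, not how any particular CMS does it.

```python
def related_pages(pages, target, limit=3):
    """Rank other pages by the number of keywords they share with the target."""
    # Normalise case and whitespace so "HR" and "hr " count as the same keyword.
    target_keywords = {k.strip().lower() for k in pages[target]}
    scored = []
    for title, keywords in pages.items():
        if title == target:
            continue
        shared = target_keywords & {k.strip().lower() for k in keywords}
        if shared:
            scored.append((len(shared), title))
    # Most shared keywords first; ties are broken alphabetically by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical intranet pages and their keyword metadata.
pages = {
    "Annual leave policy": ["HR", "leave", "policy"],
    "Sick leave form": ["hr", "Leave", "forms"],
    "Travel policy": ["travel", "policy"],
    "Canteen menu": ["food"],
}

if __name__ == "__main__":
    print(related_pages(pages, "Annual leave policy"))
```

The normalisation step only papers over differences in case and spacing. If one author tags a page "leave" and another tags a similar page "holidays", the link is silently missed, which is the kind of breakdown that inconsistent metadata causes.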

  10. James Robertson

    Hi Martin, I love your step-by-step approach to developing a taxonomy, many organisations could learn from this! The use as a glossary is a nice end-user feature to deliver from the behind-the-scenes taxonomy…

  11. Brian F

    After working with some biggies, I’m *very* wary of taxonomies. The theory of using established taxonomies and everyone being able to share data in insightful ways is great; it just doesn’t work. To begin with, there’s the question of which taxonomy. I was once involved with a defence project that was investigating which of, say, 10 main taxonomies used in Australia and by our main allies could be usefully implemented, each offering several dozen top-level categories and up to tens of thousands of individual terms. The plain fact is that no-one does or is going to use these taxonomies comprehensively or correctly. Secondly, it’s impossible to know what users want from your data. You might think you’re publishing country profiles or travel guides, but what people are searching for is economic data or industry insights. Finally, there’s the sheer size and consequent management issues of these taxonomies. The reasons you can go to hospital fill a 1000-page book in 9-point type and keep an office within the health department quite busy. AGIMO’s new master metadata plan fills an A3 page in maybe 8-point type without even beginning to address how it will relate to other industry-specific taxonomies relevant to specific departments. All of these initiatives are fantastic in theory but will never be used by more than a handful of people. Metadata can’t be generic enough.

  12. James Robertson

    Hi Brian, completely agree on the challenges inherent to taxonomies! These are hugely hard things to put into place, mostly due to the underlying organisational complexities.

    Of course, if an effective taxonomy can be deployed, the benefits gained are ten times the initial cost. But I agree, there are few organisations who have mature enough information management practices to allow this to happen.

    One good book to read on taxonomies has been written by Patrick Lambe in Singapore:

  13. Brian F

    Hmmm, maybe I need to get out more, but I haven’t seen a taxonomy that delivers much benefit. I’m happy with basic metadata as described above for basic site management, but doubt we can even justify the effort put into AGLS metadata, given that it is only used by one search engine that generates maybe 1% of traffic. If metadata is only for internal use, it probably doesn’t need to be so complex, and we can simply accept that it only serves one purpose. At the CBR WSG meeting yesterday Stephen Zafira described tagging as ‘dynamic IA’, which I think is a step in the right direction towards accepting that data will be interpreted in different ways by different users, and frees us from trying to write ‘one taxonomy to rule them all’. KISS, live and let live.

  14. Amal Arunesh

    Great to know the basics of metadata. The article is very simple to understand.

    Could you tell me how to insert metadata in the intranet? Is XML the best way to integrate metadata?

Published October 19, 2008

James Robertson
James Robertson is the Managing Director of Step Two, the global thought leaders on intranets, headquartered in Sydney, Australia. James is the author of the best-selling books Essential intranets, Designing intranets and What every intranet team should know. He has keynoted conferences around the globe.
