IT Crossing
 IT Crossing Blog
May 2

Written by: Don Worthley
5/2/2008 8:42 AM

OK, here's how it all started.  Rip Rowan, a developer contributing to DotNetNuke, the open source CMS project I've been working with, got me thinking yesterday about some really interesting ideas related to assigning categories in the DotNetNuke blog module.  As many of you know, I'm now a member of the blog project team and we're in the process of thinking through some of the needed enhancements to the module.

Well, when I read Rip's post, I sent him an email saying, "Call me if you have the time to chat!"  A few minutes later the phone rings and Rip warns me, "OK, you've gone and done it now!"  Indeed.  It was a fascinating conversation that I would like to capture quickly in the blogosphere to bring other people's thoughts into the mix. 

Rip is proposing a new idea (new to me at least) for categorizing blog entries.  Basically, he's come up with an idea for an interface that would combine the concept of a category with the concept of tagging or keywords to produce one control and one underlying data structure in the database to manage the process of putting a handle on your blog entries.  Rip describes it like this:

...My idea is the use of a hierarchy delimiter.  The user can create hierarchies as deep as they like, and the auto-suggestion box helps them out.  So a blog post might be tagged up as follows:

--------------------
[MyBlogPost]
Tags: [Places\Dallas; Places\New York; Places\Paris; Foods\Steak; Foods\Fish; Foods\Pizza; Time of Day\Morning; Time of Day\Evening]
--------------------
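
To make the delimiter idea concrete, here is a minimal Python sketch of how a tag string like the one above could be split into paths and merged into one hierarchy.  The function names (`parse_tags`, `build_tree`) and the nested-dict representation are illustrative assumptions, not anything Rip or the blog module specifies.

```python
from typing import Dict, List

DELIMITER = "\\"  # hierarchy delimiter used in Rip's example
SEPARATOR = ";"   # separates one tag from the next


def parse_tags(raw: str) -> List[List[str]]:
    """Split a tag string into paths, one list of components per tag."""
    paths = []
    for tag in raw.split(SEPARATOR):
        parts = [p.strip() for p in tag.split(DELIMITER)]
        parts = [p for p in parts if p]  # drop empty components
        if parts:
            paths.append(parts)
    return paths


def build_tree(paths: List[List[str]]) -> Dict[str, dict]:
    """Merge tag paths into nested dicts so shared prefixes share a node."""
    tree: Dict[str, dict] = {}
    for path in paths:
        node = tree
        for part in path:
            node = node.setdefault(part, {})
    return tree


raw = (r"Places\Dallas; Places\New York; Places\Paris; Foods\Steak; "
       r"Foods\Fish; Foods\Pizza; Time of Day\Morning; Time of Day\Evening")
tree = build_tree(parse_tags(raw))
```

Here `tree` ends up with three top-level nodes (Places, Foods, Time of Day), each holding its children, so a flat tag that contains no delimiter is simply a top-level node with no children.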

The data-entry UI would automagically suggest each portion of the hierarchy at a time, so when the user types "Pl" the textbox responds "Places", and the user hits the delimiter key "\" to accept and begin entering the next hierarchy component:

--------------------
Tags: (type) Pl
Tags: (UI responds) Places
Tags: (type) Places\ (user enters delimiter key)
Tags: (UI presents) Places\[drop-down list of places in the places list, if list is < 10 entries]
Tags: (type) Places\Dal
Tags: (UI presents) Places\[drop-down list: Dalhart Dallas Dalton]
Tags: (user selects Dalhart)
Tags: (UI presents) Places\Dalhart; (adds delimiter so user can begin entering next tag)
--------------------
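
The walkthrough above can be sketched as a small lookup function.  This is only an illustration in Python under stated assumptions: the sample hierarchy, the name `suggest`, and the case-insensitive prefix match are all hypothetical, and the "fewer than 10 entries" rule is taken from the walkthrough itself.

```python
from typing import Dict, List

DELIMITER = "\\"
MAX_UNFILTERED = 10  # show the whole list only if it has fewer than 10 entries

# Hypothetical existing hierarchy, stored as nested dicts.
TREE: Dict[str, dict] = {
    "Places": {"Dalhart": {}, "Dallas": {}, "Dalton": {}, "Paris": {}},
    "Foods": {"Steak": {}, "Fish": {}, "Pizza": {}},
}


def suggest(tree: Dict[str, dict], typed: str) -> List[str]:
    """Return candidate names for the path component currently being typed."""
    *parents, prefix = typed.split(DELIMITER)
    node = tree
    for name in parents:
        node = node.get(name.strip())
        if node is None:
            return []  # unknown parent: nothing to suggest
    prefix = prefix.strip().lower()
    if not prefix:
        # Right after the delimiter: list everything only when the list is short.
        return sorted(node) if len(node) < MAX_UNFILTERED else []
    return sorted(n for n in node if n.lower().startswith(prefix))
```

With this sample data, `suggest(TREE, "Pl")` offers Places, and `suggest(TREE, "Places\\Dal")` offers Dalhart, Dallas, and Dalton, matching the steps in the walkthrough.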

Here was my response to Rip on the phone.  For years the publishing industry has used two distinct metaphors for assigning meaning to content, or placing handles on it, and that makes me think there may be something to the distinction between categories and tags that should be maintained.  Here's how Rip makes his point in a second post on this same subject.

I'm inclined to view "tags" and "categories" as just two facets of the same thing: knowledge hooks we apply to content to help us find it.  My gut tells me that these things tend to be viewed as two different beasts because of the way they've historically been implemented.

Rip goes on to make the point that tags are really just ad hoc attempts at capturing semantic meaning that should already be captured in some kind of categorical data structure.

Speaking of the kind of tags that get created, Rip says,

If you look at what sort of tags get created, it becomes plain that there are three sorts of tags:

  1. Spurious tags - tags that duplicate existing attributes and shouldn't be there in the first place, e.g. "Monty Python" tag associated with the movie "Monty Python and the Holy Grail" which has an artist of "Monty Python", or tags which just don't add practical value, like "Movies that Include the word 'swallow'".
  2. Tags that ought to be part of a data structure that just doesn't exist - e.g. "Graham Chapman" tag on "Holy Grail", which really ought to be part of a structure called "Actors" but isn't provided by Amazon.  Or "British Comedy" which really should have been a subset of the "Comedy" genre, but Amazon didn't provide this option.  Or "Arthurian Legends" - a tag that could easily have been a category in the "Subject Matter" hierarchy.  Or "Party" - a tag that could well belong in a "Mood" category.
  3. Tags that don't seem like they should be part of a data structure, YET, because not enough material has been tagged on this dimension to understand the dimension.  For example, your hypothetical post that you wanted to tag "suggestions" could easily have been a node in the "Article Type" category, along with "Ratings", "Reviews", "Comparisons", and "Recipes".

On one level, I agree with this.  Tags and categories are both attempts at overlaying semantic meaning onto the published content to aid in discovery and retrieval.  But, on another level, it seems we should be careful about dismissing the separation that has existed for years in the publishing world between the table of contents and the index.  We may lose something by combining these two.  The one (categories) provides a nice thirty-thousand-foot view of our content using broad, rigid structures: carefully labeled buckets for the content we have or intend to publish.  The other (tagging) allows content providers to reach a little deeper into the semantic world and provide synonyms and words or phrases that may slice across a variety of the predefined categories.

In a third post, Rip links to another blog entry which highlights how keywords [2] and categories are implemented in WordPress 2.3.  This really gets to the heart of the issue.  Should keywords/tags and categories be managed through the same table at the data layer?  I think this may be a good approach since it allows for the best of both worlds.  Two distinct user interface components could be created to allow for both the hierarchical, more structured semantic meaning achieved through categories and the ad hoc, more finely grained meaning captured by tags.
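
As a rough illustration of the shared-table idea, here is a minimal Python sketch using the standard `sqlite3` module.  The schema is a simplified, hypothetical one (the table names `term` and `entry_term`, the `kind` column, and the helper functions are assumptions made for this example); it is not the actual WordPress 2.3 schema or anything in the DotNetNuke blog module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    kind      TEXT NOT NULL CHECK (kind IN ('category', 'tag')),
    parent_id INTEGER REFERENCES term(id)  -- NULL for tags and top-level categories
);
CREATE TABLE entry_term (
    entry_id INTEGER NOT NULL,
    term_id  INTEGER NOT NULL REFERENCES term(id),
    PRIMARY KEY (entry_id, term_id)
);
""")


def add_term(name, kind, parent_id=None):
    """Insert a category or tag into the one shared table and return its id."""
    cur = conn.execute(
        "INSERT INTO term (name, kind, parent_id) VALUES (?, ?, ?)",
        (name, kind, parent_id),
    )
    return cur.lastrowid


def terms_for(entry_id, kind):
    """Names of the terms of one kind attached to a blog entry."""
    rows = conn.execute(
        "SELECT t.name FROM term t JOIN entry_term et ON et.term_id = t.id "
        "WHERE et.entry_id = ? AND t.kind = ? ORDER BY t.name",
        (entry_id, kind),
    )
    return [name for (name,) in rows]


places = add_term("Places", "category")
dallas = add_term("Dallas", "category", places)  # child of Places
steak = add_term("steak", "tag")                 # flat, ad hoc keyword
conn.executemany("INSERT INTO entry_term VALUES (?, ?)", [(1, dallas), (1, steak)])
```

In this sketch, a category picker would read `terms_for(1, "category")` and a tag box would read `terms_for(1, "tag")`, so two separate UI components can sit on top of the same two tables.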

Thoughts Regarding Implementing a Categorization Interface in the Blog Module

I like the interface that Rip has suggested.  I can see that for people with long lists of categories, finding the right category may be time consuming, especially if a node in the tree contains a large number of entries.  Of course, if I had to guess, I would say that most people using the blog module for publishing will have relatively simple category hierarchies.  In that case, requiring users to select categories through an AJAX textbox would actually be more time consuming.  Another possible metaphor for category selection would be a treeview with checkboxes for easily assigning multiple categories.  For smaller category lists, this would allow the user to assign categories with a few clicks.
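
To picture the treeview alternative, here is a plain-text mock in Python.  It is only a sketch of the data a checkbox tree would need (a hierarchy plus the set of checked paths); the function name `render_tree` and the sample categories are assumptions, and a real module would of course use a proper web control.

```python
from typing import Dict, List, Set, Tuple


def render_tree(tree: Dict[str, dict],
                checked: Set[Tuple[str, ...]],
                path: Tuple[str, ...] = ()) -> List[str]:
    """Render a category hierarchy as indented lines with checkbox markers."""
    lines = []
    for name in sorted(tree):
        full = path + (name,)
        box = "[x]" if full in checked else "[ ]"
        lines.append("    " * len(path) + f"{box} {name}")
        lines.extend(render_tree(tree[name], checked, full))  # children, one level deeper
    return lines


categories = {"Places": {"Dallas": {}, "Paris": {}}, "Foods": {"Steak": {}}}
selected = {("Places", "Dallas"), ("Foods", "Steak")}
print("\n".join(render_tree(categories, selected)))
```

Each click in such a tree toggles one path in the `selected` set, which is why a short hierarchy can be assigned in a few clicks.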

What do you think?  That's what I'm really interested in knowing!  Even if you're not using a blog module, customers of any site today expect to see standard forms of categorization.  Should categories be separate from tags, or could a new interface be used which combines both?  Or maybe you have a blog entry of your own that's been brewing in your mind!  Join the conversation and let us know what you think!

[2] While I love the data structure used in WordPress, I would love to see it extended to support some of the rich semantic meaning that people may want to apply down the road.  Of course, I don't want to violate the YAGNI principle, but it would be nice to allow keywords to be identified as synonyms of other keywords.

2 comment(s) so far...

I think that providing separate UIs for adding tags and categories to a blog post would get the best from both strategies, but could become time consuming. Having a single powerful interface that handles both tags and categories would be a great solution, although this may lead to some limitations on flexibility. We also have to remember one thing: blogging platforms are used by thousands of people who simply don't want to learn the difference between tags and categories or learn to use two different UIs.

By Dario Rossa on   5/2/2008 1:02 PM

Excellent points, Dario. I think it's easy to lose sight of the end user, and yet that's the most important factor in the overall success of any application. And I agree, the end user needs something that's easy to use and intuitive, something you don't even have to learn because the simplicity of the design guides you to the right use. Another thing I had forgotten in this whole conversation is that there will most likely be an infrastructure built into DotNetNuke in some future release of Cambrian (the next release of DotNetNuke, the first version of which is due out Q2 of 2008). That's something to consider too. Even with this feature coming in a future release of DotNetNuke, I think a public conversation about this feature is helpful.

By host on   5/3/2008 11:52 AM
