The past, present and future of tags.

To illustrate how badly I digress while writing… this entry started off with the title “I hate social networks”. Go figure.

In some ways I am mildly obsessive-compulsive about organizing stuff. Which means that I’ve sorted things in boxes, categories, hierarchies, systems. It always breaks.

I buy paperback books because then I know they all fit in the same shelves (maybe also because they are cheaper), and I try to get all books in a series from the same printing (or at least the same publisher) so the cover art/bookends fit nicely. I can sort and pack stuff effectively – until I end up with all the small, irregular stuff that can’t be stacked and doesn’t have enough similar items to be packed together.

When moving I spend 90% of the time on the last 10% of stuff. Excluding cleaning… that too takes 90% of the time, most of it spent procrastinating. Perhaps I spend so much time sorting and packing the last 10% of stuff so I can avoid the cleaning for a bit more?

Back to digital sorting and stacking.

I’ve tried partitioning hard disks, building hierarchies of folders, naming systems for documents, lists and whatnot. For blogs I’ve tried hierarchical categorizing (until multiple root folders contain like-named leaves), no categorizing (teh horror) and lots of categories (tags without the actual usability).

Anyway, with the advent of tags I feel I am getting closer to something usable. Broad categories where each item is part of one, and lots of tags attached to each item. Only three problems remain.

  1. The categories are always wrong
  2. There are always tags missing on the items
  3. Tags end up nearly duplicated, so that they don’t tie like items together

Now, the last gripe might be fixed by regularly maintaining the tags, and having a system for defining synonyms (a bit of work, easier with the right “tool”).
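
A system for synonyms can be quite small. Here is a minimal sketch in Python, assuming a hand-maintained table that maps each variant to one canonical tag. The table contents and the function names are made up for illustration; they are not from any particular blog tool.

```python
# Hypothetical synonym table: each near-duplicate maps to one canonical tag.
# Someone still has to maintain this by hand, which is the "bit of work".
SYNONYMS = {
    "blogs": "blog",
    "blogging": "blog",
    "weblog": "blog",
    "open-id": "openid",
}


def normalize_tag(tag):
    """Lowercase and trim a tag, then replace it with its canonical form."""
    key = tag.strip().lower()
    return SYNONYMS.get(key, key)


def normalize_tags(tags):
    """Normalize a list of tags, dropping duplicates but keeping the order."""
    seen = []
    for tag in tags:
        canonical = normalize_tag(tag)
        if canonical not in seen:
            seen.append(canonical)
    return seen
```

Run over every item, something like `normalize_tags(["Blogs", "weblog", "tags"])` would collapse the first two into a single `blog` tag, so like items end up tied together again.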

Missing tags are fixed by regularly maintaining tags and items, which can easily evolve into a monster. It can be handled if older items grow towards a suitable set of tags over (a not too long) time. One feature that has started to be included in solutions for the Internet is the ability for the community to help you tag your items. This lets items (images, blog entries, links) mature in their descriptions faster if your community is involved. If you let them. E.g. Google lets you play a game and help them tag images.

The first problem listed above is fixed by solving an impossibly hard problem: few and general (broad) categories. Yes, that easy – and still hard to do.

I really want to look into Topic Maps. But that seems like too much work to maintain. Although the possibilities are alluring.


Identity, trust and OpenID

I recently wrote about OpenID on my journal, and I left kind of an overwhelmingly positive attitude simmering around the post. I still think you should read it, or something authoritative on what OpenID is, if you don’t know what it is. No, I don’t think OpenID is a silver bullet that will cure all identity and trust evilness on the web. OpenID itself was once presented as being about identity, not trust, on the grounds that trust requires identity. I don’t think OpenID goes very far towards machine-readable identity, but it does go towards human-readable identity.

Identifying as a URL instead of with a nickname and a password isn’t that much of an improvement. It proves you have access to some kind of account that lets you deploy web pages, and a pipeline to an OpenID server that accepts you as a user. Since you can host your own OpenID server and web space is cheap (as in free) – this isn’t very comforting. Any service allowing users to authenticate comments with nothing more than an OpenID will find itself swarmed with spam in no time.

This is known, and will be handled by conventional spam filtering, CAPTCHAs, requiring users to create an account (identified by OpenID) instead of allowing anonymous submission (with OpenID signatures), and other means.

In addition someone will try (or already has, what do I know?) white- and/or blacklisting. As I see it, this might or might not totally destroy what OpenID is all about. Blacklisting will probably work, but doing it will be walking on the edge. Blacklisting each individual account will be wasted resources. Each account is just one URL, and spammers can generate one for each comment and never run dry, even if they change the actual account. Delegation would probably be used, so you would need to keep lists on the endpoints as well. Blacklisting domains could be done, but it would require a bit of finesse or human intervention. Spammers manage to get hold of legitimate accounts, or accounts on legitimate hosts… one wrong step and you’d have blocked a legitimate, and possibly popular, provider.
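
To make the delegation point concrete, here is a rough sketch in Python of a domain blacklist that checks both the URL the user signs with and the endpoint that actually vouches for it. Everything in it is an assumption for illustration: the blocked host, the function names, and the idea that the service has already discovered the endpoint URL. It is not how any particular OpenID library does it.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist of domains, maintained by hand (or with finesse).
BLOCKED_HOSTS = {"spam.example"}


def host_of(url):
    """Return the lowercase host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_blocked(host, blocked=BLOCKED_HOSTS):
    """True if the host itself or any parent domain is on the blacklist."""
    parts = host.split(".")
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))


def allow_login(claimed_id, endpoint):
    """Reject the login if either the claimed URL or the endpoint is blocked.

    With delegation the two can live on different hosts, so checking
    only the URL the user typed is not enough.
    """
    return not (is_blocked(host_of(claimed_id)) or is_blocked(host_of(endpoint)))
```

Note that matching on parent domains is exactly the step that can go wrong: put a popular free host on the list and every legitimate user on it is blocked too.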

Whitelisting would be impossible to do without wrecking it all. One of the cornerstones of OpenID is that you can set up your own provider, but if it were blocked by everyone from the start… Ok, that wouldn’t do, would it?

Where OpenID will, and does, work is between humans. If I write something, say in a WordPress blog… Incidentally I do… and start signing (in and by) as the URL of this blog, then people will know that I wrote the [whatever] I signed, and they can look up where I keep my identity and see what I am about. And what I write. That I am (most likely) a human being. This will lead to trust.

The only thing missing is spammers making sure their comments seem genuine, and leading readers on a click-through chase to a spam/ad page by way of the OpenID URL… Oh, well. Hope they don’t get that idea from me.