June 27, 2005

prime tags?

Sig Rinde talks about tags. And he has created a new gizmo that allows you to play with his ideas:

Even with tags we easily become overwhelmed and would require some data structure to find our way. Technorati follows 1.3 million tags now!
Every person on this planet has a tag: a name, a social security number, etc. That makes 6.45 billion of them.
This experiment:
Uses multiple tag choices to choose and find.
And so what?
Using multiple tags, about 20 tags would cover the 1.3 million single-use tags at Technorati.
Using multiple tags, about 33 tags could give a unique identity to every person in the whole world.
(It’s been quite a few years since I studied statistics; I believe I’m in the ballpark, but could anybody out there corroborate?)
And 20 to 30 tags are less cumbersome to navigate than 1.3 million, or 6 billion!
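
A quick check of that arithmetic, as a sketch under one loud assumption that is not in the post: each tag is an independent on/off choice, so k tags can tell apart at most 2^k items. The helper name `tags_needed` is hypothetical; the two totals are the figures quoted above.

```python
def tags_needed(n_items: int) -> int:
    """Smallest k with 2**k >= n_items, assuming k independent on/off tags."""
    if n_items < 1:
        raise ValueError("n_items must be positive")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, in exact integer math.
    return (n_items - 1).bit_length()


technorati_tags = 1_300_000        # figure quoted in the post
world_population = 6_450_000_000   # figure quoted in the post

print(tags_needed(technorati_tags))   # 21
print(tags_needed(world_population))  # 33
```

So the ballpark holds: 21 rather than 20 for the Technorati figure, and 33 for the population figure. Real tags overlap and rarely split a collection evenly, so in practice the count would likely be higher.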
Multiple tags can replace any single tag, however unique that is.
You’re tagged with your name. That does not say much, does it? Unless I know you, of course.
Now try multiple tags. Add 10 tags (red hair, tall, birthplace, etc.) and you may be one of 153,000 with the exact same tag set. Add yet another one that says more about you, say ‘Italian speaking’, and voilà, only 9,675 individuals have the same tags. Add one more: now 634 identicals. Add two more, and ‘highlighting’ exactly those 14 tags gives one return: you.
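
The narrowing-down described here can be sketched as a subset test over tag sets. This is only an illustration: the people, their tags, and the function name `matching` are invented, and nothing here is taken from Sig's actual gizmo.

```python
from typing import Dict, Iterable, List, Set


def matching(items: Dict[str, Set[str]], selected: Iterable[str]) -> List[str]:
    """Return the names of items that carry every selected tag."""
    wanted = set(selected)
    return sorted(name for name, tags in items.items() if wanted <= tags)


# Hypothetical sample data, purely for illustration.
people = {
    "anna":  {"red hair", "tall", "born in Oslo", "Italian speaking"},
    "bjorn": {"red hair", "tall", "born in Oslo"},
    "carla": {"red hair", "tall", "Italian speaking"},
    "dag":   {"red hair", "short", "born in Oslo"},
}

print(matching(people, ["red hair"]))
# ['anna', 'bjorn', 'carla', 'dag']
print(matching(people, ["red hair", "tall", "Italian speaking"]))
# ['anna', 'carla']
print(matching(people, ["red hair", "tall", "Italian speaking", "born in Oslo"]))
# ['anna']
```

Each added tag can only shrink the result or leave it unchanged, which is why a handful of well-chosen tags ends at a single match.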
Ditto for plants, ditto for the file structure on your computer: goodbye folders and search. Etc.
Add that a set of tags gives immediate (and complete!) information about the object. Far beyond what a two-dimensional system may give (first and middle name plus family name does not give much information).
And that is what knowledge is all about. Expand on that.
Time for a remake of Carl von Linn

25 Responses to “prime tags?”

  1. toby says:

    This is cute, and a very nice use of JavaScript and CSS. The real problem, however, with a simple Boolean search like this is how do we determine the 20 tags that can define any blog entry?
    Are there 20 or 30 keywords that, in combination, could describe any book ever written? Don’t “folksonomies” exist because it’s so much easier to freely tag things than to fit them into predefined buckets (whether it’s one bucket or multiple buckets)?
    Just some thoughts… any ideas how one would go about determining this “master list”?

  2. hugh macleod says:

    Interesting question. We’ve all heard of prime numbers. Is there such a thing as “prime tags”?
    And if so, what are they called, and how many of them are there?

  3. sig says:

    Toby, you hit the real issue there!
    And Hugh, another bull’s-eye, “prime tags” I like a lot! A bit like common denominators…
    And after all, why do people sometimes get into heated discussions, fights, even wars? Semantics, cultural differences in how we understand words…
    Most tags or keywords could probably be represented or replaced by another set of tags, perhaps tags that in themselves could shed more light on the “meaning” of the tag, keyword, or word of the author?
    Carl Linnaeus found that using family, form and so forth in the name would tell more about the plant than just “dandelion” (the French call it ‘piss en lit’, which tells a bit more :) ). And that is what is called ‘knowledge’: relationships between objects.
    In that sense folksonomies do not advance knowledge as such, even if they’re colourful, interesting, etc.
    Let’s keep this discussion going, this I like a lot! :-D

  4. Prime Tags
    raving lunacy
    Mine All Mine!!!!!

  5. William says:

    It’s easy to visualize how you can create a limited set of tags for a given set of things that can be classified into a nice taxonomy (e.g. “People” or “Books”). But can you really come up with a small set of tags that would classify “Everything”?

  6. Shelley Noble says:

    Now you’re talking. This is exactly what I was thinking was needed to classify blogs/webcasts so that the like-minded could find each other. I knew you’d know.

  7. Jon Husband says:

    I suspect that prime tags are the tags that have more rather than less commonly shared meaning in defined areas of human activity or knowledge … like “skates”, “sticks”, “rules”, “teams” etc. in hockey, or “yarn”, “stitches”, “patterns” for knitting, and so on.
    The confluence of structured taxonomies dancing back and forth with miscellaneous author-designated tags that reflect the participation and interaction of people sharing and searching for meanings or pertinent information … now there’s something I will watch develop with great interest.
    There once was a paper I read titled “Cooperative Classification and Communication Through Shared Metadata” .. I believe the author was the person (aah .. the wonder of Google: Adam Mathes) who came up with the term folksonomies, or is one of the people in that tribe who follow and lead the development of the understanding of folksonomies.

  8. Jon Husband says:

    Yer blog doesn’t like html … here’s the URL of said paper
    http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html

  9. Forthcoming says:

    prime tags or iffy overlapping tags

    Now we’re talking - Hugh coined

  10. hugh macleod says:

    Thanks, Jon. And let’s not forget Clay Shirky: “Ontology is Overrated: Categories, Links, and Tags”.
    http://shirky.com/writings/ontology_overrated.html

  11. Alex says:

    Hugh, your ability to pick labels for ideas that are both provocative and meaningful is truly impressive.
    That is a real skill… and ‘Prime Tags’ is an excellent example.

  12. Jorn Barger says:

    Sean McGrath recently posted on “semantic primes” (click my name for the link).
    If you Google “fractal thicket” you’ll find my approach to breaking down ‘composite’ semantics into basic concepts like person-place-thing.
    A Flickr pic with two people and a pizza could be tagged “person1 person2 thing1” for starters…

  13. Ric says:

    That Clay Shirky article reminded me of the ‘pathway problem’: where do you put a pathway across a park? If you build the path first, it’s like a hierarchical categorisation, in Clay’s words the ‘Yahoo’ approach. If you let the people use the park without pathways, they will create them for you. That is more the ‘Google’ approach, or del.icio.us tags, where the ‘pathways’ to information will be formed by people tagging links.

  14. sig says:

    Ric, will follow your suggestion. Junior promises that you (and everybody else) shall be able to add tags to any post and comment (but not remove them, of course) in the ‘experiment’… in the next version… soon (see geek definition of ‘soon’) :-)
    That’ll enable the pathway development nicely, perhaps…

  15. Sarah B says:

    There are an infinite number of prime numbers, of course, which is a reassuring thought when making the analogy between tags for information and these numerical building blocks (that even though information is categorised, it is not limited to finite categories).

  16. Wiliam says:

    The more I think about it, the less I think trying to limit the number of tags is going to work. Whatever set you come up with, there will always be that one situation where yet-another-tag is going to be required. I’m thinking a better approach might be to create classes of tags. (E.g. color: red-green-blue, genre: humor-drama-scifi, etc.) This would increase the ability to identify like things, as they would tend to use tags from the same set of classes. For instance, something referencing a person would use tags that describe hair color (blond), ethnicity (Hispanic), and so on.
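
One way to picture these classes of tags is as class-to-value pairs, so that "like things" are the ones described with the same classes. A minimal sketch under that assumption; the function names and the sample items are invented for illustration.

```python
from typing import Dict, Set

TaggedItem = Dict[str, str]  # tag class -> chosen value, e.g. "genre" -> "scifi"


def shared_classes(a: TaggedItem, b: TaggedItem) -> Set[str]:
    """Tag classes that both items use, whatever the values are."""
    return set(a) & set(b)


def agreement(a: TaggedItem, b: TaggedItem) -> float:
    """Fraction of shared classes on which the two items have the same value."""
    common = shared_classes(a, b)
    if not common:
        return 0.0
    same = sum(1 for cls in common if a[cls] == b[cls])
    return same / len(common)


# Hypothetical items.
person_a = {"hair color": "blond", "ethnicity": "Hispanic"}
person_b = {"hair color": "blond", "ethnicity": "Nordic"}
film = {"genre": "scifi", "color": "blue"}

print(sorted(shared_classes(person_a, person_b)))  # ['ethnicity', 'hair color']
print(agreement(person_a, person_b))               # 0.5
print(shared_classes(person_a, film))              # set()
```

Two items that share classes are comparable at all; the values then say how alike they are. New classes can be added at any time without disturbing the existing ones.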

  17. hugh macleod says:

    William, nobody’s saying the number of tags should be limited to a certain number. Sig’s point was how surprisingly few are needed in order to handle large amounts of information.
    And the tags will be created from the bottom up, not the top down, I’m guessing.

  18. Personal Ontologies

    Whew!

  19. Matt says:

    William’s idea of creating classes of tags might be called a tagsonomy. :-)
    There are techniques for automatically gleaning “concepts” from a textual document that have been in use for quite some time by knowledge management software. The idea is to avoid requiring people to add their own tags. It has been a while since I followed the market, so all of the company names I used to know have disappeared in the Great Popping of the dot-com Bubble, but Intellisophic has something like what I’m talking about:
    http://www.intellisophic.com/content.php
    Fascinating stuff.

  20. Jeff Zugale says:

    As far as I know, my name is a prime tag, because as far as I can tell, I’m the only person on earth who has my name, or has ever had it. My last name is pretty rare, I know almost everyone who has it in the US, and there are very few in Europe from what I can gather.
    So there can be simple prime tags that are unique identifiers.

  21. toby says:

    It seems to me that an interesting solution would be to take the way people are tagging things and algorithmically determine the optimal set of tags.
    The only algorithms I’ve seen run on folksonomies so far have been simple co-occurrence metrics, which tell you “related” tags. I have a couple of ideas of how individual tags can be aggregated, which I’m trying out. I’ll keep you posted.
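
For reference, a simple co-occurrence metric of the kind mentioned here can be sketched as counting how often two tags appear on the same item. This is a generic illustration, not toby's method; the function names and the sample posts are invented.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple


def cooccurrence(tag_sets: Iterable[Set[str]]) -> Counter:
    """Count how many items each unordered pair of tags appears on together."""
    counts: Counter = Counter()
    for tags in tag_sets:
        # Sorting gives every pair one canonical (a, b) order with a < b.
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts


def related(tag: str, counts: Counter, top: int = 3) -> List[Tuple[str, int]]:
    """Tags that most often share an item with `tag`."""
    scores: Counter = Counter()
    for (a, b), n in counts.items():
        if a == tag:
            scores[b] += n
        elif b == tag:
            scores[a] += n
    return scores.most_common(top)


# Hypothetical tagged posts.
posts = [
    {"tags", "folksonomy", "blogs"},
    {"tags", "folksonomy"},
    {"tags", "taxonomy"},
    {"blogs", "cartoons"},
]

counts = cooccurrence(posts)
print(counts[("folksonomy", "tags")])  # 2
print(related("tags", counts, top=1))  # [('folksonomy', 2)]
```

Raw counts favour popular tags; a real system would probably normalise by how common each tag is, which this sketch leaves out.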

  22. hugh says:

    [NOTE TO SELF:] Stick to cartooning. This is so out of your league…

  23. Marty says:

    Learning which properties or tags act as good identifiers is big in machine learning, and there are several algorithms for it. Finding the ‘prime tags’ for some finite group of items is a matter of finding which tag most effectively splits the overall group into smaller subgroups, recursively, until you are left with unique results. Decision trees are a popular type of machine learning classifier and are an excellent example of this technique.
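
The recursive splitting described in this comment can be sketched as follows. It is a simplified stand-in, not a real decision-tree learner: it assumes the "most effective" tag is the one that divides the current group most evenly, where actual learners use measures such as information gain. The names `best_split` and `prime_tags` and the sample items are invented.

```python
from typing import Dict, List, Optional, Set

Items = Dict[str, Set[str]]  # item name -> its tags


def best_split(names: List[str], items: Items) -> Optional[str]:
    """Tag whose presence/absence divides `names` most evenly, or None."""
    candidates = set().union(*(items[n] for n in names))
    best, best_score = None, 0
    for tag in sorted(candidates):  # sorted for deterministic tie-breaking
        with_tag = sum(1 for n in names if tag in items[n])
        score = min(with_tag, len(names) - with_tag)  # size of the smaller side
        if score > best_score:
            best, best_score = tag, score
    return best


def prime_tags(items: Items, names: Optional[List[str]] = None) -> Set[str]:
    """Tags used while splitting recursively until each group is one item."""
    names = sorted(items) if names is None else names
    if len(names) <= 1:
        return set()
    tag = best_split(names, items)
    if tag is None:  # the remaining items have identical tag sets
        return set()
    yes = [n for n in names if tag in items[n]]
    no = [n for n in names if tag not in items[n]]
    return {tag} | prime_tags(items, yes) | prime_tags(items, no)


# Hypothetical items; "human" is on every item, so it can never split anything.
items = {
    "a": {"human", "red", "tall"},
    "b": {"human", "red"},
    "c": {"human", "tall"},
    "d": {"human"},
}

print(sorted(prime_tags(items)))  # ['red', 'tall']
```

Here two tags are enough to tell four items apart, and the tag shared by everyone is never chosen because it separates nothing.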

  24. More Generic Tags Please

    Hugh has a post on gapingvoid that describes a really good way to make tags more useful for browsing and searching for information. Basically, instead of a single tag it is better to use multiple generic tags. I agree with what he is saying and want to

  25. I think that if you consider a formal taxonomy, you’ll have your idea of prime tags. In fact, it’s one of the ways that Tim Berners-Lee intended digital marking (tags) to be used, I believe, when he wrote his original article on the Semantic Web.
    An ontology is a superset of a taxonomy (a taxonomy is the hierarchical, object-oriented view of knowledge categories: your prime tags, especially if you leave off several layers of leaf nodes). A taxonomy becomes a knowledge representation only when each level of its hierarchy is filled with many enumerations of data. If you just take the taxonomy (and, as I suggested, cut off several layers of leaf nodes), then you have basically a nice tree structure of prime tags.
    It could work, you know …
    Chuck
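
The "cut off several layers of leaf nodes" idea can be sketched with a taxonomy held as nested dictionaries, where an empty dictionary marks a leaf. The representation, the function names, and the tiny plant taxonomy are all assumptions made for illustration.

```python
from typing import List


def drop_leaf_layer(tree: dict) -> dict:
    """Return a copy of the tree with its current leaf nodes removed."""
    return {
        name: drop_leaf_layer(children)
        for name, children in tree.items()
        if children  # an empty dict is a leaf, so it is dropped
    }


def prune(tree: dict, layers: int) -> dict:
    """Remove `layers` successive layers of leaf nodes."""
    for _ in range(layers):
        tree = drop_leaf_layer(tree)
    return tree


def node_names(tree: dict) -> List[str]:
    """All node names in depth-first order: the candidate prime tags."""
    names: List[str] = []
    for name, children in tree.items():
        names.append(name)
        names.extend(node_names(children))
    return names


# Hypothetical mini-taxonomy.
taxonomy = {
    "plants": {
        "asteraceae": {"taraxacum": {"dandelion": {}}},
        "rosaceae": {"rosa": {"dog rose": {}}},
    }
}

print(node_names(prune(taxonomy, 1)))
# ['plants', 'asteraceae', 'taraxacum', 'rosaceae', 'rosa']
print(node_names(prune(taxonomy, 2)))
# ['plants', 'asteraceae', 'rosaceae']
```

What remains after pruning are the broad categories, which is roughly the tree of prime tags suggested in this comment.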