Raul Valdes-Perez

Introducing Clustering 2.0

Vivisimo introduced high-quality text clustering into the search engine market in the year 2000, after a couple of years of computer science research on new algorithms by the founders at Carnegie Mellon. The research breakthrough was labelling the clusters, i.e., grouping search results into folder topics. Before that breakthrough, search result clusters had poor labels and so the technology was unusable. The technology was first demonstrated on a university website and later at vivisimo.com, with excellent reviews.

A couple of years later, Vivisimo’s computer scientists developed a way to add linguistic knowledge, to help detect similarity that the clustering algorithms would otherwise miss, and to prevent false similarities. For example, people’s language skills let them realize that kill, murder, slay, and gun down are pretty similar concepts, but make a killing is different (for you non-native speakers, make a killing is about making a large profit), and put the gun down is different too. This engineering breakthrough greatly improved the clustering performance with practical amounts of internal development, for English and other languages (try Japanese here or here).
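To make the kill / make a killing distinction concrete, here is a toy sketch in Python. It is not Vivisimo's implementation: the two lookup tables, the concept tokens (`KILL`, `PROFIT`, `DISARM`), and the `normalize` function are all hypothetical names invented for illustration. The idea shown is only that idioms are protected first, and true synonyms are merged afterwards.

```python
import re

# Hypothetical, hand-written linguistic knowledge (an assumption for this sketch).
# Idioms that merely look like "kill" words get their own concept token.
PROTECTED_PHRASES = {
    "make a killing": "PROFIT",
    "put the gun down": "DISARM",
}
# Words and phrases that should count as the same concept.
SYNONYMS = {
    "gun down": "KILL",
    "kill": "KILL",
    "murder": "KILL",
    "slay": "KILL",
}

def normalize(text):
    """Rewrite text so that similar concepts share one token before clustering."""
    text = text.lower()
    # Protected idioms are handled first, so "put the gun down" never reaches
    # the "gun down" synonym rule. Longer phrases go before shorter ones.
    for table in (PROTECTED_PHRASES, SYNONYMS):
        for phrase in sorted(table, key=len, reverse=True):
            pattern = r"\b" + re.escape(phrase) + r"\b"
            text = re.sub(pattern, table[phrase], text)
    return text

print(normalize("Gangsters gun down rival"))
print(normalize("Investors make a killing"))
print(normalize("Police told him to put the gun down"))
```

After normalization, a clustering step would see the first headline as being about `KILL`, while the other two are kept apart from it.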

An earlier blog post goes into more detail on the state of the art of this Clustering 1.0, as well as the end-user value.

Now on to Clustering 2.0!

Although clustering reveals the major topics in the top 200, 500, or more search results, there are always more topics than can be shown without overloading the user with a very long list. There hasn’t been any better approach, until now.

With a single click, remix clustering answers the question: What other, subtler topics are there? It works by clustering again the same search results, but with an added input: ignore the topics that the user just saw. Typically, the user will then see new major topics that didn’t quite make the final cut at the last round, but may still be interesting.
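For readers who think in code, the remix idea can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the patent-pending algorithm: the "clustering" here just groups result titles by shared word, and the names `cluster`, `remix`, `ignored_topics`, and `max_folders` are made up for the example. The only point it demonstrates is the added input: cluster the same results again, but skip the topics already shown.

```python
from collections import defaultdict

def cluster(results, ignored_topics=frozenset(), max_folders=2):
    """Toy clustering: one folder per shared word, largest folders first."""
    folders = defaultdict(list)
    for title in results:
        for term in set(title.lower().split()):
            if term not in ignored_topics:  # the extra input used by remix
                folders[term].append(title)
    # Rank by folder size; break ties alphabetically so the output is stable.
    ranked = sorted(folders.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return dict(ranked[:max_folders])

def remix(results, seen_topics, max_folders=2):
    """Cluster the same results again, ignoring topics the user just saw."""
    return cluster(results, frozenset(seen_topics), max_folders)

# Made-up search results for illustration only.
results = [
    "steelers game tonight",
    "steelers roster",
    "steelers tickets",
    "penguins tickets",
    "penguins roster",
    "museums guide",
]

first = cluster(results)
second = remix(results, seen_topics=first.keys())
print(list(first))   # folders shown initially
print(list(second))  # folders that did not make the first cut
```

In this toy data the second round surfaces folders that were just below the cutoff in the first round, which is the behavior described above.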

Remix clustering was introduced in Vivisimo Velocity 6.0, our enterprise search platform, which also introduced other user-experience capabilities.

To see remix clustering in action, try searching Clusty.com for our company’s hometown of Pittsburgh. Look at the folders, then click on Remix to the right of the top folder. Notice how you can dig deeper and “tour” Pittsburgh effortlessly, just by remixing. Or pick a topic that you are familiar with, and notice how repeated Remixing will turn up an interesting but unfamiliar topic or two.

To clarify: I’ve been asked whether remix clustering is only for when none of the folder topics looks interesting. Not at all! When I select a book off the bookshelf, it doesn’t mean that every prior book I saw was uninteresting. Instead, it just means that I want to see what else there is. Same thing here: What other topics are there?

The name of the game in search interfaces is to empower the user to see more, effortlessly, and avoid the curse of Information Overlook (pdf – an old thought piece). Clustering 2.0 plays the game very well.

By the way, an obligatory reminder: remix clustering is patent pending.

Trackbacks & Pingbacks

  1. ResourceShelf » Vivísimo Adds Clustering 2.0 to the Mix, More Clusters Visibile With a Single Click wrote:

    [...] Learn More Here [...]

  2. Using Clusty For Blog Content and Research : The Blog Herald wrote:

    [...] the recent announcement of Clustering 2.0, Vivisimo and Clusty have gone even farther: A couple of years later, Vivisimo’s computer [...]

  3. Internal Organizational Search « The Summary-Makers Insight wrote:

    [...] Here’s a blog you can reference http://searchdoneright.com/2008/01/introducing-clustering-2.0/ [...]

  4. Clustering Data: Generating Organization from the Ground Up - SpellboundBlog.com - spellbound by archival science and information technology in the digital age wrote:

    [...] right hand corner of the cluster list and you get a new list of clusters. In a blog post titled Introducing Clustering 2.0 Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix: With a single click, [...]

  5. To take away » Blog Archive » Clustering Data: Generating Organization from the Ground Up wrote:

    [...] right hand corner of the cluster list and you get a new list of clusters. In a blog post titled Introducing Clustering 2.0 Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix: With a single click, [...]

  6. Geekovation » Blog Archive » Personalized Search and Disambiguation - The answers to search engine relevance wrote:

    [...] have been grouped into one or more clusters (cars, photos, clubs, cat panthera, etc). Clusty uses a suite of text clustering technologies that allows it to identify and label clusters in real time. This is an example of post-search [...]

  7. Social Computing » Archive » Vivisimo, un nouveau leader du secteur Enterprise Search wrote:

    [...] par KMWorld L’introduction de nouveaux modules comme le Velocity Discovery Module et le clustering 2.0, Le recrutement du 100 ième employé, La signature de 7 contrats de partenariats technologiques [...]

  8. Finding information on the Web: 25 Metasearch Engines wrote:

    [...] other: metasearch with clustering (description of their clustering technology) [...]

Discussion

  1. xD wrote:

    cool

  2. East Texas wrote:

    Job well done 8)

  3. Marcelo wrote:

    Congratulations for this great project!

    Att,
    Marcelo

  4. riri wrote:

    I guess clusty has been remixed! :)

  5. j-j wrote:

    I like Clusty very much!
    Go on!

  6. Hans wrote:

    Clusty is great, I would suggest that someone try to publicize it better. I have been using it for 1.5 years and virtually never ‘google’ anymore. Citizens of Western nations need this tool for their search inquiries, I would encourage Clusty staff to create ways to get the word out.

  7. Inquisitive One wrote:

    Clusty, Clusty…. you are so sweet and fulsomely satisfying for hungry seekers. Beseech your friends! Clusty is fun, powerful, intriguing, and so very clever.

    Vivisimo folks…… you should offer stickers, bike and car size… “I utilize clusty.com in an exceptionally gratifying way” or words to that effect. I’d buy that for a dollar!

  8. Bharatesh P Jain wrote:

    I use clusty along with google always. particularly when i need categorized results. Nice Search engine.

  9. Suvendu Sahoo wrote:

    Nice job.

    Before remixing, clusty should ask for additional search input and then further re-cluster to closer search results.

  10. Shailini wrote:

    Great for narrowing down searches. I shall definitely be using Clusty often!
    Clusty ought to promote the site eg. bus-back ads, cabs, magazines and on other websites.

  11. BigDaddy wrote:

    This is really a big breakthru. Whenever I google it displays a googol of results all of which I am or ne1 lse will even try to check neway. I read about clusty 3 yrs back and the very first search I done with it, it totally impressed me that I got my answer in a second which otherwise I wud have to google 4 5 0r 10 mins lookin thru all the stuffs. Now the remix brings a resight into this new revolutionary search engine. XCLLENT CLUSTY. Hats off to ur work!

  12. coxy wrote:

    I’m quite confused as to what the remix feature does – I can’t see it doing anymore than showing more clustered topics.

    I like Clusty though. It’s a permanent fight between Clusty and Yahoo! for my Number 1 search engine with no clear winner.

  13. Govind Singh wrote:

    It sounds like the I’m feeling lucky like thing. Let me just check it out!!
    Great work with clusty btw…keep it up! :)

  14. Paul wrote:

    Clusty is cool.

  15. stone lobster wrote:

    Great strides on my favorite search engine. On a new browser I first change the “google” search box to point to Clusty, and I use “cluster” instead of “google” as a verb. (“I mean search with Clusty.com instead of Google,” I say. “It’s much better.”)

  16. K. Howard wrote:

    Only thing I miss is being able to refine results and search “within” the original search term for more specific results.

  17. Web Developer wrote:

    Two years ago we were integrating open source search technology, with little success. Integrating the clustering algorithms of today, currently only viable in major statistics packages with web search is an impressive step.

  18. boxe wrote:

    Nice, another reason to keep using Clusty . . .

  19. Francois Chevalier wrote:

    have been pleased with vivisimo in the past.
    may try clusty.
    thanks
    FC

  20. C wrote:

    I was looking for a google alternative, and happened to find vivisimo.com. I have been using Clusty consistently and have not really used google or the like for anything in pretty close to two years now. Clusty searches information, not popular opinion. Great idea, great work!

  21. darren wrote:

    clusty rules

  22. darren p wrote:

    i now use clusty instead of google

  23. Rob Young wrote:

    I’m surprised no one has mentioned Carrot2. Although the cluster is no where near as intelligent as that of Clusty it’s a good comparison and an excellent way of learning how clustering works.

  24. Cindy wrote:

    I am a die-hard Vivisimo/Clusty fan. I have been using it exclusively for approximately two years. No more Yahoo or Google for me. I have recommended it repeatedly to all the firms, companies and agencies that I do business with. Many of which have converted also. Kudos! I am so very pleased with your work and continued efforts.

  25. Philip Mugford wrote:

    I have used Vivisimo for many years. To me, the new Clusty does not seem as nice as the original vivisimo.com. In particular, each item could be displayed in a separate window, so that when you exited it you would always end up back at your original Vivisimo search. Now when you exit, everything is gone and you have to start all over again, and thus have to enter vivisimo.com to get it back.

    One feature I liked was its ability to highlight your search words in all its search results. However, I wish Vivisimo would retain this feature when I click on an item to display its contents, so that the search words would also be highlighted in the final result.

    For example, how often have I seen a Vivisimo hit that looked like exactly what I was looking for, but when I brought up that item the key words were no longer highlighted, and I would spend hours in a large document trying to find those references in vain. It always made me wonder why Vivisimo would indicate the presence of those key words when I could not find them.

    So is it possible that this highlighted key word feature could go down at least one more level?

    Also, since your original Vivisimo will soon no longer be available, how can I adjust Clusty preferences to give me the ability to switch to a separate window, so my original search results will still be there when I exit?

    Thanks – Phil

  26. raquel samper wrote:

    we are the jewish community center in murcia spain. we use vivisimo to find jews in spain in murcia . its amazing how you can re-present data and then its easily digested
    thanks
    rov tadot

  27. Brian R Leith wrote:

    I used Clusty once ~ it was enough for me. I immediately made it my home page and now use it exclusively. Clusty is a gem, it also comes before Google (rightly so, and alphabetically), in my “Favourites”. Why not more marketing? For a start the logo would make a great soft badge. Keep up the good work. Brian Leith

  28. Valkier wrote:

    Better than google, but clusty images still isnt as good as google images. weird. Plus, how much data does clusty collect, if any? that would be the nail in the coffin for google if im not being spied on anymore.

    all in all, great search engine.

  29. Claes wrote:

    Is it possible that our sites can use cluster.com as ose ?

  30. analogstuff wrote:

    I was using clusty search from about 2 years and i am impressed with its power though. Some times i got some good results for the search which even google did not show. Thanks for improving this search engine.

  31. kelly jenness wrote:

    I would like to get my website into Clusty – how can we do it?
    Is there a clusty bot or way to index it in your search engine.

  32. TastyBurp wrote:

    Very cool portal.
    Nice to see information clustered in a useful way.

    Now, if I could only get my PDA to work with me in the same way.
    (hint)

  33. satoshi wrote:

    This is a great tool, I have to admit.

    This makes it possible to find webs that cannot be found by google or yahoo, who may deliberately ignore some info.

    This shall be a step forward in the “No MS, no google” movement.

  34. AskApache wrote:

    Great informative article! I think clustering could be the coolest thing to hit search since google, thanks!

  35. Malter wrote:

    Best search engine. Clusty needs more webpages, or rather a larger number of search results, though the first few results are great and satisfying. Still, sometimes when I gave a generalized search word, it should have come up with the most common results, but it didn’t. So work on the most viewed or most generalized searches, so that the popular sites come up first.

  36. betty down wrote:

    Super search engine, I like this search engine.

  37. Bonc wrote:

    nice work, but was the example with kill, guns and murder really necessary?

    Bonc…

  38. aaron wrote:

    I’ve tried google, yahoo, live search, & even others like dogpile & ask but none compare to clusty. It is easy for researching & not confusing in anyway. Now its even better I don’t even know what to say.

  39. Murali wrote:

    Clusty tool bar for Firefox is one of the best tool I have ever seen. We are looking forward for this tool bar to have compatibility with future Firefox versions also.

  40. Dominic wrote:

    I’ve tried clusty and found that it may be suitable to use in one of the IT Projects of our University. Therefore, I want to know more about it and also the licensing model. I did leave my contact info in Vivisimo website, but so far still haven’t heard from your company. Could you kindly follow up for me? Thanks!

  41. alex wrote:

    if u want clusty to succeed u should make it as fast as google no matter where u r located in….. google works for most ppl because it’s regional customized

  42. Web Crawler wrote:

    First heard about Clusty @ bbc, featured in “Click Online”. I like it. Best of luck to you guys.

  43. Niyaz wrote:

    Out with a great performance and ability :D .. The Toolbar for firefox is a great one ;)

  44. Chris wrote:

    Lovin the Clusty. I’m going to try to make it my default search engine. Just a matter of training my fingers now :) Keep up the good work!

  45. k a rahman wrote:

    Super search engine, I like clusty.
    thank you

  46. Searcher wrote:

    Discovered Clusty by accident whilst checking my log files – VERY impressed!

  47. Fernando wrote:

    Clusty in Spanish??? I would very much like that. I know Clusty through a friend of canada and I really liked the system that is much much faster to find what they are looking for and not anything else. I apologize because I am using a translator.

    Greetings

  48. Regina wrote:

    It’s my favorite search engine. Too bad it can’t be added to Firefox without also adding another cumbersome toolbar!

  49. holtje wrote:

    Regina wrote:

    It’s my favorite search engine. Too bad it can’t be added to Firefox without also adding another cumbersome toolbar!

    Not true!

    1. Go to clusty.com
    2. Note that your search bar (defaults to google) has a button/drop-down that is glowing.
    3. Click on the button/drop-down.
    4. Select “Add Clusty” from the choices.

    Tada! Now you can even set it as your default search engine in Firefox!

    Ciao!

  50. rreid wrote:

    Clusty, I love your search engine.

    It’s like Google was before they turned into sellouts.

  51. ayappan wrote:

    Awesome search engine….better than google.com
    I know vivisimo since 2001

  52. Oli wrote:

    good job guys. this is one of the best search engines out there, good looking out, with u guys i am protected. keep up the good work!!!

    Peace!!

  53. Jithin BK wrote:

    Good work…
    Awesome and superb search engine…
    Clusty…
    I love it a lot!!!!

  54. Mario wrote:

    I did a review of various search engines a few years ago, and found clusty to be one of the best search engines I found. It has the best balance of usability and features out there. I generally find what I’m looking for in two or three clicks due to it’s categorization of search results. People can’t believe how fast I find things. In fact, it’s my homepage for all my computers. And no…I’m not a shill for the company. I just wanted to support clusty’s existence since I don’t pay for the service and I’ve been using it for years now.

  55. Soulinhell wrote:

    A class professor introduced the class to vivisimo some years ago during a computer lab session. The advantages were immediate and obvious: clustered results, no more piles of info scrawling across the monitor. Great search engine and service, many thanks.

    Next step would be to build in a few advanced search parameters like those you find in the high-end library database search programs.

    Again-many thanks.

  56. Boje wrote:

    its cool and well done

    thank u for all ur support

    rgds
    boje

  57. Rafiq wrote:

    Great project, congrats!

  58. alex wrote:

    why a fabric logo?

  59. nic wrote:

    im new in clusty but it’s cool :D
    i like this :D

  60. Moshe Yudkowsky wrote:

    Remix is very interesting, but your FAQ does not include any information about it.

    And there’s no explanation of how to return to the un-remixed original results — do I have to re-search, or is there a simpler way?

  61. kaitlyn wrote:

    i like clusty! i am doing a science fair project, and whenever the teacher lets us go on our own laptops, i go straight to clusty.com! it’s gangster… like me! -lol-

  62. Jason wrote:

    I think that this is a great site. It is nice to see that there is another website where we can find information for the classroom!

  63. Corey wrote:

    Good site. Since 2000 and I’ve never heard about it until today. Very cool concept.

  64. Edwin Escobar wrote:

    It’s great. I could find the specific and exact information I needed. Not even google gives such detailed information as Clusty. Cool!

  65. Jithesh wrote:

    Clusty; just in simple words…its an intelligent search engine…

    Congrats to all technical team behind the scene…

    Thanks
    Jithesh.V

  66. joe wrote:

    Very Interesting. I would think its only a matter of time until we see more technologies like this incorporated into the top search engines.
