Raul Valdes-Perez

Introducing Clustering 2.0

Vivisimo introduced high-quality text clustering into the search engine market in 2000, after a couple of years of computer science research on new algorithms by the founders at Carnegie Mellon. The research breakthrough was labelling the clusters, i.e., grouping search results into folder topics. Before that breakthrough, search result clusters had poor labels, so the technology was unusable. The technology was first demonstrated on a university website and later at vivisimo.com, with excellent reviews.

A couple of years later, Vivisimo’s computer scientists developed a way to add linguistic knowledge, to help detect similarity that the clustering algorithms would otherwise miss, and to prevent false similarities. For example, people’s language skills let them realize that kill, murder, slay, and gun down are pretty similar concepts, but make a killing is different (for you non-native speakers, make a killing is about making a large profit), and put the gun down is different too. This engineering breakthrough greatly improved the clustering performance with practical amounts of internal development, for English and other languages (try Japanese here or here).
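
To make the idea concrete, here is a toy sketch in Python. It is not Vivisimo's implementation, which this post does not describe; the two lookup tables, the concept names (KILL, PROFIT, DISARM), and the function `normalize` are all hypothetical. It shows only the general principle: rewrite protected idioms first so they cannot be mistaken for something else, then map synonyms onto one shared concept before any clustering happens.

```python
import re

# Hypothetical, hand-written linguistic knowledge (assumption for illustration).
# Idioms that must NOT be treated as "kill" or "gun down":
PROTECTED_PHRASES = {
    "make a killing": "PROFIT",
    "put the gun down": "DISARM",
}
# Words and phrases that should be treated as the same concept:
SYNONYMS = {
    "gun down": "KILL",
    "kill": "KILL",
    "murder": "KILL",
    "slay": "KILL",
}

def _replace_all(text, table):
    """Replace each whole-word phrase in `table` with its concept label."""
    for phrase, concept in table.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", concept, text)
    return text

def normalize(text):
    """Lowercase the text, rewrite protected idioms, then merge synonyms.

    Concept labels are uppercase, so text that has already been rewritten
    cannot be matched again by a later (lowercase) pattern.
    """
    text = text.lower()
    text = _replace_all(text, PROTECTED_PHRASES)  # prevent false similarities
    text = _replace_all(text, SYNONYMS)           # detect missed similarities
    return text

print(normalize("Gangsters gun down rival"))
print(normalize("Investors make a killing"))
print(normalize("Police told him to put the gun down"))
```

The sketch ignores inflections (killed, murders) and everything else a real system would need; the point is only the order of the two passes.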

An earlier blog post goes into more detail on the state of the art of this Clustering 1.0, as well as the end-user value.

Now on to Clustering 2.0!

Although clustering reveals the major topics in the top 200, 500, or more search results, there are always more topics than can be shown without overloading the user with a very long list. There hasn’t been any better approach, until now.

With a single click, remix clustering answers the question: What other, subtler topics are there? It works by clustering the same search results again, but with an added input: ignore the topics that the user just saw. Typically, the user will then see new major topics that didn’t quite make the final cut in the last round, but may still be interesting.
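
The remix loop can be sketched in a few lines of Python. This is a deliberately crude stand-in, not the patent-pending algorithm: here a "topic" is simply a word that appears in at least two results, ranked by how many results mention it. The function names, the stopword list, and the sample results are all assumptions made for illustration. What the sketch does show faithfully is the added input: the same results are clustered again while the topics already seen are ignored.

```python
from collections import Counter

# Tiny hypothetical stopword list (assumption for illustration).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "for", "is", "on"}

def candidate_topics(results):
    """Count, for each non-stopword term, how many results mention it."""
    counts = Counter()
    for text in results:
        counts.update(set(text.lower().split()) - STOPWORDS)
    return counts

def cluster(results, query, seen=frozenset(), max_topics=3):
    """Return up to `max_topics` topic labels, most frequent first.

    Query words and every label in `seen` are ignored, which is the
    extra input that turns plain clustering into a remix.
    """
    counts = candidate_topics(results)
    ignore = set(query.lower().split()) | set(seen)
    ranked = sorted(
        (t for t in counts if t not in ignore and counts[t] >= 2),
        key=lambda t: (-counts[t], t),  # by frequency, ties alphabetically
    )
    return ranked[:max_topics]

def remix(results, query, seen, max_topics=3):
    """Cluster the same results again, skipping topics already shown."""
    new_topics = cluster(results, query, seen, max_topics)
    return new_topics, set(seen) | set(new_topics)

results = [
    "Pittsburgh Steelers football schedule",
    "Pittsburgh Steelers tickets",
    "Pittsburgh Steelers football news",
    "University of Pittsburgh admissions",
    "University of Pittsburgh football",
    "Pittsburgh Pirates baseball tickets",
    "Pittsburgh Pirates baseball schedule",
    "Pittsburgh museums and art",
    "Carnegie museums of Pittsburgh",
]

first = cluster(results, "Pittsburgh")
second, seen = remix(results, "Pittsburgh", set(first))
print(first)   # ['football', 'steelers', 'baseball']
print(second)  # ['museums', 'pirates', 'schedule']
```

Each call to `remix` takes the set of topics shown so far and returns the updated set, so repeated clicks keep surfacing topics that missed the cut in earlier rounds until none are left.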

Remix clustering was introduced in Vivisimo Velocity 6.0, our enterprise search platform, which also introduced other user-experience capabilities.

To see remix clustering in action, try searching Clusty.com for our company’s hometown of Pittsburgh. Look at the folders, then click on Remix to the right of the top folder. Notice how you can dig deeper and “tour” Pittsburgh effortlessly, just by remixing. Or pick a topic that you are familiar with, and notice how repeated Remixing will turn up an interesting but unfamiliar topic or two.

To clarify: I’ve been asked whether remix clustering is only for when none of the folder topics looks interesting. Not at all! When I select a book off the bookshelf, it doesn’t mean that every prior book I saw was uninteresting. Instead, it just means that I want to see what else there is. Same thing here: What other topics are there?

The name of the game in search interfaces is to empower the user to see more, effortlessly, and avoid the curse of Information Overlook (pdf - an old thought piece). Clustering 2.0 plays the game very well.

By the way, an obligatory reminder: remix clustering is patent pending.


Trackbacks & Pingbacks

  1. ResourceShelf » Vivísimo Adds Clustering 2.0 to the Mix, More Clusters Visibile With a Single Click wrote:

    […] Learn More Here […]

  2. Using Clusty For Blog Content and Research : The Blog Herald wrote:

    […] the recent announcement of Clustering 2.0, Vivisimo and Clusty have gone even farther: A couple of years later, Vivisimo’s computer […]

  3. Internal Organizational Search « The Summary-Makers Insight wrote:

    […] Here’s a blog you can reference http://searchdoneright.com/2008/01/introducing-clustering-2.0/ […]

  4. Clustering Data: Generating Organization from the Ground Up - SpellboundBlog.com - spellbound by archival science and information technology in the digital age wrote:

    […] right hand corner of the cluster list and you get a new list of clusters. In a blog post titled Introducing Clustering 2.0 Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix: With a single click, […]

  5. To take away » Blog Archive » Clustering Data: Generating Organization from the Ground Up wrote:

    […] right hand corner of the cluster list and you get a new list of clusters. In a blog post titled Introducing Clustering 2.0 Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix: With a single click, […]

Discussion

  1. xD wrote:

    cool

  2. East Texas wrote:

    Job well done 8)

  3. Marcelo wrote:

    Congratulations for this great project!

    Att,
    Marcelo

  4. riri wrote:

    I guess clusty has been remixed! :)

  5. j-j wrote:

    I like Clusty very much!
    Go on!

  6. Hans wrote:

    Clusty is great, I would suggest that someone try to publicize it better. I have been using it for 1.5 years and virtually never ‘google’ anymore. Citizens of Western nations need this tool for their search inquiries, I would encourage Clusty staff to create ways to get the word out.

  7. Inquisitive One wrote:

    Clusty, Clusty…. you are so sweet and fulsomely satisfying for hungry seekers. Beseech your friends! Clusty is fun, powerful, intriguing, and so very clever.

    Vivisimo folks…… you should offer stickers, bike and car size… “I utilize clusty.com in an exceptionally gratifying way” or words to that effect. I’d buy that for a dollar!

  8. Bharatesh P Jain wrote:

    I use clusty along with google always. particularly when i need categorized results. Nice Search engine.

  9. Suvendu Sahoo wrote:

    Nice job.

    Before remixing, clusty should ask for additional search input and then further re-cluster to closer search results.

  10. Shailini wrote:

    Great for narrowing down searches. I shall definitely be using Clusty often!
    Clusty ought to promote the site eg. bus-back ads, cabs, magazines and on other websites.

  11. BigDaddy wrote:

    This is really a big breakthru. Whenever I google it displays a googol of results all of which I am or ne1 lse will even try to check neway. I read about clusty 3 yrs back and the very first search I done with it, it totally impressed me that I got my answer in a second which otherwise I wud have to google 4 5 0r 10 mins lookin thru all the stuffs. Now the remix brings a resight into this new revolutionary search engine. XCLLENT CLUSTY. Hats off to ur work!

  12. coxy wrote:

    I’m quite confused as to what the remix feature does - I can’t see it doing anymore than showing more clustered topics.

    I like Clusty though. It’s a permanent fight between Clusty and Yahoo! for my Number 1 search engine with no clear winner.

  13. Govind Singh wrote:

    It sounds like the I’m feeling lucky like thing. Let me just check it out!!
    Great work with clusty btw…keep it up! :)

  14. Paul wrote:

    Clusty is cool.

  15. stone lobster wrote:

    Great strides on my favorite search engine. On a new browser I first change the “google” search box to point to Clusty, and I use “cluster” instead of “google” as a verb. (”I mean search with Clusty.com instead of Google,” I say. “It’s much better.”)

  16. K. Howard wrote:

    Only thing I miss is being able to refine results and search “within” original search term for more specific results.

  17. Web Developer wrote:

    Two years ago we were integrating open source search technology, with little success. Integrating the clustering algorithms of today, currently only viable in major statistics packages, with web search is an impressive step.

  18. boxe wrote:

    Nice, another reason to keep using Clusty . . .

  19. Francois Chevalier wrote:

    have been pleased with vivisimo in the past.
    may try clusty.
    thanks
    FC

  20. C wrote:

    I was looking for a google alternative, and happened to find vivisimo.com. I have been using Clusty consistently and have not used google or the like really for anything in pretty close to two years now. Clusty searches information, not popular opinion. Great Idea, Great work!

  21. darren wrote:

    clusty rules

  22. darren p wrote:

    i now use clusty instead of google

  23. Rob Young wrote:

    I’m surprised no one has mentioned Carrot2. Although the cluster is no where near as intelligent as that of Clusty it’s a good comparison and an excellent way of learning how clustering works.

  24. Cindy wrote:

    I am a die-hard Vivisimo/Clusty fan. I have been using it exclusively for approximately two years. No more Yahoo or Google for me. I have recommended it repeatedly to all the firms, companies and agencies that I do business with. Many of which have converted also. Kudos! I am so very pleased with your work and continued efforts.

  25. Philip Mugford wrote:

    I have used vivisimo for many years. To me, the new Clusty does not seem as nice as the original vivisimo.com. In particular, each item had the ability to be displayed in a separate window, so that when you exited it you would always end up back with your original Vivisimo search. But now when you exit everything is gone and you have to start all over again, and thus have to enter vivisimo.com to get it back.

    One feature I liked was its ability to highlight your search words in all its search results. HOWEVER, I wish Vivisimo would retain this feature when I click on an item to display its contents, so that the search words would also be highlighted in the final result.

    For example, how often I have seen a vivisimo hit that looks exactly like what I was looking for, but when I brought up that item the KEY words were no longer highlighted and I would spend hours in a large document trying to find those references in vain. It always made me wonder why Vivisimo would indicate the presence of those key words but I could not find them.

    So is it possible that this highlighted KEY WORD feature could go down at least one more level?

    Also, since your original vivisimo will soon no longer be available, how can I adjust CLUSTY preferences to give me the ability to switch to a separate window, so my original search results will still be there when I exit?

    Thanks - Phil

  26. raquel samper wrote:

    we are the jewish community center in murcia spain. we use vivisimo to find jews in spain in murcia . its amazing how you can re-present data and then its easily digested
    thanks
    rov tadot

  27. Brian R Leith wrote:

    I used Clusty once ~ it was enough for me. I immediately made it my home page and now use it exclusively. Clusty is a gem, it also comes before Google (rightly so, and alphabetically), in my “Favourites”. Why not more marketing? For a start the logo would make a great soft badge. Keep up the good work. Brian Leith

  28. Valkier wrote:

    Better than google, but clusty images still isnt as good as google images. weird. Plus, how much data does clusty collect, if any? that would be the nail in the coffin for google if im not being spied on anymore.

    all in all, great search engine.

  29. Claes wrote:

    Is it possible that our sites can use cluster.com as ose ?

  30. analogstuff wrote:

    I was using clusty search from about 2 years and i am impressed with its power though. Some times i got some good results for the search which even google did not show. Thanks for improving this search engine.

  31. kelly jenness wrote:

    I would like to get my website into Clusty - how can we do it?
    Is there a clusty bot or way to index it in your search engine.

  32. TastyBurp wrote:

    Very cool portal.
    Nice to see information clustered in a useful way.

    Now, if I could only get my PDA to work with me in the same way.
    (hint)

  33. satoshi wrote:

    This is a great tool, I have to admit.

    This makes it possible to find webs that cannot be found by google or yahoo, who may deliberately ignore some info.

    This shall be a step forward in the “No MS, no google” movement.

  34. AskApache wrote:

    Great informative article! I think clustering could be the coolest thing to hit search since google, thanks!

  35. Malter wrote:

    Best search engine. Clusty needs more webpages, or say a larger number of search results, though the first few results are great and satisfying. Still, sometimes when I gave a generalized search word, where it should have come up with some of the most common results, it wasn’t like that. So work on the most viewed or most generalized searches, and the site shown first should be the popular one.

  36. betty down wrote:

    super search engine, ı like the this search engine.

  37. Bonc wrote:

    nice work, but was the example with kill, guns and murder really necessary?

    Bonc…

  38. aaron wrote:

    I’ve tried google, yahoo, live search, & even others like dogpile & ask but none compare to clusty. It is easy for researching & not confusing in anyway. Now its even better I don’t even know what to say.

  39. Murali wrote:

    Clusty tool bar for Firefox is one of the best tool I have ever seen. We are looking forward for this tool bar to have compatibility with future Firefox versions also.

  40. Dominic wrote:

    I’ve tried clusty and found that it may be suitable to use in one of the IT Projects of our University. Therefore, I want to know more about it and also the licensing model. I did leave my contact info in Vivisimo website, but so far still haven’t heard from your company. Could you kindly follow up for me? Thanks!

  41. alex wrote:

    if u want clusty to succeed u should make it as fast as google no matter where u r located in….. google works for most ppl because it’s regional customized
