No, Search is Not Broken
Earlier this month, Tom Foremski of Silicon Valley Watcher wrote a piece on search titled “Is Search Broken?” lamenting the human effort required to make search better. It seems to me that Foremski does not show that search is broken, but merely that search can always be improved. As I argued in my previous post, the sky is the limit. Search is infinitely perfectible, and “solving search” would amount to creating an omniscient being. So anything that can be used should be used to make search better.
People and computers are good at very different tasks. People are very good at inferring a lot from just a little information (e.g., knowing that a blog post of a few lines is insightful). Computers, by contrast, are good at making large amounts of information usable (e.g., by finding the one document you want among billions matching your keywords, or by giving an overview of thousands of documents through clustering). The kind of input highlighted in Foremski’s post is the human kind, which will always be invaluable to computers.
I would love to support the implicit conclusion that “enterprise search” is smarter than web search because it does not need as much human input to provide the same results. Unfortunately, the reality is not that simple. Enterprise search also needs to leverage additional information (metadata) that humans provide directly or indirectly: authors, dates, departments, sources, access logs, and so on. When available, this metadata can and should be used to make the content more findable.
The big question raised by the article is “how much human effort” should be spent to improve search. The Web in that respect has three big advantages over the corporate world – the Web is:
• Self-motivated
• Self-regulated
• Search aware
In other words, in the Internet world, people are motivated to make their content findable. The amount of effort they spend to do this is on par with their motivation and the effectiveness of the method, and they know that the primary way to access the information they produce is through search.
By contrast, in corporations, employees have a low incentive to share what they know (perhaps it makes them more replaceable?); consequently, efforts to make their content more reusable are typically driven by regulations rather than self-motivation. There is no guarantee that these regulations are effective, and more often than not they aren’t, because they rely on old, library-centric, pre-search models of information access built on static taxonomies and manual indexing.
I’m curious as to what Foremski would say about the billions of dollars spent by corporations to manage the way their employees produce information. That “management” has largely been an impediment to good enterprise search rather than the improvement it should have been.
In summary, the real problem is not that search engines require “human input” to be better; after all, all valuable content is human input. Smart search engines should leverage all human inputs, and content (especially metadata) should always be generated with search in mind.