Jerome Pesenti

No, Search is Not Broken

Earlier this month, Tom Foremski of Silicon Valley Watcher wrote a piece on search titled “Is Search Broken?” lamenting the effort required of humans to make search better. It seems to me that Foremski does not show that search is broken, but merely that search can always be improved. As I argued in my previous post, the sky is the limit. Search is infinitely perfectible and “solving search” would amount to creating an omniscient being. So anything that can be used should be used to make search better.

People and computers are good at very different tasks. People are very good at inferring a lot from just a little information (e.g., knowing that a blog post of a few lines is insightful). By contrast, computers are good at making large amounts of information usable (e.g., by finding the one document you want among billions matching your keywords, or by giving an overview of thousands of documents through clustering). The kind of input highlighted in Foremski’s post is the human kind, which will always be invaluable to computers.

I would love to support the implicit conclusion that “enterprise search” is smarter than web search because it does not need as much human input to provide the same results. Unfortunately, the reality is not that simple. Enterprise search also needs to leverage additional information (metadata) that humans provide, directly or indirectly: authors, dates, departments, sources, access logs, etc. can and should be used (when available) to make the content more findable.

The big question raised by the article is “how much human effort” should be spent to improve search. The Web in that respect has three big advantages over the corporate world – the Web is:

• Self-motivated
• Self-regulated
• Search aware

In other words, in the Internet world, people are motivated to make their content findable. The amount of effort they spend to do this is on par with their motivation and the effectiveness of the method, and they know that the primary way to access the information they produce is through search.

By contrast, in corporations, employees have little incentive to share what they know (perhaps it makes them more replaceable?); consequently, efforts to make their content more reusable are typically driven by regulations rather than self-motivation. There is no guarantee that these regulations are effective, and more often than not they are not, as they are based on old, library-centric pre-search information access models based on static taxonomies and manual indexing.

I’m curious as to what Foremski would say about the billions of dollars spent by corporations to manage the way their employees produce information, “management” which has largely been an impediment to good enterprise search rather than the improvement it should have been.

In summary, the real problem is not that search engines require “human input” to be better; after all, human input is what all valuable content is. Smart search engines should leverage all human inputs, and content (especially metadata) should always be generated with search in mind.

Discussion

  1. Barbara Wagner wrote:

    To Jerome Pesenti

    You said “they are based on old, library-centric pre-search information access models based on static taxonomies and manual indexing”

    Since the 1960s I have been retrieving, and continue to retrieve, information via computers (and otherwise) inside and outside of libraries. At Western Reserve University, I studied with founders of information science before they went to Pittsburgh, using first-generation computers and IBM card readers–but it wasn’t about the equipment. I learned about information retrieval when the first professional information science association was still called the American Documentation Institute and before there were courses in computer science. And I was searching the Net before the Web existed — long before the verb became a noun.

    I know that searching techniques have not been static nor have they relied on manual indexing; they have evolved immensely and still do, as your studies should have shown you.

    Your term “library-centric pre-search information access” implies that searching did not exist before computer scientists “invented” it using the Internet. That is not true.

    Library-centric is not “pre-search.”

    Librarians and information scientists may have used some different terminology for some of your IT-based terms, but they understood and understand how the search process works — whether computer-based or not. Plus they understand user needs and their searching behavior.

    To use your terminology, librarians are the interface between users and mechanisms for retrieving information that guide them to use those tools most effectively. I see this every day. You can have the most sophisticated computer algorithms for searching the Web, but that does not mean users can automatically use them effectively on their own. Librarians have to make the abstract searching theories work in the real world.

    I do appreciate Clusty as I appreciated Lycos and Northern Light and other innovative search engines, because they make my work at the “user interface” easier.

  2. Jerome Pesenti wrote:

    Barbara, by “library” I just meant a “controlled set of content” and did not suggest that it meant “pre-search” (I used the two terms in combination, not redundantly). I am also aware that information retrieval has been around for a very long time (the term was coined in 1950!), that librarians were using search tools long before I was, and that today they are often the biggest proponents of introducing modern search tools in corporations and organizations.

    The whole point of my post was to entice corporations to invest more in search technology and information access rather than in controlling content generation.

  3. Raya wrote:

    I agree with Jerome

  4. Paul Brown wrote:

    I agree with Jerome, but would add that as search technology improves, it will become more ‘human’ centered.

  5. Don Jones wrote:

    “I agree with Jerome, but would add that as search technology improves, it will become more ‘human’ centered”

    Google would certainly have you believe this

  6. Waleed wrote:

    Yes you are absolutely right, nothing is impossible for Google !!

  7. izwan wrote:

    Well, I agree with you. Most search engines, such as Google, now put more effort into improving their search. Currently, Google gives some penalty to most directory-type websites.

  8. Success wrote:

    Google are continuously striving for the “perfect search model” and have done a pretty good job but it is far from perfect. The Web 2.0 model gives some insight into how things will evolve. You are right to say that search engines should take account of all human input and humans must always put the searcher uppermost in their minds when deciding upon search criteria for their sites. It’s a two way street.

  9. Alan Crawford - Tampa, Florida wrote:

    Judging by the price of their stock … they sure are doing something right.
