Colin Dean

It’s All About the Metadata, Baby: Search Engine Optimization On The Intranet

Search engine optimization (SEO) is a hot topic these days. SEO consultants are paid well for their expertise in telling a company how to improve its web search engine ranking. A high rank often means more traffic, which means more business, which means more money.

However, SEO consultants focus primarily on the World Wide Web. After all, it’s only the public that needs to find information on the company’s website, right?

Wrong. Employees need to find information, too. Vivísimo and other vendors entered the enterprise search market years ago, and their growing number of clients shows that employees need to find data within the company’s intranet sites and storage networks using a unified, easy-to-use interface.

One thing SEO consultants often stress is the use of metadata on a page. A search engine uses metadata – literally data describing data – as a way of presenting to the searcher a description of the content without having to analyze the content to extract information such as an author, date of last modification, subject, keywords or a snippet. This can reduce the amount of time it takes to index a page or site. More important, it improves the relevance of a result by optimizing the document for the way that search engine crawlers and indexers work. Metadata is thus a key part of search engine optimization.

Web crawlers such as those run by Google and Yahoo are generally not customized to crawl a specific website or document format. They extract as much information as possible from a document’s metadata, e.g. the <meta> tags in HTML or the properties embedded in PDF files. They probably cannot read a document in an obscure format or in a customized XML format devised in house.
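To make the idea concrete, here is a minimal sketch of how a crawler might read header metadata from an HTML page. It uses only Python's standard library. The sample page, the class name MetaTagExtractor and the function extract_meta are invented for illustration and do not describe how any particular search engine works.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects name/content pairs from <meta> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Ignore <meta> tags without a name, such as charset declarations.
        if name and content is not None:
            self.metadata[name.lower()] = content


def extract_meta(html_text):
    """Return a dict of the named <meta> tags found in html_text."""
    parser = MetaTagExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Hypothetical intranet page used only for this example.
SAMPLE_PAGE = """
<html><head>
<title>Travel Expense Policy</title>
<meta charset="utf-8">
<meta name="author" content="Finance Department">
<meta name="keywords" content="travel, expenses, reimbursement">
<meta name="description" content="How to file travel expense reports.">
</head><body><p>Policy text goes here.</p></body></html>
"""

if __name__ == "__main__":
    for key, value in extract_meta(SAMPLE_PAGE).items():
        print(f"{key}: {value}")
```

A page with no such tags yields an empty dictionary, which is the situation the next paragraph addresses.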

This is where enterprise search engines like the Vivísimo Velocity Search Platform excel – behind the corporate firewall. They can be configured to read a wide variety of formats. An administrator can even customize the solution to pluck metadata from a dynamic area of a page, so that even pages that lack header metadata can have extended metadata displayed in a search result.

Corporate web designers can assist their administrators by using HTML class or id attributes to mark the key areas of a page that hold metadata or other information useful to a searcher. A well-structured, database-backed content management system (CMS) is beneficial, primarily because it can be crawled in two ways: through the web and directly from the database. Velocity can obviously crawl web sites, but it can crawl databases even faster.
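As a rough illustration of the first point, the sketch below pulls text out of elements that a designer has marked with agreed-upon id attributes. The id values (doc-author, doc-updated, doc-summary), the class name and the sample markup are all hypothetical. A real enterprise search product would be configured through its own administration tools rather than with code like this.

```python
from html.parser import HTMLParser

# Hypothetical id values agreed between designers and the search administrator.
METADATA_IDS = {"doc-author", "doc-updated", "doc-summary"}

# Elements with no closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class MarkedRegionExtractor(HTMLParser):
    """Captures the text inside elements whose id is in wanted_ids."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.fields = {}
        self._current = None  # id of the region being captured, if any
        self._depth = 0       # nesting depth inside that region

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids:
            self._current = element_id
            self._depth = 1
            self.fields[element_id] = ""

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace before storing the final value.
            text = self.fields[self._current]
            self.fields[self._current] = " ".join(text.split())
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_marked_regions(html_text, wanted_ids=METADATA_IDS):
    parser = MarkedRegionExtractor(wanted_ids)
    parser.feed(html_text)
    parser.close()
    return parser.fields


# Hypothetical page body with no header metadata at all.
SAMPLE_PAGE = """
<body>
  <div id="nav"><a href="/home">Home</a></div>
  <span id="doc-author">Jane <b>Example</b></span>
  <span id="doc-updated">2009-03-14</span>
  <p id="doc-summary">Quarterly
     sales figures.</p>
</body>
"""

if __name__ == "__main__":
    for field, value in extract_marked_regions(SAMPLE_PAGE).items():
        print(f"{field}: {value}")
```

The design choice worth noting is that the page author and the search administrator only have to agree on a handful of id values. The page layout can change freely as long as those ids stay in place.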

A major hurdle for any search engine crawler is JavaScript links. Some CMS solutions, both commercial and homegrown, use JavaScript links for navigation. Velocity and other search solutions can handle JavaScript links, but doing so requires extensive configuration and testing. It is much easier and less time-consuming to use a CMS that generates standard anchor tags with href attributes.
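The difference is easy to see in a simple link collector. The sketch below is a deliberately naive crawler step, not a description of how Velocity or any other product follows links. The sample markup and the host name intranet.example are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Separates plain href links from links that need script execution."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.crawlable = []  # absolute URLs a simple crawler can follow
        self.skipped = []    # links that only work by running JavaScript

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = (attributes.get("href") or "").strip()
        is_script = href.lower().startswith("javascript:")
        if href and href != "#" and not is_script:
            self.crawlable.append(urljoin(self.base_url, href))
        else:
            self.skipped.append(href or attributes.get("onclick") or "")


def collect_links(html_text, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector


# Hypothetical navigation block mixing both link styles.
SAMPLE_NAV = """
<a href="/policies/travel.html">Travel policy</a>
<a href="javascript:openPage('hr')">HR</a>
<a href="#" onclick="loadSection(7)">Benefits</a>
<a href="handbook/index.html">Handbook</a>
"""

if __name__ == "__main__":
    result = collect_links(SAMPLE_NAV, "http://intranet.example/docs/")
    print("Can follow:", result.crawlable)
    print("Cannot follow without extra work:", result.skipped)
```

Only the two ordinary anchors produce URLs that can be queued for crawling. The HR and Benefits pages stay invisible unless the crawler is taught what openPage and loadSection do.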

Another major hurdle for crawlers is Adobe Flash content. While Velocity can index Flash content, it is often at the mercy of the Flash content designer. Extensive use of Flash reduces the crawl-readiness of a site, but fortunately, the presence of metadata on the HTML page used to display the Flash content can offset the inherent precariousness.

Human-friendly URLs, shortened URLs for easy vocal sharing, and canonical URLs for content that can be reached through several different URLs are examples of important coding and design practices that every SEO consultant will encourage. General content policies can improve the relevance of intranet searches as well: provide metadata on all content, including HTML, PDF, word processor and spreadsheet documents, and above all practice the most important optimization of all – excellent content.
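One way a page can declare its canonical URL is a <link rel="canonical"> element in the HTML head. The sketch below shows how an indexer could use that declaration to group duplicate addresses. It is an illustration under assumed inputs: the URLs, the PAGES dictionary and the function names are invented, and nothing here is taken from a specific search product.

```python
from html.parser import HTMLParser


class CanonicalLinkFinder(HTMLParser):
    """Finds the first <link rel="canonical" href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def canonical_url(fetched_url, html_text):
    """Return the declared canonical URL, or the fetched URL if none is declared."""
    finder = CanonicalLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.canonical or fetched_url


def group_by_canonical(pages):
    """Map each canonical URL to the list of fetched URLs that resolve to it."""
    groups = {}
    for url, html_text in pages.items():
        groups.setdefault(canonical_url(url, html_text), []).append(url)
    return groups


# Hypothetical crawl results: fetched URL -> page source.
TRAVEL_HEAD = (
    '<head><link rel="canonical" '
    'href="http://intranet.example/policies/travel"></head>'
)
PAGES = {
    "http://intranet.example/doc?id=42": TRAVEL_HEAD,
    "http://intranet.example/policies/travel": TRAVEL_HEAD,
    "http://intranet.example/news/today": "<head><title>News</title></head>",
}

if __name__ == "__main__":
    for canonical, urls in group_by_canonical(PAGES).items():
        print(canonical, "<-", urls)
```

Grouping this way lets the index hold one entry per document instead of one per address, so the same policy page does not crowd a result list several times over.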

Clearly, SEO is not just for the Internet. Applying general SEO practices to intranet sites is just as important, yet it is often overlooked. The heuristics of SEO are just as applicable behind the firewall.

Discussion

  1. Charlie wrote:

    Metadata is a very important aspect of both sites and files such as images, PDFs, and even music. One bonus of an intranet-only search engine is that you can trust the metadata; some of the more shady public-site webmasters will add meta content that doesn’t match the page, just to try to get a higher ranking.

    Back to enterprises and intranet sites, however: maybe Vivísimo’s technology will change how IT professionals and end users see and utilize the data about their data.

  2. Rob wrote:

    Two quick hits: 1) intranet designers and engineers must automate the ability to quickly set up meta tags with viable and descriptive values; I do this all the time with automated content. 2) Even more important than automated meta values is the lifecycle feedback on “how” search results are used and “what value” is put back into the search realm, e.g. ranking search results either manually (via vote) or automatically (via popularity) goes a long way, especially when there is no metadata available. I hear Vivísimo does a very good job with this, but I don’t have hands-on experience with it.

