In Search Of...

31 August 2001 07:02 PM
Tags: search, query, result, site

From the capital of Togo to a Hang Seng IPO, it's on the Web--if only you can find it. We review 30 search engines that make the hunt easier.

Looking for the actor who played Pentangeli in Godfather II? Rainfall patterns in the Amazon? The e-mail address of the high-school sweetheart you've lost touch with? All of that is on the Web--buried among two billion indexable pages.

PC Magazine Auz

Two billion is an impressive number, but even more impressive is that no matter how obscure a fact you're looking for, it's usually not that hard to find.

That's where search engines come in. The better ones have been diligent about keeping up with the explosion of Web pages, "crawling" them via automated agents. These agents examine and index pages' content, or, when the entire page cannot be processed, the metadata hidden in the sites' HTML tags. Google boasts the largest database, with 560 million fully indexed and 500 million partially indexed pages. AltaVista, Fast Search, and Northern Light each claim over 300 million.
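To give a feel for what "metadata hidden in HTML tags" means, here is a minimal Python sketch that pulls the title and meta tags out of a page using the standard library's html.parser module. It is an illustration only, not how any engine reviewed here works; the class name, function name, and sample page are all made up for the example.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects the <title> text and name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Store entries such as keywords and description by lower-cased name.
            self.meta[name.lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html_text):
    """Return (title, meta dict) for one page of HTML."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.title.strip(), collector.meta


# Hypothetical page, invented for this example.
SAMPLE_PAGE = """<html><head>
<title>Amazon Rainfall Patterns</title>
<meta name="keywords" content="amazon, rainfall, climate">
<meta name="description" content="Monthly rainfall figures for the Amazon basin.">
</head><body><p>Body text the crawler could not process.</p></body></html>"""

title, meta = extract_metadata(SAMPLE_PAGE)
print(title)
print(meta["keywords"])
```

A real crawler would also have to fetch the page, follow its links, and cope with malformed markup, all of which this sketch leaves out.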

Size isn't everything, however. Many factors contribute to a search engine's usefulness, from its interface to the algorithms it uses to process the information its crawlers dig up. For this article, we put 30 general-purpose search sites through their paces, checking how well they responded to a wide range of search phrases, including "open-source software", "Buffy the Vampire Slayer", "what does WAP stand for?" and "trade relations with China".

No matter what we searched for or where, one thing was clear: search sites need to standardise query syntax. Though most of the sites we tested let you do phrase searches and use plus or minus signs to include or exclude terms, support varies for advanced features such as Boolean queries (the ability to specify AND, NOT, and OR), nested Boolean queries (the ability to search for tennis and [Rafter or Hewitt], where all the answers will include tennis and at least one top Aussie player), domain queries (answers must come from dot-org sites, for example), and wildcards. A suggestion: if you use a site regularly, spend a few minutes reading its advanced search tips.
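The tennis example can be made concrete with a short Python sketch. It assumes a toy inverted index (a map from each word to the pages containing it) and writes the nested query as tuples rather than in any particular site's syntax; the documents and function names are invented for illustration.

```python
def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a nested query given as a single word or a tuple.

    Tuples look like ("and", a, b), ("or", a, b) or ("not", a),
    and the operands may themselves be tuples.
    """
    if isinstance(query, str):
        return index.get(query.lower(), set())
    operator, *operands = query
    results = [evaluate(operand, index, all_ids) for operand in operands]
    if operator == "and":
        return set.intersection(*results)
    if operator == "or":
        return set.union(*results)
    if operator == "not":
        return all_ids - results[0]
    raise ValueError(f"unknown operator: {operator}")


# Hypothetical pages, invented for this example.
DOCUMENTS = {
    1: "Rafter wins tennis semi-final",
    2: "Hewitt tennis ranking climbs",
    3: "tennis racquet reviews",
    4: "Hewitt signs sponsorship deal",
}
INDEX = build_index(DOCUMENTS)

# tennis and [Rafter or Hewitt]
nested = ("and", "tennis", ("or", "Rafter", "Hewitt"))
print(sorted(evaluate(nested, INDEX, set(DOCUMENTS))))  # [1, 2]
```

Page 3 mentions tennis but neither player, and page 4 mentions Hewitt but not tennis, so both drop out. The sketch splits text on whitespace only, so a word with punctuation attached would not match.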

Search engines are statistical systems. They return exact and partial matches based on a document's probable relevance as calculated by the engine's search algorithm. Though Web directories such as the Open Directory Project or Yahoo! tend to do better with broad-topic searches such as online mortgages, search engines--particularly those with large databases--respond better to specific searches. This is because the directories are hierarchical lists that categorise the Web like a library's card catalogue. They work well for the big picture, but the farther down you drill, the less likely they are to have indexed the page you need.

Another relevance test we applied is a search site's ability to target a specific site's home page. If you type in Ford and car, you'd want www.ford.com to show up on the first page of results, certainly before a site where someone is selling his own used Falcon.

Unless you have a unique search term, use as many words as possible to describe your query. For instance, a search for Carnivore brings up everything from the video game Carnivores 2 to an open-air specialty-meat restaurant in Nairobi. Adding fbi to the query helps narrow results to the controversial e-mail surveillance system of the same name. One of the most effective ways of improving search results is to put quotation marks around words that should be searched as a phrase.
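Here is a rough Python sketch of how plus signs, minus signs, and quoted phrases narrow a result list, using the Carnivore example. It makes two simplifying assumptions that real engines may not share: unsigned terms are treated as required, and matching is a plain case-insensitive substring test. The page texts and function names are invented.

```python
import re

# An optional sign, then either a quoted phrase or a single word.
TERM_PATTERN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into lists of required and excluded terms."""
    required, excluded = [], []
    for sign, phrase, word in TERM_PATTERN.findall(query):
        term = (phrase or word).lower()
        if sign == "-":
            excluded.append(term)
        else:
            # Assumption: unsigned terms are required, just like + terms.
            required.append(term)
    return required, excluded


def search(query, pages):
    """Return pages containing every required term and no excluded term."""
    required, excluded = parse_query(query)
    hits = []
    for page in pages:
        text = page.lower()
        has_required = all(term in text for term in required)
        has_excluded = any(term in text for term in excluded)
        if has_required and not has_excluded:
            hits.append(page)
    return hits


# Hypothetical page texts, invented for this example.
PAGES = [
    "Carnivores 2 is a video game about hunting dinosaurs",
    "The FBI defends its Carnivore e-mail surveillance system",
    "Carnivore restaurant in Nairobi serves specialty meat",
]

print(search("carnivore", PAGES))                # all three pages
print(search("carnivore +fbi", PAGES))           # only the FBI page
print(search('carnivore -"video game"', PAGES))  # the FBI and restaurant pages
```

Because the quoted phrase is kept as one term, "video game" only excludes pages where the two words appear together in that order.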

Specialised search engines and metasearch engines can be useful for narrowing your focus or casting your net wider. Specialised engines concentrate on more specific data sets, such as corporate information. Metasearch engines send your query to multiple search engines and collate the results. See the sidebar "Metasearch and Specialised Search" for more.
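As a loose illustration of what "collate the results" can involve, the Python sketch below merges ranked lists from several engines, removes duplicates, and orders pages by the sum of their reciprocal ranks. That scoring rule is a choice made for this example, not a description of how any metasearch site actually works, and the engine names and URLs other than www.ford.com are placeholders.

```python
def collate(results_by_engine):
    """Merge ranked URL lists into one list, best combined score first.

    Each URL earns 1/rank from every engine that returns it, so pages
    ranked highly by several engines rise to the top.
    """
    scores = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical engines and results, invented for this example.
SAMPLE_RESULTS = {
    "engine_a": ["www.ford.com", "example.org/used-falcon", "example.com/ford-history"],
    "engine_b": ["example.com/ford-history", "www.ford.com"],
}

for url in collate(SAMPLE_RESULTS):
    print(url)
```

Here www.ford.com comes first because both engines rank it near the top, while the used-Falcon page, returned by only one engine, comes last.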

Once you get the drill down, search engines can be amazingly discerning tools for sorting the gold from the pyrite on the Web--and if you don't know what pyrite is, try typing +gold +pyrite into the engine of your choice.

Talkback 1 comment

    Matilda Keith -- 04/01/06 (in reply to #120126443)

    The correct address for Matilda is
    http://www.matildasearch.com
    Is there any chance you could replace your content with a simple link?
    Keith
