Chapter 8: Search Engines

Search Engine

How do search engines work?

        Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks:
  • They search the Internet -- or select pieces of the Internet -- based on important words.
  • They keep an index of the words they find, and where they find them.
  • They allow users to look for words or combinations of words found in that index.
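The last two tasks in the list above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine is built: the PAGES dictionary, its example.com URLs, and the function names build_index and search are all hypothetical, and the crawling step is assumed to have already happened. The structure shown, a mapping from each word to the places it was found, is commonly called an inverted index.

```python
from collections import defaultdict

# Hypothetical stand-in for pages a crawler has already fetched.
# The URLs and text are made up for illustration.
PAGES = {
    "http://example.com/a": "search engines index words",
    "http://example.com/b": "engines crawl the web",
    "http://example.com/c": "users search the index",
}


def build_index(pages):
    """Record each word found and the URLs where it was found."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then narrow the set
    # down with each additional word (a simple AND query).
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
```

Calling search(index, "search index") returns the two pages that contain both words, while a word that was never indexed returns an empty set. A real engine would also rank the results and handle punctuation, which this sketch leaves out.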
        Early search engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand inquiries each day. Today, a top search engine will index hundreds of millions of pages, and respond to tens of millions of queries per day. In this article, we'll tell you how these major tasks are performed, and how Internet search engines put the pieces together in order to let you find the information you need on the Web.

         Without search engines, it would be virtually impossible to locate anything on the Web unless you already knew its specific URL. But do you know how search engines work? And do you know what makes some search engines more effective than others?
When people use the term search engine in relation to the Web, they are usually referring to the search forms that search through databases of HTML documents initially gathered by a robot.

Did You Know...         The first tool for searching the Internet, created in 1990, was called "Archie". It downloaded directory listings of all files located on public anonymous FTP servers, creating a searchable database of filenames. A year later, "Gopher" was created. It indexed plain text documents. "Veronica" and "Jughead" came along to search Gopher's index systems. The first actual Web search engine was developed by Matthew Gray in 1993 and was called "Wandex".

