| An Internet search engine is a software program designed to search for information on the Internet. The search results are usually presented as a list, and the individual results are commonly called hits. The results may consist of web pages, images and other types of files. Some search engines also draw on data available in databases or open directories. Unlike Web directories, which are maintained by human editors, search engines operate automatically or combine algorithmic and human input.
Internet search engines operate by storing information about many web pages, which they retrieve from the Internet. These pages are retrieved by a Web crawler, also called a spider: an automated Web browser that follows every link it finds. The content of each page is then analyzed to decide how it should be indexed. Words, for example, are extracted from titles, headings or special fields called meta tags. Data about web pages are stored in an index database for use in later queries. Some search engines, such as Google, store all or part of the source page (referred to as a cache) along with information about the page, while others, such as AltaVista, store every word of every page they find. The cached page always holds the actual search text, since it is the version that was indexed. It can therefore be very useful, since it may hold data that is no longer available elsewhere.
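The indexing step described above can be illustrated with a minimal sketch in Python. It assumes the pages have already been fetched (the crawler itself is left out), and it indexes only words found in titles, headings and the keywords/description meta tags. The names `FieldExtractor`, `tokenize` and `build_index`, as well as the example URLs, are hypothetical and chosen only for illustration; real engines use far more elaborate analysis.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class FieldExtractor(HTMLParser):
    """Collects text from <title>, <h1>..<h6> and selected <meta> tags."""

    FIELDS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self._open = 0      # nesting depth inside title/heading tags
        self.chunks = []    # pieces of text selected for indexing

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self._open += 1
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() in {"keywords", "description"}:
                self.chunks.append(attr.get("content") or "")

    def handle_endtag(self, tag):
        if tag in self.FIELDS and self._open > 0:
            self._open -= 1

    def handle_data(self, data):
        if self._open:
            self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each extracted word to the set of URLs it appears on."""
    index = defaultdict(set)
    for url, html in pages.items():
        parser = FieldExtractor()
        parser.feed(html)
        parser.close()
        for word in tokenize(" ".join(parser.chunks)):
            index[word].add(url)
    return index


# Hypothetical, already-downloaded pages standing in for crawler output.
pages = {
    "http://example.com/a": (
        "<html><head><title>Search engines</title>"
        "<meta name='keywords' content='crawler, index'></head>"
        "<body><h1>How a spider works</h1>"
        "<p>Body text is ignored here.</p></body></html>"
    ),
    "http://example.com/b": (
        "<html><head><title>Web directories</title></head>"
        "<body><h2>Human editors</h2></body></html>"
    ),
}
index = build_index(pages)
```

A structure of this kind, which maps words to the pages containing them, is commonly called an inverted index. An engine that stores every word of every page would index the body text as well, rather than only the selected fields.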
When a user types search words into the search field, the engine checks its index and displays a list of the best-matching web pages according to its criteria, normally showing each document's title with a short summary and sometimes excerpts from the text. Some search engines provide an advanced option called proximity search, which allows users to define the distance allowed between search words.
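One possible way to support proximity search is to record the position of every word in every document, and then compare positions at query time. The sketch below makes that assumption; `build_positional_index` and `proximity_search` are hypothetical names, distance is measured in words, and the two example documents are invented for illustration.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(docs):
    """Map word -> document id -> list of word positions in that document."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index


def proximity_search(index, first, second, max_distance):
    """Return ids of documents where both words occur within max_distance words."""
    first_hits = index.get(first.lower(), {})
    second_hits = index.get(second.lower(), {})
    matches = []
    # Only documents containing both words can match.
    for doc_id in sorted(first_hits.keys() & second_hits.keys()):
        if any(
            abs(p - q) <= max_distance
            for p in first_hits[doc_id]
            for q in second_hits[doc_id]
        ):
            matches.append(doc_id)
    return matches


docs = {
    "doc1": "A search engine ranks web pages",
    "doc2": "Search the archive first; the engine room is elsewhere",
}
index = build_positional_index(docs)
near = proximity_search(index, "search", "engine", 1)  # adjacent words only
far = proximity_search(index, "search", "engine", 5)   # up to five words apart
```

With a maximum distance of 1, only `doc1` matches, because "search" and "engine" are adjacent there. Widening the distance to 5 also admits `doc2`, where the two words are five positions apart.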
The usefulness of a search engine depends on the relevance of the result set it returns. Since millions of web pages may contain a given word or phrase, some pages will be relevant to the query and others will not. The results are therefore ranked so that the "best" ones are displayed first.
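As a deliberately simple illustration of ranking, the sketch below scores each document by how often the query words occur in it and lists the highest-scoring documents first. This term-count score is an assumption made for the example, not the method of any particular engine; the function name `rank` and the sample documents are likewise hypothetical.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(docs, query):
    """Return ids of matching documents, best score first."""
    terms = tokenize(query)
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        # Score = total number of occurrences of the query words.
        score = sum(counts[term] for term in terms)
        if score > 0:  # documents without any query word are dropped
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]


docs = {
    "a": "search engine basics: how a search engine builds its index",
    "b": "a directory maintained by human editors",
    "c": "one search tip",
}
ranked = rank(docs, "search engine")
```

Here document `a` comes first because it contains the query words four times, `c` follows with a single occurrence, and `b` is omitted because it contains neither word.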
The way a search engine orders and displays web pages varies from one engine to another. The techniques also change over time, as the use of Internet services changes and more advanced techniques are developed. |