A search engine is a system, combining hardware and software, that indexes webpages so that users can search them by entering keywords in a search form.
Software robots called spiders (or crawlers) crawl the web by recursively following the links on millions and millions of webpages and indexing their content in gigantic databases so that it can be searched later.
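As a rough illustration of the crawling and indexing described above, here is a minimal sketch in Python. It makes several assumptions: the "web" is a small in-memory dictionary (`TOY_WEB`, with made-up addresses) rather than real pages fetched over the network, links are followed with a queue instead of literal recursion, and the "database" is a plain dictionary mapping each word to the pages that contain it. A real spider is far more elaborate.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("search engines index pages", ["b.example", "c.example"]),
    "b.example": ("spiders follow links", ["a.example", "c.example"]),
    "c.example": ("pages contain links and text", []),
    "d.example": ("unreachable page", []),  # no page links to this one
}

def crawl(start_url, web):
    """Follow links breadth-first from start_url and return an inverted index."""
    index = {}            # word -> set of URLs whose text contains it
    seen = {start_url}    # URLs already queued, so no page is visited twice
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Index every word of the page.
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
        # Queue the links that have not been seen yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example", TOY_WEB)
print(sorted(index["links"]))  # pages containing the word "links"
```

Note that the page `d.example` never ends up in the index: since no other page links to it, a spider that only follows links cannot discover it.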
No search engine can visit every webpage in a single day; a full indexing pass generally takes several weeks. Each engine adopts its own strategy, and some even estimate how often a website updates its content in order to decide when to revisit it.
When search engine users fill out the form, they specify the words they are searching for (and possibly those they want to exclude) with the help of boolean operators such as "and", "or", "not", etc. (symbolised by +, -, etc.). The request is sent to the search engine, which looks up each of the words in its databases and then refines the search by removing the pages that do not meet the criteria. Finally, the engine displays a list of links to pages, each accompanied either by the beginning of the page's text or by a description that the page's creator supplied in special tags called meta tags.
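The two-step lookup described above (find pages for each word, then remove those that fail the criteria) can be sketched as follows. This is only an illustration under stated assumptions: `INDEX` is a made-up inverted index, and the convention that `+word` means required, `-word` means excluded and a bare word means optional is chosen for this sketch; actual engines each define their own query syntax.

```python
# Hypothetical inverted index: word -> set of page identifiers.
INDEX = {
    "python": {"p1", "p2", "p3"},
    "snake": {"p2", "p4"},
    "tutorial": {"p1", "p3", "p5"},
}

def search(query, index):
    """Evaluate a query where +word is required, -word is excluded,
    and a bare word is optional (an assumed convention for this sketch)."""
    required, excluded, optional = [], [], []
    for term in query.lower().split():
        if term.startswith("+") and len(term) > 1:
            required.append(term[1:])
        elif term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])
        else:
            optional.append(term)

    # Step 1: gather candidate pages for every positive word.
    candidates = set()
    for word in required + optional:
        candidates |= index.get(word, set())

    # Step 2: refine by removing pages that do not meet the criteria.
    for word in required:
        candidates &= index.get(word, set())
    for word in excluded:
        candidates -= index.get(word, set())
    return candidates

print(sorted(search("+python tutorial -snake", INDEX)))  # ['p1', 'p3']
```

In this sketch a query made only of excluded words returns nothing, because there are no positive words to build the candidate list from.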
These results are listed in order of relevance, according to criteria that are specific to each search engine, such as the percentage of search words found on the page, their density (the number of occurrences of each keyword on the page), etc.
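To make the two criteria mentioned above concrete, here is a toy ranking sketch. It is not the formula of any actual search engine: the `PAGES` data is invented, and the decision to sort first by the fraction of keywords present and then by the total number of keyword occurrences is an assumption made purely for illustration.

```python
# Hypothetical page texts keyed by page identifier.
PAGES = {
    "p1": "search engine basics and search tips",
    "p2": "engine repair manual",
    "p3": "how a search engine ranks a search for search terms",
}

def score(text, keywords):
    """Return (fraction of keywords present, total keyword occurrences).

    Assumes keywords is a non-empty list of lowercase words.
    """
    words = text.lower().split()
    counts = [words.count(k) for k in keywords]
    matched = sum(1 for c in counts if c > 0) / len(keywords)
    return (matched, sum(counts))

def rank(pages, keywords):
    """Sort page identifiers from most to least relevant under the toy score."""
    return sorted(pages, key=lambda pid: score(pages[pid], keywords), reverse=True)

print(rank(PAGES, ["search", "engine"]))  # ['p3', 'p1', 'p2']
```

Here `p3` and `p1` both contain every keyword, so the number of occurrences decides between them, while `p2` comes last because it matches only one of the two words.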
Last update on Thursday October 16, 2008 02:43:14 PM.