How Do Search Engines Work - Web Crawlers
Search engines are what finally bring your website to the notice of
prospective customers. It therefore helps to know how these search engines
actually work and how they present information to a customer who initiates
a search.
There are basically two types of search engines. The first
type is powered by robots called crawlers or spiders.
Search engines use spiders to index websites. When you submit your website
pages to a search engine by completing its required submission page, the
search engine spider will index your entire site. A ‘spider’ is an automated
program that is run by the search engine system. A spider visits a web site,
reads the content on the actual site and the site's meta tags, and follows
the links that the site connects to. The spider then returns all that
information to a central depository, where the data is indexed. It will
visit each link you have on your website and index those sites as well. Some
spiders will only index a certain number of pages on your site, so don’t
create a site with 500 pages!
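The crawl described above can be sketched in a few lines of Python. This is
only an illustration, not how any particular search engine is implemented:
the names PageParser, crawl, SITE and max_pages are hypothetical, and the
"web" here is a small in-memory dictionary rather than real network requests.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects text, meta tag content, and outgoing links from one page."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(pages, start, max_pages=500):
    """Breadth-first crawl over `pages`, a dict mapping URL -> HTML.

    Returns a "central depository": URL -> extracted text, meta tags, links.
    `max_pages` is an assumed cap, mirroring spiders that only index a
    certain number of pages per site.
    """
    depository = {}
    queue = deque([start])
    while queue and len(depository) < max_pages:
        url = queue.popleft()
        if url in depository or url not in pages:
            continue  # already visited, or the link leads nowhere we know
        parser = PageParser()
        parser.feed(pages[url])
        depository[url] = {
            "text": " ".join(parser.text),
            "meta": parser.meta,
            "links": parser.links,
        }
        queue.extend(parser.links)  # follow the links found on this page
    return depository


# A made-up two-page site used purely for demonstration.
SITE = {
    "/": '<meta name="keywords" content="shoes, boots">'
         '<p>Welcome</p><a href="/about">About</a>',
    "/about": '<p>About us</p><a href="/">Home</a>',
}
result = crawl(SITE, "/")
```

Starting from "/", the sketch reads the page, records its meta keywords,
follows the link to "/about", and stops once every reachable page has been
stored or the page cap is hit.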
The spider will periodically return to the sites to check for any
information that has changed. How often this happens is determined by the
operators of the search engine.
A spider's index is almost like a book: it contains the table of contents,
the actual content, and the links and references for all the websites the
spider finds during its search. A spider may index up to a million pages
a day.
Examples: Excite, Lycos, AltaVista and Google.
When you ask a search engine to locate information, it searches through the
index it has created rather than the Web itself. Different search engines
produce different rankings because not every search engine uses the same
algorithm to search through the indices.
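To make the idea of searching an index rather than the Web concrete, here is
a minimal sketch of an inverted index in Python. The function names
build_index and search, the sample DOCS, and the rule that every query word
must match are assumptions made for illustration; real engines use far more
elaborate structures and ranking.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs containing every word of the query.

    Only the prebuilt index is consulted; no page is fetched or re-read.
    """
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up documents standing in for pages a spider has already collected.
DOCS = {
    "a.example/shoes": "Leather shoes and boots",
    "b.example/boots": "Hiking boots for winter",
    "c.example/hats": "Winter hats",
}
index = build_index(DOCS)
boots_pages = search(index, "boots")
winter_boots_pages = search(index, "winter boots")
```

Here a query for "boots" matches the first two pages, while "winter boots"
matches only the second, and both answers come from a dictionary lookup
instead of a fresh visit to any site.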
One of the things that a search engine algorithm scans for is the frequency
and location of keywords on a web page, but it can also detect artificial
keyword stuffing, or spamdexing. The algorithms then analyze the way that
pages link to other pages on the Web. By checking how pages link to each
other, an engine can determine what a page is about, provided the keywords
of the linked pages are similar to the keywords on the original page.
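As a rough illustration of scoring by keyword frequency and location, and of
flagging keyword stuffing, consider the Python sketch below. Search engines
do not publish their algorithms, so everything here is assumed: the helper
names, the title weight of 3.0, and the 30% density threshold are arbitrary
choices made only to show the idea.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(title, body, keyword, title_weight=3.0):
    """Score a page by how often and where a keyword appears.

    A hit in the title counts `title_weight` times as much as a hit in the
    body (an assumed weighting, not a known engine's formula).
    """
    keyword = keyword.lower()
    title_hits = tokenize(title).count(keyword)
    body_hits = tokenize(body).count(keyword)
    return title_weight * title_hits + body_hits


def looks_stuffed(body, keyword, max_density=0.3):
    """Flag text where the keyword makes up an implausible share of words."""
    words = tokenize(body)
    if not words:
        return False
    density = words.count(keyword.lower()) / len(words)
    return density > max_density


score = keyword_score("Cheap Shoes", "We sell shoes and boots", "shoes")
normal = looks_stuffed("We sell shoes and boots", "shoes")
stuffed = looks_stuffed("shoes shoes shoes shoes buy shoes", "shoes")
```

In this example the honest page earns a score of 4.0 (one title hit weighted
3.0 plus one body hit), while the page that repeats "shoes" in five of its
six words is flagged as stuffed.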