Monday, April 19, 2010

Google Crawling, Caching and Indexing

What is the difference between Google crawling, indexing and caching?

A website being CRAWLED simply means that a bot (spider) has visited it. A site owner can block the site from being crawled. Once the bot has crawled your website, the program running behind the bot decides whether or not to store a particular webpage in the search engine's database. This is known as INDEXING. Every search engine has its own parameters for indexing. In general, it has been observed that content-rich sites get indexed much more easily.
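As a rough illustration of blocking a crawl, site owners commonly publish a robots.txt file that well-behaved bots read before visiting. The sketch below uses Python's standard urllib.robotparser to check whether a given bot may fetch a URL. The robots.txt content, the /private/ path and the helper name is_crawl_allowed are assumptions made up for this example, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for xyz.com: every bot is kept out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://xyz.com/private/page.html"))  # False
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://xyz.com/index.html"))  # True
```

Note that robots.txt only controls crawling by bots that choose to obey it; it says nothing about whether a crawled page gets indexed.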

A page that gets crawled does not necessarily get indexed. Crawling is completely different from indexing. A website can be crawled on a daily basis but get indexed only after a certain interval or after meeting the required conditions.

One can take this example:

A webpage is indexed in Google. On a particular day you change just 2-3 lines of content on that webpage. When you check after 2-3 days, you will find that the webpage has been crawled but not reindexed with the new content; it remains indexed with the old, unchanged content.

This happens because the bot did not find the changed content important enough from the webpage's point of view. In that case the bot indexes the page with the modified content only after a long interval, as it is programmed to do. To get an already indexed webpage reindexed sooner, the bot needs to see significant changes on the webpage.
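To make the idea of a "significant change" concrete, here is a toy sketch that measures how much of a page's text differs between two crawls and compares it with a cutoff. This is not how Google decides anything: the 30% threshold and the function names change_fraction and should_reindex are assumptions invented purely for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff: the share of words that must differ before this
# toy model treats the page as changed enough to reindex.
REINDEX_THRESHOLD = 0.30


def change_fraction(old_text: str, new_text: str) -> float:
    """Return a value from 0.0 (identical) to 1.0 (nothing in common)."""
    matcher = SequenceMatcher(None, old_text.split(), new_text.split())
    return 1.0 - matcher.ratio()


def should_reindex(old_text: str, new_text: str,
                   threshold: float = REINDEX_THRESHOLD) -> bool:
    """Toy decision: reindex only when enough of the page has changed."""
    return change_fraction(old_text, new_text) >= threshold
```

Under this toy model, editing 2-3 lines of a long page produces a small change fraction and leaves the old indexed copy in place, while a substantial rewrite pushes the fraction over the cutoff.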

One can speed up the indexing process by getting good-quality backlinks from high-PR (PageRank) pages.

CACHE is the storehouse in which Google keeps the indexed page. On the SERP (search engine results page), Google shows a CACHE link with every indexed result. By clicking it, one can check the date and the version of the page that is in Google's memory.

One can use the following syntax:

cache:xyz.com - to check the caching information
site:xyz.com - to check the pages indexed
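
For anyone who wants to generate these queries for several domains, the sketch below builds the corresponding search URLs with Python's standard library. The helper name build_query_url is made up for this example, and the code only constructs the URL; it does not check whether the search engine still honors a given operator.

```python
from urllib.parse import urlencode

# Google's search endpoint; the query text goes in the "q" parameter.
SEARCH_URL = "https://www.google.com/search"


def build_query_url(operator: str, domain: str) -> str:
    """Build a search URL for an operator query such as site:xyz.com."""
    query = f"{operator}:{domain}"
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


print(build_query_url("cache", "xyz.com"))
print(build_query_url("site", "xyz.com"))
```

The colon in the query is percent-encoded as %3A by urlencode, which browsers and the search engine decode back to the operator syntax shown above.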