[Search Engine](EN) Difference between crawling and indexing

This post explains the difference between crawling and indexing in search engines.


Environment and Prerequisite

  • Web


Crawling and Indexing

Crawling

  • Crawling: Discovering web pages and other content on the web using crawlers (bots).
  • Each search engine company runs its own crawler bot to crawl web pages.
  • To keep crawlers away from certain pages, place a robots.txt file in the site root. Well-behaved crawlers read it before fetching pages.
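
As a minimal sketch of how a crawler might consult robots.txt, the Python snippet below uses the standard library's urllib.robotparser. The robots.txt content, the bot name ExampleBot, and the example.com URLs are assumptions made up for illustration; they do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as it might be served from the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt lets the given user agent fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/private/page.html",
        "https://example.com/public/page.html",
    ):
        print(url, is_crawl_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A real crawler would download robots.txt from the site (RobotFileParser also offers set_url() and read() for that); the string is inlined here only to keep the example self-contained.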


Indexing

  • Indexing: Reading the content of a discovered web page and storing it in the search engine in a well-organized format.
  • Each search engine company indexes the pages and content it discovers in a well-organized format.
  • To prevent a page from being indexed, add a <meta name="robots" content="noindex"> tag inside the page's <head> element:
<head>
<meta charset="utf-8">
...
<meta name="robots" content="noindex">
...
</head>
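
To illustrate how an indexer could detect this tag, here is a rough Python sketch built on the standard library's html.parser. The class name RobotsMetaParser and the helper has_noindex are hypothetical names chosen for this example, and real search engines may apply additional rules that are not modeled here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # content may hold several comma-separated directives.
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<head><meta charset="utf-8"><meta name="robots" content="noindex"></head>'
    print(has_noindex(page))
```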


Caution

Issue

  • Even if a page has a <meta name="robots" content="noindex"> tag, the tag may not take effect when the page is blocked in robots.txt, because the crawler cannot fetch the page and therefore never sees the tag.
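
The interaction can be modeled with a small Python sketch. The function noindex_outcome, its return strings, the bot name ExampleBot, and the example.com URLs are all hypothetical and only mirror the reasoning above; this is not how any particular search engine is implemented.

```python
from urllib.robotparser import RobotFileParser


def noindex_outcome(robots_txt: str, user_agent: str, url: str,
                    page_has_noindex: bool) -> str:
    """Describe what a crawler that obeys robots.txt learns about a page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        # The page is never fetched, so its meta tags are never read.
        return "blocked: noindex not seen"
    return "noindex seen" if page_has_noindex else "indexable"


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    # Blocked page: the noindex tag exists but the crawler cannot read it.
    print(noindex_outcome(robots, "ExampleBot",
                          "https://example.com/private/a.html", True))
    # Crawlable page: the noindex tag is read.
    print(noindex_outcome(robots, "ExampleBot",
                          "https://example.com/public/a.html", True))
```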

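Solution

  • Allow the page to be crawled in robots.txt (remove the rule that blocks it) so that the crawler can fetch the page and read the noindex tag.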


Reference