[Search Engine](EN) Difference between crawling and indexing

April 01, 2021

A post about the difference between crawling and indexing in search engines.

Environment and Prerequisite

Web

Crawling and Indexing

Crawling

Crawling: Discovering web pages and other content on the web using crawlers (bots).
Each search engine company runs its own crawler bot to crawl web pages.
To keep well-behaved crawlers away from a page, add a rule for it to the robots.txt file in the site root.
- Example: https://www.google.com/robots.txt
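
To see how a crawler might interpret robots.txt rules, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, the `example.com` URLs, and the helper name `is_crawl_allowed` are all assumptions made for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (a real one is served from the site root).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a made-up crawler name.
print(is_crawl_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/private/page.html"))  # False
print(is_crawl_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/public/page.html"))   # True
```

Real search engine crawlers may apply matching rules that differ in detail from this module, so treat the result as an approximation.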

Indexing

Indexing: Reading the content of a discovered web page and storing it in the search engine in a well-organized format.
Each search engine indexes the pages and content it has discovered in its own organized format, so they can be returned in search results.
To prevent a page from being indexed, add a <meta name="robots" content="noindex"> tag inside the page's <head></head> element.

<head>
<meta charset="utf-8">
...
<meta name="robots" content="noindex">
...
</head>
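
To check whether a page actually carries the tag, something like the following sketch could be used. It relies only on Python's standard `html.parser` module; the class name `RobotsMetaParser`, the function `has_noindex`, and the sample HTML are assumptions for illustration, and the sketch only looks at the generic `robots` meta name, not crawler-specific names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # content may hold several comma-separated directives, e.g. "noindex, nofollow".
        content = attr.get("content") or ""
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source used only for this example.
SAMPLE_HTML = """
<head>
<meta charset="utf-8">
<meta name="robots" content="noindex">
</head>
"""

print(has_noindex(SAMPLE_HTML))  # True
```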

Caution

Issue

Even if a page has a <meta name="robots" content="noindex"> tag, the tag may have no effect when the page is blocked in robots.txt: the crawler cannot fetch the page, so it never sees the tag.

Solution

If a page has a <meta name="robots" content="noindex"> tag, remove the rule that blocks it from robots.txt so the crawler can fetch the page and read the tag.
Related Link: https://developers.google.com/search/docs/advanced/crawling/block-indexing?hl=ko
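
The conflict described above can be detected with a small check. This is only a sketch under stated assumptions: the list of pages that carry a noindex tag is supplied by hand, and the robots.txt content, the `ExampleBot` user agent, the URLs, and the function name `find_hidden_noindex_pages` are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_hidden_noindex_pages(robots_txt, user_agent, noindex_urls):
    """Return the noindex pages that robots.txt blocks for user_agent.

    A crawler that obeys robots.txt cannot fetch these pages,
    so it cannot see their noindex tag.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and hypothetical pages assumed to carry a noindex tag.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
NOINDEX_URLS = [
    "https://example.com/private/old.html",
    "https://example.com/blog/draft.html",
]

for url in find_hidden_noindex_pages(ROBOTS_TXT, "ExampleBot", NOINDEX_URLS):
    print("noindex tag cannot be read, remove the robots.txt rule for:", url)
```

With the sample data, only the page under `/private/` is reported, since that is the one the crawler is not allowed to fetch.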
