
How to de-index pages from Google using robots.txt

Webmasters: Asked by Genadinik on December 27, 2020

I added the noindex tag for Googlebot, but it is taking a long time to de-index some pages, since I have over 100k pages that need to be de-indexed.

What is the proper way to de-index pages using robots.txt?

Thanks!

Any idea how long it should take?

2 Answers

Assuming these pages still exist, but you just want them removed from search results...

What is the proper way to de-index pages using robots.txt?

You wouldn't necessarily use robots.txt to de-index pages, i.e. remove already-indexed pages from the Google search results. A noindex robots meta tag in the page itself (or an X-Robots-Tag HTTP response header) might be preferable instead, in combination with the URL removal tool in Google Search Console (GSC) to speed up the process.
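
For example, the noindex directive can be delivered either as a robots meta tag in the <head> of each page:

<meta name="robots" content="noindex">

...or as an HTTP response header, which also works for non-HTML resources such as PDFs:

X-Robots-Tag: noindex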

robots.txt specifically blocks crawling (not necessarily "indexing"). By blocking these pages from being crawled, they should naturally drop from the search index over time, but this can take a considerable amount of time. However, if these URLs are blocked by robots.txt but are still being linked to, they may never disappear entirely from the search results: you can end up with a URL-only listing in the SERPs, with no description.

Using robots.txt to remove the https://www.example.com/getridofthis/ directory...

User-agent: *
Disallow: /getridofthis/

To remove pages entirely from the SERPs, consider using a noindex meta tag (or an X-Robots-Tag: noindex HTTP response header) instead of robots.txt. (This is what it sounds like you are doing already.) Don't block crawling in robots.txt, as that will prevent the crawler from seeing the noindex meta tag.
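
As a minimal sketch, assuming the pages to be removed all live under one directory and the site runs on Apache with mod_headers enabled (adjust for your own server), the header could be set for everything in that directory from a .htaccess file placed inside it:

# .htaccess in the directory to be de-indexed (assumes Apache + mod_headers)
Header set X-Robots-Tag "noindex"

This attaches the noindex directive to the responses themselves, so crawling can stay allowed and Googlebot will see the directive on its next visit.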

To expedite the process of de-indexing URLs in Google Search, you can use the URL removal tool in GSC (formerly Webmaster Tools). For this tool to be effective long term, you need to use the noindex meta tag in the pages themselves. (The original blog article stated that robots.txt could be used as a blocking mechanism with the URL removal tool; however, more recent help documents specifically warn against using robots.txt for "permanent removal".)


Answered by MrWhite on December 27, 2020

Take a look at robotstxt.org; it has all the info you need.

Answered by Matthew Brookes on December 27, 2020
