Whether you’ve got a new website under development, or you’re working on new web content for an existing website, you need to stop your test/development content from being indexed by search engines – for several reasons, including SEO ones: half-finished pages make a poor first impression, and duplicate test content can compete with the live pages you actually want to rank.
Many web developers are unaware of how best to do this, so here’s our rundown of the main ways you can prevent indexing:
(1) robots.txt can be used to block crawlers, but on its own it isn’t a reliable way to prevent indexing: robots.txt stops well-behaved crawlers from fetching your pages, yet search engines can still index the blocked URLs (without their content) if they’re linked from elsewhere.
You must also ensure that any robots.txt rules put in place to block crawlers are removed (or updated) before your test content goes live – otherwise the content you do want indexed will stay blocked.
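As a minimal example, a robots.txt like this, placed at the root of the test site, asks all crawlers to stay away from everything:

```
# robots.txt at the root of the test/development site
# Asks all well-behaved crawlers to avoid the entire site
User-agent: *
Disallow: /
```

Remember, this is a polite request rather than access control – and, as above, blocked URLs can still end up in the index if other sites link to them.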
(2) Robots meta tags can be applied to all pages – or just the specific pages – you don’t want indexed. This is the method Google recommends for keeping pages out of its index. One caveat: a crawler can only see a noindex tag if it’s allowed to fetch the page, so don’t combine this method with a robots.txt block on the same URLs.
These meta tags must be removed from any content you do want indexed before your test/development content goes live.
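For reference, the tag itself goes in the `<head>` of each page that shouldn’t be indexed:

```html
<!-- In the <head> of every page you want kept out of the index -->
<meta name="robots" content="noindex, nofollow">
```

For non-HTML files (PDFs, images and so on), the equivalent `X-Robots-Tag: noindex` HTTP response header does the same job.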
(3) Password protecting your test environment is one of the most reliable ways to keep search engine crawlers away from your test/development content – if a crawler can’t fetch a page at all, it can’t index what’s on it.
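On an Apache server, for example, HTTP Basic Auth over the whole test environment takes just a few lines of .htaccess – note the password file path below is only a placeholder:

```apacheconf
# .htaccess in the test site's document root
AuthType Basic
AuthName "Test environment"
# Placeholder path - point this at your real password file
AuthUserFile /path/to/.htpasswd
Require valid-user
```

The password file itself is created with Apache’s `htpasswd` tool (e.g. `htpasswd -c /path/to/.htpasswd username`). Crawlers that hit the resulting 401 response can’t see – or index – anything behind it.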
(4) Allowing access from specific IP addresses only grants test environment access to the external IPs you list – e.g. your own IP plus your web developers’ IPs – while blocking everyone else, search engine crawlers included, thus preventing indexing of your test/development content. (Bear in mind this method is awkward if you or your developers are on dynamic IPs that change regularly!)
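Sticking with Apache as an example, an IP allowlist looks like this – the addresses shown are documentation placeholders, so substitute your own:

```apacheconf
# .htaccess - Apache 2.4 syntax; only the listed IPs get access
Require ip 203.0.113.10
Require ip 198.51.100.0/24
```

Everyone else, crawlers included, receives a 403 Forbidden response, so there’s nothing for a search engine to index.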
Here at the Word Waiter, we like to be careful, so we usually put both Robots Meta Tags AND Password Protection in place, though we sometimes use other methods, depending on the project.