Robots.txt

How to Fix

To pass this test, you must create and properly install a robots.txt file.

You can create it with any program that produces a plain text file, or with an online tool (Google Webmaster Tools has this feature).

Remember to use all lower case for the filename: robots.txt, not ROBOTS.TXT.

A simple robots.txt file looks like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /pages/thankyou.html

This would block all search engine robots from visiting the "cgi-bin" and "images" directories and the page http://www.yoursite.com/pages/thankyou.html.
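
If you want to double-check how a crawler would interpret these rules, Python's standard urllib.robotparser module can evaluate them for you. The following is only a quick verification sketch, reusing the example rules and the hypothetical www.yoursite.com domain from above:

import urllib.robotparser

rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /pages/thankyou.html
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The thank-you page matches a Disallow line, so crawling is refused
print(parser.can_fetch("*", "http://www.yoursite.com/pages/thankyou.html"))  # False

# The home page is not mentioned in the rules, so crawling is allowed
print(parser.can_fetch("*", "http://www.yoursite.com/index.html"))           # True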

TIPS:

  • You need a separate Disallow line for every URL prefix you want to exclude
  • You may not have blank lines in a record because they are used to delimit multiple records
  • Notice the User-agent: * line before the Disallow lines. The User-agent: part specifies which robot the rules apply to, and * means every robot. Well-known crawlers include Googlebot (Google), Googlebot-Image (Google Image Search), Baiduspider (Baidu) and Bingbot (Bing); see the example after this list for a record aimed at a specific crawler
  • If you are creating your own robots.txt file, keep in mind that although the wildcard (*) is used in the User-agent line (meaning "any robot"), it is not allowed in the Disallow line.
  • Regular expressions are not supported in either the User-agent or Disallow lines
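
For example, to keep only Google Image Search out of the "images" directory while keeping every other robot out of "cgi-bin", you could combine two records like this (the directory names are just placeholders):

User-agent: Googlebot-Image
Disallow: /images/

User-agent: *
Disallow: /cgi-bin/

The blank line separates the two records. A robot obeys the record that matches its name most specifically, so Googlebot-Image follows only the first record, while all other robots follow the second.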

Once you have your robots.txt file, upload it to the top-level (root) directory of your web server, so that it is available at http://www.yoursite.com/robots.txt. After that, make sure the file's permissions allow visitors (like search engines) to read it.
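
A simple way to confirm that the uploaded file is readable is to request it exactly as a search engine would. Here is a minimal sketch in Python, again assuming the hypothetical www.yoursite.com domain:

import urllib.request

# Hypothetical URL - replace www.yoursite.com with your own domain
url = "http://www.yoursite.com/robots.txt"

with urllib.request.urlopen(url) as response:
    print(response.status)                   # 200 means the file is publicly readable
    print(response.read().decode("utf-8"))   # the rules exactly as robots will see them

If the request fails or returns an error page, re-check the file's location and read permissions.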
