Robots.txt
How to Fix
To pass this test you must create a robots.txt file and install it correctly.
You can write the file with any program that produces plain text, or use an online tool (Google Webmaster Tools has this feature).
Remember to use all lower case for the filename: robots.txt, not ROBOTS.TXT.
A simple robots.txt file looks like this:
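User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /pages/thankyou.html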
This would block all search engine robots from visiting the "cgi-bin" and "images" directories and the page http://www.yoursite.com/pages/thankyou.html.
TIPS:
- You need a separate Disallow line for every URL prefix you want to exclude
- You may not have blank lines in a record because they are used to delimit multiple records
- Notice that before the Disallow command you have the command User-agent: *. The User-agent: part specifies which robots the rules that follow apply to; * means "any robot". Major known crawlers are: Googlebot (Google), Googlebot-Image (Google Image Search), Baiduspider (Baidu), Bingbot (Bing). See the example after this list for a record aimed at a single crawler
- If you are creating your own robots.txt file, keep in mind that although the wildcard (*) is used in the User-agent line (meaning "any robot"), it is not allowed in the Disallow line
- Regular expressions are not supported in either the User-agent or Disallow lines
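For example, a record that applies only to Google Image Search might look like this (the directory name is just an illustration; use the paths you actually want to exclude):

User-agent: Googlebot-Image
Disallow: /images/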
Once you have your robots.txt file, upload it to the top-level directory of your web server, so that it is reachable at http://www.yoursite.com/robots.txt. After that, make sure the file permissions allow visitors (like search engines) to read it.
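As an optional check, a short Python script can confirm that the uploaded file is publicly readable and that its rules parse the way you expect. This is only a sketch using the standard library's robots.txt parser; http://www.yoursite.com stands in for your own domain, and the test paths assume the example file shown above:

from urllib.robotparser import RobotFileParser

# Point the parser at the live robots.txt (replace with your own domain)
rp = RobotFileParser()
rp.set_url("http://www.yoursite.com/robots.txt")
rp.read()

# These calls should reflect the rules in the file:
# blocked directories return False, ordinary pages return True
print(rp.can_fetch("*", "http://www.yoursite.com/cgi-bin/script"))  # expected: False
print(rp.can_fetch("*", "http://www.yoursite.com/index.html"))      # expected: True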