It is important to create and optimize the robots.txt file to make your Magento store more secure and to improve its SEO.
The robots.txt ("robots dot text") is a text file that helps search engine robots (such as Googlebot and Bingbot) determine which information to index. By default there is no robots.txt in the Magento Community or Enterprise distribution, so you have to create it yourself.
How will robots.txt improve your Magento store?
These are just a few use cases for robots.txt, to give you a better idea of why it is so important:
- The robots.txt helps you prevent duplicate content issues (this is very important for SEO).
- It hides technical information such as error logs, reports, core files and .svn files from unintended indexing (hackers will not be able to use search engines to detect your platform and other information).
Robots.txt installation
Note: The robots.txt file covers one domain. For Magento websites with multiple domains or sub-domains, each domain/sub-domain (e.g. store.example.com and example.com) must have its own robots.txt file.
Magento Community and Magento Enterprise
Installing robots.txt is easy. All you need to do is create a robots.txt file and copy the robots.txt code from our blog into it. Next, upload the robots.txt to the web root of your server so that it is available at, for example, example.com/robots.txt.
If you upload the robots.txt to a sub-folder, e.g. example.com/store/robots.txt, it will be ignored by all search engines.
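The location rule above can be sketched in Python: whichever page of the store you start from, crawlers only request robots.txt from the root of that host. The helper name `robots_txt_url` is hypothetical and introduced here purely for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the only robots.txt location crawlers consult for page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    (including any sub-domain and port) and replaces the path.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A store in a sub-folder still uses the robots.txt at the web root.
print(robots_txt_url("http://example.com/store/shoes.html"))
# A sub-domain is a separate host and needs its own file.
print(robots_txt_url("http://store.example.com/checkout/cart/"))
```

The two calls return different URLs because example.com and store.example.com are treated as separate hosts, which is why each needs its own file.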
Magento Go
Installation of robots.txt for Magento Go is described in this Knowledge Base article.
Robots.txt for Magento
Here is our recommended robots.txt code. Please read the comments marked with # before publishing it:
## robots.txt for Magento Community and Enterprise
## GENERAL SETTINGS
## Enable robots.txt rules for all crawlers
User-agent: *
## Crawl-delay parameter: number of seconds to wait between successive requests to the same server.
## Set a custom crawl rate if you're experiencing traffic problems with your server.
# Crawl-delay: 30
## Magento sitemap: uncomment and replace the URL to your Magento sitemap file
# Sitemap: http://www.example.com/sitemap/sitemap.xml
## DEVELOPMENT RELATED SETTINGS
## Do not crawl development files and folders: CVS, svn directories and dump files
Disallow: /CVS
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$
## GENERAL MAGENTO SETTINGS
## Do not crawl Magento admin page
Disallow: /admin/
## Do not crawl common Magento technical folders
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /var/
## Do not crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt
## MAGENTO SEO IMPROVEMENTS
## Do not crawl sub category pages that are sorted or filtered.
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*
## Do not crawl 2-nd home page copy (example.com/index.php/). Uncomment it only if you activated Magento SEO URLs.
## Disallow: /index.php/
## Do not crawl links with session IDs
Disallow: /*?SID=
## Do not crawl checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/
## Do not crawl search pages and not-SEO optimized catalog links
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
## SERVER SETTINGS
## Do not crawl common server technical folders and files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php

## IMAGE CRAWLERS SETTINGS
## Extra: Uncomment if you do not wish Google and Bing to index your images
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /
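Several of the rules above rely on wildcards: `*` stands for any sequence of characters, and a trailing `$` anchors the rule to the end of the URL, which is how major crawlers such as Googlebot interpret them. The Python sketch below shows that matching logic under those assumptions. `rule_matches` is a hypothetical helper, not part of Magento or of any crawler, and it ignores details that real crawlers also handle, such as Allow precedence and percent-encoding.

```python
import re


def rule_matches(rule: str, url_path: str) -> bool:
    """Check whether a Disallow pattern matches a URL path plus query string.

    Hypothetical helper: '*' matches any run of characters, a trailing '$'
    anchors the pattern to the end, and everything else is a literal prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces, then join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in rule.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so unanchored rules act as prefixes.
    return re.match(regex, url_path) is not None


print(rule_matches("/*?dir*", "/shoes.html?dir=asc"))     # sorted listing
print(rule_matches("/*?SID=", "/shoes.html?SID=abc123"))  # session ID link
print(rule_matches("/*.sql$", "/backup/dump.sql"))        # dump file
print(rule_matches("/checkout/", "/shoes.html"))          # ordinary page
```

The first three calls report a match, so those URLs would be blocked, while the ordinary product page in the last call is left crawlable.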
Test your robots.txt
After publishing robots.txt, you can check its syntax using these online tools:
- http://webmaster.yandex.com/robots.xml
- http://www.sxw.org.uk/computing/robots/check.html
- http://www.frobee.com/robots-txt-check
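As a local complement to the online checkers, Python's standard `urllib.robotparser` module can parse a set of rules and report whether a given URL is blocked. This is a minimal sketch with a shortened rule set and example URLs chosen for illustration. Note that this module matches plain path prefixes only and does not expand the `*` and `$` wildcards, so it is only suitable for checking the non-wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the recommended rules (prefix rules only).
RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalogsearch/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_crawlable(url: str) -> bool:
    """Return True if a generic crawler ('*') may fetch the URL."""
    return parser.can_fetch("*", url)


# Example URLs are illustrative; replace them with pages from your store.
for url in (
    "http://www.example.com/shoes.html",
    "http://www.example.com/checkout/cart/",
    "http://www.example.com/catalogsearch/result/?q=shoes",
):
    print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

To check a published file instead of an inline excerpt, the same parser offers `set_url()` and `read()`, which fetch the robots.txt from the live site.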

8 Comments
yans
13 Feb 2013 19:29:08
I'll start from here :)
Thomas
7 Mar 2013 13:50:35
The best robots.txt guide for Magento I've seen to date.
It helped me a lot.
Ryan
8 Apr 2013 12:05:33
The syntax check links you provided are reporting a number of errors relating to the user agent. Specifically, the error is "No User Agent. A Disallow line must have a User-agent line before it." So, between each Disallow string, it's saying we need to specify the user agent, as it will not carry over the user agent wildcard from the top of the file. Is this information not up to date, or do we need to go back through and add a "Disallow: all" string?
Thanks guys!
Oleg
10 Apr 2013 01:21:59
It is not necessary to specify the user agent on each line. There are separate User-agent lines in the robots.txt.
You can validate the robots.txt using this nice tool: http://webmaster.yandex.com/robots.xml
Vladimir
16 Apr 2013 08:06:02
I was wondering what it would take to have multiple robots.txt files in a multi-website environment?
Oleg
16 Apr 2013 22:31:56
Yes, it is possible if you have separate domains and have also configured Magento multi-stores in separate folders, each with its own index.php, .htaccess and robots.txt. It is a rather easy configuration.
Hendrik
8 May 2013 17:44:33
Oleg
13 May 2013 01:35:14
Could you please send the exact error code from Google Webmaster Tools here?