Saturday, January 24, 2009

What Should be Included in a Robots.txt?

A robots.txt is a plain text file that tells search engine crawlers which parts of your blog or website they may crawl. Used properly, it can prevent issues such as duplicate meta tags and duplicate title tags, and even save your web server some bandwidth.

How to create a robots.txt?

You can create a robots.txt using any text editor; just name the file exactly robots.txt. I use a simple text editor such as Notepad.

How to use it?

Use FTP to upload robots.txt to your website's root directory. For example, mine is uploaded to my website's root directory, /public_html/. My blog is on an add-on domain, so its robots.txt is uploaded to that domain's directory, /public_html/wp/.
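
Crawlers request robots.txt from the root of the domain only, which is why the upload location matters. As a rough illustration (Python is my choice here, and www.example.com and the page path are placeholders, not my real site), this sketch works out which robots.txt URL applies to any page:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path, query and fragment are dropped
    # because robots.txt is looked up at the root of the host.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only for illustration.
print(robots_url("http://www.example.com/blog/some-post.html"))
```

If the URL it prints does not open your file in a browser, the file is probably sitting in the wrong directory.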

What Should be Included in a Robots.txt?

For my Joomla website, I use the following robots.txt. Some of the rules prevent search engines from crawling my forum pages, because those pages created duplicate title tags:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /index.php
Disallow: /index2.php
Disallow: /log-in.html
Disallow: /create-an-account.html
Disallow: /lost-password.html
Disallow: /contact-us/
Disallow: /contact-us/name.html
Disallow: /forum/*?
Disallow: /forum/statistics.html
Disallow: /forum/view-last-messages.html
Disallow: /option/
Disallow: /profiles/
Disallow: /*action=reminder*/
Disallow: /*action=stats*/
Disallow: /*;msg*/
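
If you want to check a few of these rules before uploading, Python's standard urllib.robotparser module can help. This is only a minimal sketch: it copies a handful of the plain-prefix rules from above, and www.example.com and the test paths are made up. As far as I can tell, this module does simple prefix matching and does not expand * wildcards, so I left the wildcard lines out.

```python
from urllib.robotparser import RobotFileParser

# A few of the plain-prefix rules from the Joomla file (no wildcard lines).
RULES = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /index.php
Disallow: /log-in.html
"""


def is_allowed(path: str, agent: str = "*") -> bool:
    """Return True if the rules above let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    # www.example.com is a placeholder host; only the path matters here.
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Hypothetical paths, chosen only to exercise the rules.
for path in ("/", "/administrator/index.php",
             "/index.php?option=com_content", "/about.html"):
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Note that a rule such as Disallow: /index.php also blocks every URL that starts with /index.php, including ones with a query string.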


For my WordPress blog, I use the following robots.txt:

User-agent: *
Disallow: /wp-*
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /date/
Disallow: /comments/
Disallow: /cgi-bin/
Disallow: /2007/
Disallow: /2008/
Disallow: /*?*
Disallow: /iframes/
Disallow: /recommends/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

sitemap: http://www.learningwp.com/sitemap.xml
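
Rules such as /wp-* and /*?* rely on the * wildcard, which the major search engines understand (a trailing $ means "end of URL"). To see what a single wildcard rule would catch, here is a small sketch that turns a rule into a regular expression. The helper names and the test paths are my own inventions, and it only checks one rule at a time; it does not decide which rule wins when an Allow and a Disallow both match.

```python
import re


def rule_to_regex(rule: str):
    """Translate a robots.txt path rule using * and $ into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(rule: str, path: str) -> bool:
    """Return True if `path` (including any query string) matches `rule`."""
    return rule_to_regex(rule).match(path) is not None


# Hypothetical paths, paired with rules from the WordPress file.
checks = [
    ("/wp-*", "/wp-admin/"),
    ("/wp-*", "/wp-content/uploads/photo.jpg"),
    ("/*?*", "/?p=123"),
    ("/*?*", "/about/"),
    ("/page/", "/page/2/"),
]
for rule, path in checks:
    print(f"{rule!r} vs {path!r}: {rule_matches(rule, path)}")
```

Rules match from the start of the path, so /wp-* behaves the same as /wp- would: both catch anything beginning with /wp-.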


Feel free to talk about what you've included in your robots.txt.

Related article
How to Use a Robots.txt to Control Your Website?

2 comments:

Lilly said...

just checking out your imbedded comment form...

Lana said...

Thanks for visiting!
Anything about robots.txt you want to talk about?
