Robots.txt file

The robots.txt file is uploaded to your website's root folder. It guides search engine spiders by allowing or disallowing the crawling of specific files and folders. It's a URL blocking method and should be handled with care.


User-agent: Googlebot 
Disallow: /folder1/ 
Allow: /folder1/myfile.html

The user-agent can also be the wildcard *, which makes the rules apply to all spiders/bots:

User-agent: *

In the example above we disallow crawling of 'folder1', except for one file in that particular folder: 'myfile.html'.
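You can check such rules locally with Python's standard-library robotparser before deploying. A small sketch of the example above; note that Python's parser applies the first matching rule, so the more specific Allow line is listed first here (Googlebot itself uses longest-path matching, so the order shown above works fine for Google):

```python
from urllib import robotparser

# The rules from the example above, with the more specific Allow
# line first because Python's parser is first-match-wins.
rules = """\
User-agent: Googlebot
Allow: /folder1/myfile.html
Disallow: /folder1/
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "/folder1/page.html"))    # False (blocked)
print(parser.can_fetch("Googlebot", "/folder1/myfile.html"))  # True (allowed)
```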

A good robots.txt for a site running on WordPress would be this:

User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /category/*/*
Disallow: */trackback

The WordPress core folders are protected from crawling, and category and trackback pages won't be listed. A lot more can be added to the robots.txt file, but this covers the most important points. Note: do not block your /feed/ URL, as it can be used as a sitemap.

You can also try a robots.txt file generator.


  • Add your sitemap URL to the robots.txt file with a Sitemap: directive
  • If you're using WordPress, disallow the core folders
  • Is it named properly (case sensitive!) and placed in your root folder?
  • Disallow 301/302 redirections and cloaked URLs (e.g. Disallow: /outgoing/*)
  • If you are using subdomains, each subdomain needs its own robots.txt file
  • One rule per line


  • Once you have uploaded the file to your website's root folder, you can test it with Google's robots.txt testing tool
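Some of the checks above can also be verified locally with Python's standard-library robotparser before reaching for Google's tool. For example, case sensitivity applies not only to the filename but also to the paths in your rules (the folder names here are made up):

```python
from urllib import robotparser

# Hypothetical rules: blocking /Private/ does NOT block /private/,
# because robots.txt path matching is case sensitive.
rules = """\
User-agent: *
Disallow: /Private/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("SomeBot", "/Private/secret.html"))  # False (blocked)
print(rp.can_fetch("SomeBot", "/private/secret.html"))  # True (not blocked)
```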


It's recommended to exclude specific pages via <meta name="robots" content="noindex"> instead of blocking them with robots.txt. If the URL in question gets backlinks from other pages, the link juice is lost when robots.txt blocks the spiders. With the meta tag, spiders still follow the links and your pages are rewarded.
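As a sketch of why the meta tag behaves differently: the crawler reads the directives from the page itself, so it has already fetched the page and seen its links. A minimal stdlib parser for the tag (the class name and sample page are made up for illustration):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.directives += [d.strip().lower()
                                for d in a.get("content", "").split(",")]

# Hypothetical page: excluded from the index, but without "nofollow"
# its outgoing links are still followed and keep passing link juice.
page = '<html><head><meta name="robots" content="noindex"></head></html>'
p = RobotsMetaParser()
p.feed(page)
print("noindex" in p.directives)   # True
print("nofollow" in p.directives)  # False
```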

If you want to exclude complete folders, e.g. /tmp/, /private/ or similar, it makes sense to add them to robots.txt.
