How to solve the Google Webmaster Tools error “Some important page is blocked by robots.txt”

In Google Webmaster Tools you may sometimes see an error saying that search engines could not crawl your site because of the robots.txt file on your site.

This means your robots.txt is blocking all search engines from crawling your site. The main purpose of robots.txt is to control which parts of your site crawlers may access; keep in mind, though, that only well-behaved bots obey it, while spammers and malware bots generally ignore it.

To allow Google to crawl your site, you can specify the rule like this:

User-agent: Googlebot

Disallow:

To block all search engines from crawling your entire site, the rule looks like this:

User-agent: *

Disallow: /

To allow all search engines to crawl your site, use the following code:

User-agent: *

Disallow:
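Before you deploy these rules, you can test which URLs they actually allow. Below is a minimal sketch using Python's standard urllib.robotparser module; www.example.com is a placeholder for your own domain.

import urllib.robotparser

# Point the parser at your site's live robots.txt (placeholder domain)
parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()  # fetch and parse the file

# can_fetch() reports whether the given user agent may crawl a URL
print(parser.can_fetch("Googlebot", "https://www.example.com/"))
print(parser.can_fetch("*", "https://www.example.com/some-page.html"))

If the first call prints False for your homepage, that matches the webmaster error described above.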

The robots.txt file also plays an important role in search engine optimization. Things to consider when creating a robots.txt file are:

  1. The robots.txt file must be placed in the root directory of your site.
  2. Do not leave whitespace at the start of the file.
  3. Ensure that the syntax is correct.
  4. Avoid comments where possible; if you need them, add them at the end of a line.
  5. If you do not want specific files or directories crawled, give each path its own Disallow line rather than combining several paths on one line (see the example after this list).
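As an illustration of point 5, a robots.txt that keeps a few directories and one file out of crawlers' reach might look like this (the paths here are placeholders):

User-agent: *

Disallow: /cgi-bin/

Disallow: /tmp/

Disallow: /private-page.html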

You can also use a robots META tag to tell crawlers not to index a page or follow its links. As with robots.txt, these meta tags are ignored by spammers and malware bots.

The robots meta tag looks like this:

<META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">

For example, for Google you can use:

<meta name="googlebot" content="noindex,follow">

And the meta tag for MSN is:

<meta name="msnbot" content="noindex,follow" />

This meta tag should be placed inside the <head> section of your page, before the closing </head> tag.
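For example, a minimal page would place it like this:

<html>
<head>
<title>Example page</title>
<meta name="robots" content="noindex,follow">
</head>
<body>
...
</body>
</html>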

Googlebot is the well-known web search bot used by Google. You can see the full list of crawlers Google uses here:

https://developers.google.com/webmasters/control-crawl-index/docs/crawlers

Some other well-known spiders are listed here:

http://www.searchenginedictionary.com/spider-names.shtml

In some cases on WordPress you may get an error such as “Blocked by line 2: Disallow: / Detected as a directory; specific files may have different restrictions”.

This type of error usually occurs because WordPress generates a virtual robots.txt file; it can be fixed by editing that file through a WordPress SEO plugin or by placing your own robots.txt in the site root, which overrides the virtual one.