Friday, 13 December 2019

How to block Google, Yahoo, Bing and other search engine bots using htaccess | mod_rewrite tutorial | amitoverflow blog

Block search engine bots using htaccess

With htaccess it is possible to block search crawler bots, such as the Google, Yahoo and Bing bots or other web crawlers, from accessing your website.
If you don't want your website to be visible to search engines, you can deny access to web-crawling bots using the RewriteRule directive in htaccess.

htaccess code to deny access to Google, Yahoo and Bing bots

The following code has been tested and works, so you can copy and paste it into your .htaccess file. It requires mod_rewrite to be enabled on the server.

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} google|yahoo|bing [NC]
RewriteRule ^.*$ - [F]
This rule returns a 403 Forbidden error to "search bots" when they try to visit any URL on your website.

Explanations:
  1. The first line, RewriteEngine on, turns the rewrite engine on. This tells the server to process the RewriteRule directives in the htaccess file. If your htaccess already contains this line, you can omit it and place the rule just below the existing one, because only one RewriteEngine directive is needed per htaccess.
  2. The second line, RewriteCond %{HTTP_USER_AGENT} google|yahoo|bing [NC], is a condition for the RewriteRule. Here we match the HTTP_USER_AGENT variable (the User-Agent request header) against a regular expression that matches those three bots. Since we don't know the exact user agent strings, we use only the names "google", "yahoo" and "bing". The pattern is not anchored, so it matches anywhere in the full user agent string, and the NC flag makes the match case-insensitive.
  3. The third line, RewriteRule ^.*$ - [F], matches every request. The dash (-) means the URL is left unchanged, and the F flag (short for forbidden) makes the server return a 403 error. The rule is triggered only if the condition above it is met, that is, if the user agent contains "google", "yahoo" or "bing".
    You can replace the F flag with R=403, or with another error status (for example R=404).
    The pattern ^.*$ matches the entire site. If you want to block search crawler bots from reaching only a specific file or path on your server, change the pattern accordingly. For example, ^demo\.php$ applies only to that specific file (the dot is escaped so that it matches a literal dot).
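
As a sketch of the file-specific variant described in the last point, the rule below blocks the same three bots from a single file only. The file name demo.php comes from the example above and is assumed to sit in the same directory as the .htaccess file. Adjust the pattern for your own paths.

```apache
RewriteEngine on
# Same condition as before: user agent contains google, yahoo or bing (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} google|yahoo|bing [NC]
# In .htaccess the pattern is matched against the path relative to this
# directory, without a leading slash. "\." matches a literal dot.
RewriteRule ^demo\.php$ - [F]
```

Note that a RewriteCond applies only to the RewriteRule that immediately follows it, so if you add a second rule for another file, the condition has to be repeated above that rule as well.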

Block a specific user agent

To block a specific search bot, for example Googlebot, you can use the following. The pattern google matches any user agent that contains "google", which includes Googlebot:

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} google [NC]
RewriteRule ^.*$ - [F]
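
If matching on "google" turns out to be too broad for your site, a narrower pattern is one possible alternative. The sketch below assumes the crawler you want to block identifies itself with the token "Googlebot" in its user agent string. Other clients whose user agent merely contains the word "google" would then no longer be affected.

```apache
RewriteEngine on
# Assumption: the bot to block sends a user agent containing "Googlebot"
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]
# "-" leaves the URL unchanged; F returns 403 Forbidden
RewriteRule ^.*$ - [F]
```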

