How to block spam crawlers (spam bots) in Google Analytics?

In the last few years, spam crawlers have increased sharply. Google Analytics consultants are aware of the problem and are still looking for a complete answer, because spam traffic adds a great deal of noise to the collected data. However, there are a few methods by which such crawlers can be controlled and avoided.

A bot's primary job is to crawl pages systematically. Spam bots, however, are used for unethical and harmful purposes such as planting spam on other websites. Some of the activities carried out by spam bots are as follows:

  • Increasing website traffic by spurious methods
  • Creating useless data
  • Exposing sites and machines to malware
  • Collecting email IDs and creating numerous fake accounts and domains

The list goes on. To begin with, when creating a Google Analytics account, create more than four properties. Each property has its own tracking ID, which helps identify spam domains because they do not link to the site’s ID.

Some of the best Google Analytics consultants have shared a few tips and tricks for removing these bots and avoiding spam domains.

1. Referral Spam: There are two types of referral spam:

Ghost Referral: Some spam domains pose as a valid website referring to your page and keep sending spam traffic without ever visiting it. Since no actual visit takes place, the standard ways of blocking them do not work.

Therefore, create a list of hostnames to use in a filter. Lists of spam bots that have been active over the years are easy to find on the Internet and can help you tell legitimate hostnames from spam ones.

You can also check the hostnames that are sending traffic by going to

Audience > Technology > Network > Hostname

Do not overlook references to well-known websites, since in most cases ghost spam is sent under their names.

After creating this list, open your Google Analytics account and go to

Administration > View filters > Edit Filters

Enter a filter name, select the Include option, choose Hostname as the filter field, and add the list you created as the filter pattern. Save the filter.
However, please note that this method does not remove spam from historical data.
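
The effect of an include filter on hostname can be sketched in Python. This is only an illustration of the matching logic, not Google Analytics code. The hostnames `example.com` and `shop.example.com` and both helper functions are hypothetical, and the sketch assumes the filter pattern is treated as a regular expression that may match anywhere in the hostname.

```python
import re

# Hypothetical list of hostnames that legitimately serve your tracking code.
VALID_HOSTNAMES = ["example.com", "shop.example.com"]


def hostname_include_pattern(hostnames):
    """Join hostnames into one regex, escaping dots so they match literally."""
    return "|".join(h.replace(".", r"\.") for h in hostnames)


def keep_hit(pattern, hostname):
    """Return True if an include filter with this pattern would keep the hit."""
    return re.search(pattern, hostname, flags=re.IGNORECASE) is not None


pattern = hostname_include_pattern(VALID_HOSTNAMES)
for host in ["www.example.com", "shop.example.com", "(not set)", "spam-site.xyz"]:
    print(host, "->", "kept" if keep_hit(pattern, host) else "dropped")
```

Because the pattern is not anchored, any hostname that merely contains a valid name would also be kept; a stricter pattern may be worth considering if that matters for your data.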

Non-Ghost Referral: Bots from these domains actually visit the website, so they are handled differently from ghost referrals. Several methods are available:

Filter: A different type of filter is used for domains that actually visit. To set it up, go to:

Administration > View Filters > Add filters to view

On this page, create a new filter and give it a name. Select the Exclude option as the filter type and choose Referral as the filter field. Enter the URL of the spam domain as the filter pattern, separating multiple domains with vertical bars (|).
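
A small Python sketch of how such a pattern could be assembled from a list of domains. The function name and the `.example` domains are hypothetical placeholders, and the sketch assumes the filter pattern is interpreted as a regular expression, which is why the dots are escaped.

```python
import re


def referral_exclude_pattern(spam_domains):
    """Build one filter pattern: escaped domains separated by vertical bars."""
    # Normalise case, drop blanks and duplicates, and sort for a stable result.
    unique = sorted({d.strip().lower() for d in spam_domains if d.strip()})
    return "|".join(d.replace(".", r"\.") for d in unique)


# Hypothetical spam referrers; substitute the domains seen in your own reports.
spam_domains = ["spam-one.example", "Spam-Two.example", "spam-one.example"]
pattern = referral_exclude_pattern(spam_domains)
print(pattern)

# A hit would be excluded when its referrer matches any alternative.
print(bool(re.search(pattern, "www.spam-two.example")))
print(bool(re.search(pattern, "news.example.org")))
```

The printed pattern can be pasted into the filter pattern field as a single line.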

.htaccess: This method uses server configuration code to block the spam domains altogether.

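RewriteEngine On
# Replace spam-domain\.com with the referrer domain you want to block
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?spam-domain\.com [NC]
RewriteRule ^ - [F]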

It is advisable to back up the file beforehand, as any misplaced code can break the site.
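
When several referrer domains need blocking, the rules can be generated rather than typed by hand. The following Python sketch assumes an Apache server with mod_rewrite enabled. The function name and the `.example` domains are hypothetical. Every condition except the last carries the OR flag, so a request is refused with 403 Forbidden if its referrer matches any listed domain.

```python
def htaccess_block_rules(spam_domains):
    """Return mod_rewrite lines that answer 403 Forbidden to listed referrers."""
    if not spam_domains:
        # Without any condition the rule would block every request.
        raise ValueError("at least one domain is required")
    lines = ["RewriteEngine On"]
    last = len(spam_domains) - 1
    for index, domain in enumerate(spam_domains):
        escaped = domain.replace(".", r"\.")
        # Conditions are chained with OR; the final one closes the chain.
        flags = "[NC]" if index == last else "[NC,OR]"
        lines.append(
            f"RewriteCond %{{HTTP_REFERER}} ^https?://([^/]+\\.)?{escaped} {flags}"
        )
    lines.append("RewriteRule ^ - [F]")
    return "\n".join(lines)


# Hypothetical domains; replace them with the referrers you want to block.
print(htaccess_block_rules(["spam-one.example", "spam-two.example"]))
```

Review the generated lines before adding them to .htaccess, and test on a staging copy of the site if one is available.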

WP-Ban: Since WordPress is a widely used CMS, it deserves a mention here. WP-Ban is a plugin for WordPress users who prefer not to edit .htaccess. It blocks offending sites entirely by IP address, URL or IP range.

Google Analytics: Tick the checkbox for ‘Exclude all hits from known bots and spiders’ on the View Settings page in Google Analytics. This keeps hits from well-known bots and spiders out of your reports; it does not stop them from crawling the site.

Beyond these methods, staying alert to such issues helps a lot:

  • Use a firewall and security software to guard against malware
  • Keep a tab on server logs
  • Check the site traffic for unusual spikes
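
Keeping a tab on server logs can be partly automated. The Python sketch below counts hits per referring hostname so that unusually active referrers stand out. It assumes the server writes the common "combined" access log format; the sample lines, addresses and domain are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_counts(log_lines):
    """Count hits per referring hostname, skipping lines without a referrer."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if not match or match.group("referer") in ("", "-"):
            continue
        host = urlparse(match.group("referer")).hostname
        if host:
            counts[host] += 1
    return counts


# Invented sample lines; in practice, iterate over your access log file.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "http://spam-one.example/" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /a HTTP/1.1" 200 512 "http://spam-one.example/x" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
for host, hits in referrer_counts(sample).most_common(5):
    print(host, hits)
```

Referrers that appear near the top of this list but are unknown to you are candidates for the filters and blocks described above.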

The following links have a list of all those sites that are the known sources of spam.

This list makes it easier for users to identify inappropriate domains and take the necessary measures to avoid noise in their data.
