Semalt Expert On How To Prevent Referral Spam From Hurting Your Web Analytics

Web analytics is critical because it measures site activity. Universal Analytics (UA) offers more functionality than the older Google Analytics, so more users should adopt it. The downside of UA is that it receives a lot of referral spam, although that alone is not a sufficient reason to avoid upgrading. Left unchecked, spam can seriously distort analytics, especially for SMEs. Lisa Mitchell, the Customer Success Manager of Semalt, describes how to overcome this annoying spam.

Referral Spam

Referral spam is any non-human visit that appears in the analytics report. To review all referrer domains, open the Google Analytics report and select All Traffic from the Acquisition tab. Referral spam is the result of robots and spiders crawling the site, or of robots sending data to UA to log visits that never happened.

Why this is a problem and why you should care

Referral spam adds visits to the site that never occurred. It skews the information in analytics and creates a false picture of the site's performance, typically in the form of inflated bounce rates and an understated conversion rate.

What is the point and why do they do this?

The objective behind referral spam is to get unsuspecting people to visit the source site. When these URLs appear in the analytics report, they play on the owner's curiosity about which content is generating so much traffic. Never visit a site you do not recognize. Most of these sites are relatively harmless, as they are only looking to gain traffic and boost their search ranking. But, like any other spam, they could link back to a malicious site, which is why it is best to avoid them entirely.

Types of Referral Spam

Before attempting to stop referral spam, one must understand the forms it takes. There are essentially two: crawlers that actually visit the site, and robots that only send ghost referrals. Since they act differently, they must be tackled differently.

Crawlers

Crawlers disguise themselves as legitimate websites and follow links with the intention of crawling the site. They are mostly programs that attempt to visit every page on the site. Legitimate crawlers gather information that helps make the web easier to use. Shady crawlers crawl the web only to leave their URL behind so that they get a backlink to their site. Block these using the .htaccess file or set a custom filter in Google Analytics.
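To illustrate the kind of check a server-side block performs, here is a minimal Python sketch that compares a request's Referer header against a blocklist. It is not an .htaccess rule; the domain names and the function name is_blocked_referrer are hypothetical and chosen only for this example.

```python
from urllib.parse import urlparse

# Hypothetical spam domains, used only for illustration.
BLOCKED_REFERRERS = {"spam-crawler.example", "free-traffic.example"}


def is_blocked_referrer(referer_header, blocked=BLOCKED_REFERRERS):
    """Return True when the Referer header points at a blocked domain
    or at one of its subdomains."""
    if not referer_header:
        return False
    # urlparse only finds a hostname when the value includes a scheme (http://...).
    host = (urlparse(referer_header).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocked)
```

A request carrying a Referer of http://www.spam-crawler.example/page would be flagged, while a request with no Referer header at all would pass through.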

Ghost Referrers

These are programs as well, but they operate differently from crawlers. Universal Analytics includes a Measurement Protocol that makes it possible to measure and monitor offline activities. Some individuals with malicious intent take advantage of it and send random data to Google Analytics IDs. They send as much data as they can to increase the chance of getting a hit. When a hit lands, it is recorded as a visit and includes the referral source, in the hope that some people follow it back to the referrer site.
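To make the mechanism concrete, the sketch below parses a Measurement Protocol style payload and pulls out the fields that matter for spotting spam. The parameter names (v, tid, cid, t, dh, dp, dr) are the Universal Analytics Measurement Protocol ones; the property ID, the referrer domain, and the function name summarize_hit are made up for illustration, and the "(not set)" default only mimics the label seen in reports.

```python
from urllib.parse import parse_qs


def summarize_hit(payload):
    """Extract the fields relevant to referral spam from a hit payload."""
    fields = parse_qs(payload)

    def first(key):
        # Use "(not set)" as a label when the sender left the field out.
        return fields.get(key, ["(not set)"])[0]

    return {
        "property_id": first("tid"),  # tracking ID the hit was sent to
        "hit_type": first("t"),       # e.g. pageview or event
        "hostname": first("dh"),      # document hostname
        "referrer": first("dr"),      # document referrer
    }


# Hypothetical ghost hit: the sender guessed a property ID and supplied a
# referrer, but sent no hostname because it never saw the real site.
ghost_hit = "v=1&tid=UA-00000-1&cid=555&t=pageview&dp=%2F&dr=http%3A%2F%2Fspam-site.example"
summary = summarize_hit(ghost_hit)
```

In this example the summary carries the spammer's referrer but no hostname, which is the gap a hostname filter can take advantage of.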

Ghost Events

Some new bots now send Analytics Event information. To see whether any Ghost Events show up, open Behavior, then Events, and navigate to the Top Events report. Like ghost referrals, these events are an attempt to lure novice analytics users into visiting the spammer's site.

Fighting Referral Spam

Editing the .htaccess file does not work for Ghost Referrals and Ghost Events, because these hits are sent straight to Google Analytics and never reach the web server. Filter these domains using custom filters or custom segments in Google Analytics.

Filters for Ghost Referrers

The key is that ghost referrers do not know anything about the website they target. The hostname is the address the visitor uses to reach the site, and a version of the site's hostname appears in the Google Analytics report for genuine visits. Ghost referrers, however, are listed as (not set) or under the name of another website. To list all hostnames, set a long time range such as two years, click on Technology, then Network, and set the primary dimension to Hostname. The report returns every hostname recorded for the site over the past two years.
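Once the hostname report is in hand, sorting it into valid and suspect entries is mechanical. The following is a minimal sketch that assumes the report has been copied out as (hostname, sessions) pairs; the hostnames, the session counts, and the function name split_hostnames are hypothetical.

```python
# Hypothetical list of hostnames that genuinely serve the site.
VALID_HOSTS = {"example.com", "www.example.com"}


def split_hostnames(report_rows, valid_hosts=VALID_HOSTS):
    """Split (hostname, sessions) rows into valid and suspect lists."""
    valid, suspect = [], []
    for hostname, sessions in report_rows:
        target = valid if hostname.lower() in valid_hosts else suspect
        target.append((hostname, sessions))
    return valid, suspect


# Made-up rows standing in for a two-year hostname report.
rows = [("www.example.com", 1200), ("(not set)", 85), ("spam-site.example", 40)]
valid, suspect = split_hostnames(rows)
```

Here the (not set) row and the unrelated domain land in the suspect list. Before discarding anything, check the suspect list for legitimate services you may have forgotten, such as a payment gateway or translation service that loads your tracking code.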

Setting Up the Filter

Make a list of all the hostnames you wish to allow. Then open Google Analytics, head to the Admin section, and under View, click on Filters. Create a new filter, give it a name such as "Valid Hosts," and under Filter Type, select Custom. Select Include and choose Hostname in the Filter Field. Enter all the valid hosts, separating each with a vertical bar (|). Leave the "Case Sensitive" checkbox blank and save the filter.
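The filter pattern is treated as a regular expression, so it can help to build and test it before pasting it into the filter. The sketch below joins a list of hostnames with vertical bars, escaping the dots so they match literally, and checks a hostname against the result. The hostnames and the function names are hypothetical, and Python's re module stands in for the matching Google Analytics performs, which may differ in detail.

```python
import re


def build_hostname_pattern(hosts):
    """Join hostnames with '|' and escape regex metacharacters such as '.'."""
    return "|".join(re.escape(host) for host in hosts)


def is_valid_host(hostname, pattern):
    """Case-insensitive partial match, mirroring an unchecked
    "Case Sensitive" box."""
    return re.search(pattern, hostname, flags=re.IGNORECASE) is not None


# Hypothetical valid hosts.
pattern = build_hostname_pattern(["example.com", "www.example.com"])
```

With these inputs the pattern matches www.example.com and rejects (not set), so only hits carrying a real hostname would remain in the view.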

When doing all this, apply the filter in a separate "Test" view first, and keep a view with the unfiltered source data for comparison purposes.

Filters for Crawlers

List the crawlers you want to exclude. The procedure is the same as for Ghost Referrers. The only difference is that instead of "Include," you choose Exclude and select Campaign Source in the Filter Field. Enter the list of crawlers, separating them with a vertical bar.

Identifying Crawlers

Crawlers record their own sessions with a 100% bounce rate and a single page per session. They also show 100% new users.
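These three signals can be checked together when scanning a referral report. The sketch below assumes each row has been exported as a dictionary; the field names, the sample values, and the function name looks_like_crawler are hypothetical. A match is a prompt for a closer look, not proof, because a genuine low-traffic referrer can show the same numbers.

```python
def looks_like_crawler(row):
    """Flag a referral source that shows all three crawler signals."""
    return (
        row["bounce_rate"] == 100.0
        and row["pages_per_session"] == 1.0
        and row["pct_new_users"] == 100.0
    )


# Made-up report rows for illustration.
report = [
    {"source": "spam-crawler.example", "bounce_rate": 100.0,
     "pages_per_session": 1.0, "pct_new_users": 100.0},
    {"source": "partner-blog.example", "bounce_rate": 42.5,
     "pages_per_session": 3.2, "pct_new_users": 61.0},
]
candidates = [row["source"] for row in report if looks_like_crawler(row)]
```

The resulting candidates list can then feed the exclusion list described under Filters for Crawlers.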

Filters vs. Segments

Filters keep unwanted data out of that particular view entirely. However, a filter only works from the date of its creation onward. Analyzing old data requires the use of segments instead.