Ghosts in the Machine: How to Combat Ghost Referral Spam in Google Analytics

I admit it, I’m a numbers guy.  I realize that some of you reading this cannot fathom how anyone could enjoy spending hours looking at spreadsheets, analyzing data, ferreting out trends and putting together reports, but I truly enjoy it.  Most of the time.

When the data is in our clients’ favor and shows their marketing efforts succeeding, I am elated; when the data tells a different story, I am dismayed but still enjoy the process of finding the solution to whatever is wrong.  But when the data I am given to analyze is unreliable and unable to truly tell me anything, well, that’s when it gets really frustrating.

Referring to the Issue at Hand

One of the tools I rely most heavily on in my day-to-day work is Google Analytics.  There is so much information to uncover, so many layers of data to tease out from within its segments and dashboards, and so many times when spending an hour or two chasing the numbers has eventually led me to a solution I could never have reached otherwise.  It is a remarkable and powerful tool.

It is also remarkably easy to spoof data in GA, and in recent months the ever-present issue of Spam Referral Data has become a major headache, thanks to a noticeable increase in the amount of ghost referral spam travelling around the Internet.

What Is It?

Of the many channels of website traffic, Referral Traffic is among the more actively sought and highly valued.  Referral Traffic, as its name implies, consists of visitors to your website who have arrived via links from other websites.  The volume of referral traffic your site sees is, in some ways, a measure of how popular your site is.  Those links matter to search engines, too: when Google sees a number of reputable sites linking to yours, it tends to treat your site as more authoritative, which in turn helps your positioning in search results.

Inexperienced Analytics users will occasionally see a sudden influx of hundreds upon hundreds of site visits from unfamiliar websites, and will be elated, thinking a link to their site has gone viral.  Unfortunately, what they are seeing is actually the tell-tale sign of a spam referrer.

Spam referrers vary in their methods and intents.  In general, they all hope you’ll see their site name (or the name of the site they are posing as), wonder what caused them to link to you, and click to visit the link.  In many cases, these spammers are less-than-ethical “black hat” SEO or marketing outfits that have sold a “guaranteed quality traffic” package to the very website they are posing as.  You click and land on that site’s page, and you’re more than likely going to spend a few moments clicking around the site trying to figure out where the non-existent link to your site is located.  By the time you give up, your visit has been tallied as traffic with a decent session duration and maybe even multiple page views.

Other spam referrers are not in it to make money, but are hoping to entice your click for more nefarious purposes, such as passing malware, Trojans or viruses on to your machine.  For this reason, your best practice is to never click on an unfamiliar referrer’s link in Google Analytics.  Doing a Google search of the referrer will quickly tell you whether you’ve got a spammer.

An Ounce of Prevention

The best way to combat spam referrers is to set up a filtered view within your Google Analytics account which screens out the spammers.  If you manage several accounts through Analytics, setting up a block at the server level can often be more efficient.  You need to be constantly vigilant, however, because the spammers are insidious.  Just when you think you have them filtered out, they alter their domain name or IP address just enough to skirt past the filters and go about their spamming.  Updating filters is an ongoing chore, but necessary.
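To show why an exclude-style filter needs constant upkeep, here is a minimal Python sketch of a referrer blocklist.  The domain names and the is_blocked helper are hypothetical placeholders, not a real spam list or anything built into Analytics; the point is only that a slightly altered domain slips straight past a pattern written for the original.

```python
import re

# Hypothetical blocklist; these domains are placeholders, not a real spam list.
BLOCKED_REFERRERS = ["spam-example.com", "bad-seo-example.net"]

# One exclude pattern: each domain is escaped and joined with "|", and the
# pattern matches the domain itself or any subdomain of it.
EXCLUDE_PATTERN = re.compile(
    r"(^|\.)(" + "|".join(re.escape(d) for d in BLOCKED_REFERRERS) + r")$",
    re.IGNORECASE,
)


def is_blocked(referrer_host: str) -> bool:
    """Return True if the referrer hostname is on the blocklist."""
    return EXCLUDE_PATTERN.search(referrer_host.strip()) is not None


if __name__ == "__main__":
    # The last entry is a made-up variant of the first blocked domain.
    for host in ["spam-example.com", "www.spam-example.com", "spam-example.co"]:
        print(host, "blocked" if is_blocked(host) else "gets through")
```

The variant in the last line is not on the list, so it gets through until someone adds it.  That is the ongoing chore described above.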

Now for the bad news: there is an ever-growing segment of spam referrals that, no matter how thoroughly you build your filters, you cannot screen out.  Why not? Because they actually never visited your site at all.  These are the Ghost Referrals, and it can feel like chasing ghosts trying to keep your data free of them.

Ghost Referrals work by using brute force programs to guess Google Analytics Tracking IDs, running through every possible combination that fits the UA-XXXXXXXX-X pattern that Google’s tracking IDs follow.  Eventually they happen onto a valid ID, and use it to send a fake pageview directly to Google’s tracking service.  Google’s tracking cannot differentiate between real and fake pageviews; it only looks for valid IDs.  When it sees a hit carrying your Google Analytics Tracking ID, it tallies it in your data reporting with whatever referral source the spammer has spoofed.  Your server-level and account-level filters are useless because the spammer never actually arrived to trip them!
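To make that concrete, here is a rough Python sketch of what such a hit looks like from the receiving end.  It assumes the hit is a plain query string using the Universal Analytics Measurement Protocol parameter names (tid for the tracking ID, dr for the referrer, dh for the hostname); the payload itself, the tracking ID in it and the describe_hit helper are made up for illustration.

```python
import re
from urllib.parse import parse_qs

# Shape of a Universal Analytics tracking ID: "UA-", account digits, "-", property digits.
TRACKING_ID_RE = re.compile(r"^UA-\d+-\d+$")


def describe_hit(payload: str) -> dict:
    """Pull out the fields of a Measurement Protocol-style payload that matter here."""
    fields = parse_qs(payload)

    def first(key: str) -> str:
        # parse_qs returns a list per key; take the first value or "".
        return fields.get(key, [""])[0]

    return {
        "tracking_id": first("tid"),
        "id_well_formed": bool(TRACKING_ID_RE.match(first("tid"))),
        "referrer": first("dr"),
        # A hit that sends no hostname is treated here as "(not set)".
        "hostname": first("dh") or "(not set)",
    }


# Made-up ghost hit: the ID and referrer are placeholders, and no hostname is sent.
ghost = "v=1&tid=UA-12345678-1&cid=555&t=pageview&dr=http%3A%2F%2Fspam-example.com%2F&dp=%2F"

if __name__ == "__main__":
    print(describe_hit(ghost))
```

Nothing in the payload proves that a browser ever loaded one of your pages.  Every field, including the referrer, is whatever the sender chose to type in.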

Who You Gonna Call?

I gave you the bad news, so now here’s the good news:  you can guard against the Ghost Referrers, too, and it’s not overly difficult to do.  You simply have to reverse your thinking.  Rather than trying in vain to filter out the bad traffic, create a filter that only allows the good traffic to come in.  You can do this using the Hostname dimension in Analytics.

Under Acquisition in the Analytics menu, find All Traffic > Referrals.  Set your Secondary Dimension to Hostname.  You’ll quickly see a list of the servers through which the referrers claim to have arrived at your site.  Those that actually did will show your server’s name (often identified by your URL) or the name of a server you recognize as valid.  For example, if you use a third party such as Volusion for the shopping cart on your site, you might see traffic coming through the hostname “volusion.com.”

The ghosts never visited your site, so they do not know the correct hostnames to use – and that is their fatal flaw.  Instead, they either use a fake hostname or report the hostname as “(not set).”  All you need do, therefore, is design your filter to only include traffic whose reported hostnames match those you know to be valid.  Just like that, the ghosts will vanish!
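As a minimal sketch of that include-only idea, the Python below builds one anchored regular expression from a list of known-good hostnames and checks a few sample values against it.  The hostnames are hypothetical stand-ins for your own, the helper names are mine, and I am assuming the resulting pattern would be used as the pattern of a custom Include filter on the Hostname field; test it against your own data before relying on it.

```python
import re

# Hypothetical valid hostnames for illustration; substitute your own.
VALID_HOSTNAMES = ["example.com", "www.example.com", "volusion.com"]


def build_include_pattern(hostnames):
    """Build one anchored regex that matches only the listed hostnames."""
    # Escape each name so "." is a literal dot rather than "any character".
    escaped = sorted(re.escape(h.lower()) for h in hostnames)
    # "^" and "$" stop a longer fake hostname from matching on a fragment.
    return "^(" + "|".join(escaped) + ")$"


INCLUDE_PATTERN = build_include_pattern(VALID_HOSTNAMES)


def is_valid_hostname(hostname: str) -> bool:
    """Return True only if the reported hostname is one we recognize."""
    return re.match(INCLUDE_PATTERN, hostname.strip().lower()) is not None


if __name__ == "__main__":
    print(INCLUDE_PATTERN)
    for host in ["www.example.com", "(not set)", "example.com.spam-example.net"]:
        print(host, "kept" if is_valid_hostname(host) else "filtered out")
```

Anchoring the pattern at both ends is the design choice that matters here: without it, a fake hostname that merely contains one of your real hostnames would still be let in.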

One caveat to this method: always, always make sure to have set aside one view for each account that you leave unfiltered.  If you mistakenly filter out good traffic and have no unfiltered view in place where it would have been captured, it will be lost forever.

If you’re running into issues with spam referrers distorting your site data, or even if you just don’t feel comfortable setting up Analytics or other SEO tools to help manage your website performance, start a conversation with us at Synapse today.
