Oh Semalt. What are you doing now?

My exposure to Semalt is pretty much limited to filtering their crawlers from websites and removing their referrer from Analytics so my data isn’t being uselessly polluted with visits that never happened.  I’m sure that some people probably get some benefit from them, but given that their website doesn’t actually say what they do, it’s difficult to give a reasonable opinion.

What is clear is that a lot of people don’t like them.  And that’s understandable.  Once you’ve managed to block the fake referrals from Semalt, you then find that you’re troubled by other ones like Buttons for Websites or Kamba Soft.  It’s frustrating, and it’s not really helped by the constant identikit responses from Nataliya, their Twitter rep:

There are two reasons most people don’t complain about search engines’ bots crawling their sites:

  • The bots don’t appear in your analytics reports
  • They provide traffic in return for your data.  For free.

There seems to be another problem with Semalt though.  Here’s a pic of the Google results for a search of site:semalt.com.  Notice anything?


It looks as though Semalt caches a version of each client’s site on its own unique subdomain.  I can think of a tonne of reasons why they might do this.  Stuff like usability testing or page analysis.  I can’t think of a single reason why that page should be accessible to search engines.

I know what you’re asking.  And I’ll be honest, I was wondering the same thing.  What happens if you click the link?



You get redirected to Semalt’s home page – the one that doesn’t tell you what they do.  Also, as an aside, the grammar here really boils m’piss.  Not quite as much as adverts that encourage me to “shop the latest styles”, but a lot.  Anyway…

That’s not what Google sees though.  If you look at the cached versions of all the mysterious subdomains on Semalt, you see actual real websites there…


I searched the Google to see if Semalt pages were appearing in the search results for the brand names of the companies that they mirror.  They don’t seem to.  I haven’t searched exhaustively, because quite frankly, I’ve got better things to do.

So, is this a problem?

I can’t help thinking it might be.  These pages are after all duplicates of another website.  If you look closely at the cached version, it says that it’s Google’s cache of the website at its actual domain name, rather than at the Semalt subdomain.  The page itself, meanwhile, uses a 302 redirect to the Semalt home page.  So Googlebot seems to get the mirrored content while a human visitor gets bounced elsewhere.  That sort of suggests a conditional redirect to me.

Which is generally a bit of a no-no.
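If you fancy checking that sort of thing yourself, here’s a rough Python sketch of the idea: ask for the same URL once as a browser and once as a crawler, don’t follow redirects, and see whether only one of them gets bounced.  The function names and user-agent strings are mine, made up for illustration, and this assumes the site switches on the User-Agent header.  A site that keys off Googlebot’s IP addresses instead would sail straight past it.

```python
import http.client
from urllib.parse import urlsplit

# Illustrative user-agent strings (assumptions, not what any real client sends verbatim).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_status(url, user_agent, timeout=10):
    """Request url once, without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def looks_conditional(browser_result, crawler_result):
    """True when exactly one of the two visitors gets a 3xx response."""
    browser_redirected = 300 <= browser_result[0] < 400
    crawler_redirected = 300 <= crawler_result[0] < 400
    return browser_redirected != crawler_redirected
```

To use it, call `fetch_status(url, BROWSER_UA)` and `fetch_status(url, CRAWLER_UA)` for the subdomain you’re curious about and hand both results to `looks_conditional`.  A `True` is a hint worth following up, not proof of anything.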

In fact, depending on how charitable you were feeling in your judgement of this, you might say the following:

Duplicate content

Sneaky redirect

Feels like


Oh, if you’re bothered by Semalt appearing in your referrer list in Analytics, there’s a plug-in for it that you can find on Rishi’s blog.
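And if you’d rather scrub the junk out of exported data by hand, the filtering itself is simple enough.  Here’s a minimal Python sketch, not the plug-in’s method: it assumes your hits are dictionaries with a `referrer` field (a made-up shape), and only `semalt.com` comes from this post.  The other two domains are my guesses at what sits behind Buttons for Websites and Kamba Soft, so check them against your own reports.

```python
from urllib.parse import urlparse

# semalt.com is from the post; the other two entries are assumed domains.
SPAM_REFERRER_DOMAINS = {"semalt.com", "buttons-for-website.com", "kambasoft.com"}


def is_spam_referrer(referrer, blocklist=SPAM_REFERRER_DOMAINS):
    """Return True if the referrer's host is a blocked domain or a subdomain of one."""
    if not referrer:
        return False
    # urlparse only fills in hostname when a scheme or leading // is present
    candidate = referrer if "//" in referrer else "//" + referrer
    host = (urlparse(candidate).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocklist)


def drop_spam_hits(hits, blocklist=SPAM_REFERRER_DOMAINS):
    """Keep only the hits whose 'referrer' field is not on the blocklist."""
    return [h for h in hits if not is_spam_referrer(h.get("referrer"), blocklist)]
```

Matching on the hostname rather than on a plain substring means something like `notsemalt.com` is left alone while any Semalt subdomain still gets caught.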