Recognizing Spam Comments and Junk Sites
Right off the bat, I’m going to tell you that this post is geared more toward newbie bloggers. Established bloggers may learn a thing or two, but for most bloggers who have been around for a while, there isn’t anything new here! Unfortunately… and sadly.
Once you have been blogging for a while, you’ll learn to quickly identify spam comments and junk sites (spam sites, scraper sites, etc.), but for a new blogger, it can sometimes be a challenge. You can’t always be sure whether a comment came from some kind of bot or from someone who doesn’t speak English very well.
Of course, one thing you can do is visit their site. And, if you do that, you’re doing one of the things the spammer wants anyway. But, while you may be hesitant to visit a spammy site, it is preferable to the other two alternatives: (a) allowing the comment and giving a spammer a link or (b) deleting a comment from a real person.
In all cases, it’s best to recognize a comment as spam even without visiting the site. Fortunately, the majority of spam comments make this easy.
Spam Comments
Note that if you use something that automatically blocks spam comments, such as Akismet, a lot of these will be blocked without any intervention on your part.
First of all, any comment which has a simple statement, such as “Your blog is very well-structured. I enjoyed browsing here. I will return expediently again.” followed by a long list of URLs is almost certainly spam. Delete it.
Any comment which isn’t a comment but just a list of links is spam. Delete it.
Comments which don’t seem to relate at all to the topic of your post are almost always spam. You have to be somewhat careful here, as it is possible a person might have clicked the wrong link to leave a comment and actually intended to leave a comment on a different post. If you find that happens a lot, a re-design of your blog might be in order!
Comments which just mention certain little blue pills or other brand names or popular drugs with a link to the site are spam. They will also commonly bold the brand name, which makes them even quicker to identify as spam. Delete them.
Comments which may be lengthy but have about every other word as a link to some website are also almost certainly spam.
The vast majority of spam will fall into one of those five categories. So, if you’re not using a spam comment blocker, it makes it a little easier to quickly scroll through comments and clear out the spammy ones. And, even with blockers, there will still be ones that slip through, and you can just as easily identify them as spam.
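If you would rather have a script flag likely spam for you, the five categories above can be roughly sketched in Python. This is only an illustration, not how Akismet or any other blocker actually works. The function name, the thresholds, and the watch list of drug names are all my own assumptions, and you would want to tune them for your own blog.

```python
import re

# Matches bare URLs; a rough pattern chosen for this sketch only.
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
WORD_RE = re.compile(r"[A-Za-z']+")

# Hypothetical watch list; extend it with whatever brands hit your blog.
DRUG_TERMS = {"viagra", "cialis"}


def spam_signals(comment, post_keywords=()):
    """Return the names of the spam heuristics a comment trips (assumed thresholds)."""
    urls = URL_RE.findall(comment)
    text_without_urls = URL_RE.sub(" ", comment)
    words = [w.lower() for w in WORD_RE.findall(text_without_urls)]
    signals = []

    if urls and not words:
        # Not really a comment, just a list of links.
        signals.append("links_only")
    elif len(urls) >= 3 and len(words) <= 30:
        # A short, generic statement followed by several URLs.
        signals.append("short_praise_plus_links")

    if words and len(urls) >= 3 and len(urls) / len(words) >= 0.3:
        # Roughly every few words is a link.
        signals.append("link_dense")

    if DRUG_TERMS & set(words):
        signals.append("drug_brand")

    keywords = {k.lower() for k in post_keywords}
    if keywords and words and not (keywords & set(words)):
        # Shares no keyword with the post; treat as a hint, not proof.
        signals.append("off_topic")

    return signals
```

A comment that trips no signal may still be spam, and an off-topic flag alone may just mean someone clicked the wrong post, so treat the result as a prompt to look closer rather than a verdict.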
Junk Sites
Sometimes you’ll see trackbacks and linkbacks to your posts that contain some of the same text as your post. If you’re lucky, another blogger has found your post to be worthwhile and has copied a small excerpt and linked back to your post.
Many other times, it will be a junk or scraper site. Some sites do nothing but post small excerpts and link back to your site, which is not necessarily a bad thing. Of course, they hope to become recognized in a niche and get some advertising revenue out of the deal. These links aren’t too beneficial though, as most people don’t read those types of sites anyway. If they are linking to your site and copying only a small excerpt from your post, it’s generally best to let it be. Don’t allow a link back to them, but don’t ask them to remove the link to you either.
Other sites will more extensively copy your content, sometimes grabbing your whole post. Sometimes they may link back to you. Many times they will not. And still other times they won’t even credit you as the author. Instead, they will just copy your content and claim it as their own. Copying your content without permission is copyright infringement, and claiming it as their own is plagiarism. Unless you have released your post under some kind of open license that permits people to copy and share it, you can ask them to remove your content. The same goes if they didn’t credit you, unless your license permitted that. If they don’t comply, you can pursue legal options or, at the very least, you can look into filing a DMCA complaint with Google to get their website booted out of Google’s index.
Still other sites will copy excerpts from your site and may or may not link back to you (but a linkback is the most common way you’ll learn about them) and rewrite your excerpt in such a way that it makes no logical sense but is stuffed with keywords. Take this excerpt from my Putting Together an eCommerce Site post:
If you want to look into getting your own merchant account right at the start and handling credit card transactions directly, rather than through a third party, here are some of the more well-known payment gateway services. Some of them also have resellers, so you might also want to look for those resellers, who may be more familiar with different shopping cart solutions and better aid you in integrating your shopping cart with the payment processing system and making sure it meets PCI DSS requirements:
Now, that may be “rewritten” to something like this:
If you desire to view into obtaining a merchant account provider of your own correct at the beginning of small business e-commerce and handling credit card processing transactions directly, instead than through party of three, here are a few of the more famous payment gateway services of ecommerce. A handful of them too possess resellers, so you might too desire to view for those resellers of merchant account providers, who can be greater similar with varying shopping cart applications and better medicate you in merging your shopping cart with the payment credit card processing system and producing certainty it converges within PCI DSS standard deviations.
It makes no sense to the typical reader, but it is something search engines may like. You can’t even tell it was something you originally wrote. Typically such rewrites are done using scripts that just replace words and phrases. Technically, it is still copyright infringement, but it’s more difficult to do anything about. Sometimes you can’t easily tell whether your work was copied, and the result is not confusingly similar to your own. No one is going to see the two copies side by side and think the keyword-stuffed-but-gibberish version is the better quality version. Heck, it may even qualify as parody. In any event, you don’t want to link out to such sites, so be sure to delete any incoming trackbacks from them that might otherwise get posted as comments on your blog.
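To see why the output reads like gibberish, here is a minimal Python sketch of the kind of blind word-and-phrase replacement described above. I don’t know what any particular scraper actually runs. The function name and the substitution table are hypothetical, with the entries inferred from the before and after excerpts.

```python
import re

# Hypothetical substitution table, guessed from the two excerpts above.
REPLACEMENTS = {
    "third party": "party of three",
    "want": "desire",
    "look": "view",
    "well-known": "famous",
    "aid": "medicate",
}


def spin(text, table=REPLACEMENTS):
    """Swap each listed word or phrase for its substitute, ignoring context."""
    # Try longer phrases first so a phrase wins over a word it contains.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: table[m.group(0).lower()], text)
```

Because the script knows nothing about context, “look for” becomes “view for” and “aid you” becomes “medicate you”, which is exactly the kind of tell that gives these sites away.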
Anyway, those are probably the most common of spam comments and spam sites out there. Again, as you become more experienced in the blogging world, these things become easier to spot. Which is kind of sad if you really think about it.



I’ve noticed a new rage of “madlib” sites where the black hat SEO owners are smarter than ever. They will create a completely readable site which is scraped from places like Wikipedia. Many use a program like the one found at datapresser.com, using more complex strings to formulate hundreds of sites. They will not have any advertising on them at first. Six months later, after building massive backlinks using keyword anchor text (mostly through automated commenting and cloaked interlinking), they will load the sites up with ads and run them on autopilot.
There’s a big difference between a spammer and a smart black hat SEO. I found older comments on my blog where the site linked to actually changed their content to porn after the fact. Very deceptive indeed.
I found older comments on my blog where the site linked to actually changed their content to porn after the fact.
A good reason to switch off comments after a set period of time. Which is something I’ve been meaning to do but haven’t yet. I can be such a hypocrite sometimes.
I had comments set to go off after a certain amount of time, but I have since stopped using that plugin. Sometimes you do get legitimate comments on old posts. Don’t forget that sometimes you might want to highlight an older post and people might want to leave fresh comments on it.
Turning off comments is sort of like admitting defeat to the spammers. I say we fight for victory!
Good point, maybe that was my reasoning behind not turning off comments to begin with.
Bear in mind too that porn sites will often buy expired domains. So, in some cases, it’s possible that the original blogger let their domain name expire (or sold their blog) and a porn site operator snapped it up.
Most people don’t realize how valuable an expired domain is, especially one with a PR4 or better. It happens all the time. Also, if you know where to look, you can buy old databases (really large ones) loaded with information. Most of those local area dating sites from highly populated cities are built around them, in a bad way. It’s amazing how governmental agencies often sell the databases and don’t actually clean them. You could buy a used server from somewhere loaded with customer names and private information.
I occasionally use bad-neighborhood.com to find old comment links on my site pointing to porn sites. 99% of the so-called bad sites are not bad. They may have a naughty title or cursing in their content, so I don’t worry about them.
Nearly all the spam I receive is sent via some program that allows the miscreant to bypass the usual procedures. Akismet still picks it up, I delete it all, and yet they still bombard me several times per day from the same IP address.
I was thinking of naming and shaming those IP addresses but can’t help but wonder if that would put a big target on my head?
I don’t want to shut off old comments either. I have found that my highest traffic posts are the ones that aren’t frequented by the non blog crowd. I will get comments (without links) just thanking me for information or clarifying things I did not know about vintage stuff. So I really don’t want to shut off old comments. I also get new things to write about that way.
I think Dan is right about porn sites buying old domains… they will buy anything on the internet. There was a big news story here a couple of years ago, where a foreign porn site bought something like Waukesha Government dot org (I don’t know if that is the city, but I know it was South Milwaukee).
Anywhoo, something like 75% of the county’s first-time visitors were getting the porn site first. They were pretty irritated by the deal. I am not sure how they resolved it.