Google’s most recent algorithm change targeted content farms specifically: sites that publish large volumes of crummy, low-quality content and exist solely to earn ad revenue. These sites often employed good SEO and outranked legitimate information sources on Google with their mostly meaningless articles.
Google’s Matt Cutts and Amit Singhal recently discussed the change with Wired, where we learned that it was known internally at Google as “big Panda.”
How Google Identified Content Farms
One of the big questions about this algorithm change was how exactly Google identified, algorithmically, which sites count as content farms. This is of particular interest not just because many sites report being innocent casualties (that always happens, and most of them are not innocent) but because eHow, one of the best-known sources of useless content farm information, was not hit by the ranking change.
Google shed a little light on the process in the interview. They used outside testers to determine what a “low quality” site was. The testers had a document with a series of questions to answer about each site.
Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?” … “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines.
From that data, they were able to sort sites into groups (these are high-quality, those are low-quality). With big Panda, they then tried to identify what was different on-page between those groups and altered the Google algorithm to detect those differences.
And we actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side. And you can really see mathematical reasons …
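To make the idea concrete, here is a minimal sketch of how rater labels could be turned into a classifier. Google has not published the signals or the model behind big Panda, so everything in it is an assumption for illustration: the two features (share of the page taken up by ads, thousands of words per page), the sample numbers, and the nearest-centroid method itself.

```python
from math import dist

# Hypothetical training data: (ad_share, thousands_of_words_per_page) -> rater label.
# Features and numbers are invented for illustration; Google has not disclosed its signals.
RATED_SITES = [
    ((0.05, 1.2), "high"),
    ((0.10, 0.9), "high"),
    ((0.08, 1.5), "high"),
    ((0.55, 0.3), "low"),
    ((0.60, 0.2), "low"),
    ((0.45, 0.4), "low"),
]


def centroids(rated):
    """Average the feature vectors of each rater-assigned group."""
    sums, counts = {}, {}
    for features, label in rated:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def classify(features, group_centroids):
    """Assign a site to the group whose centroid is closest."""
    return min(
        group_centroids,
        key=lambda label: dist(features, group_centroids[label]),
    )


GROUPS = centroids(RATED_SITES)
print(classify((0.50, 0.25), GROUPS))  # an ad-heavy, thin page
print(classify((0.07, 1.10), GROUPS))  # a content-heavy page
```

The point is only the shape of the process: humans label a sample of sites, and the algorithm learns which measurable on-page differences separate the two groups.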
And what about eHow, the notable content farm that seemed to be overlooked in Google’s content farm algorithm change? Singhal said one thing that may indicate eHow got through because the makeup of its content was just blurry enough to slip past the new classifier:
However, our classifier that we built this time does a very good job of finding low-quality sites. It was more cautious with mixed-quality sites, because caution is important.
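One way to picture that caution is a high confidence cutoff: a site is only demoted when the classifier is very sure about it, and everything in the gray zone is left alone. The sketch below assumes a hypothetical 0-to-1 "low-quality score" and an arbitrary cutoff of 0.9; neither comes from Google.

```python
def cautious_verdict(low_quality_score, demote_above=0.9):
    """Demote only when the classifier is very confident; otherwise do nothing.

    low_quality_score is a hypothetical 0-to-1 confidence that a site is low
    quality. The 0.9 cutoff is an illustrative value, not a known Google setting.
    """
    if not 0.0 <= low_quality_score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "demote" if low_quality_score >= demote_above else "leave alone"


# Made-up scores for three kinds of sites.
for site, score in [
    ("pure content farm", 0.97),
    ("mixed-quality site", 0.70),
    ("reference site", 0.05),
]:
    print(site, "->", cautious_verdict(score))
```

Under a rule like this, a site with a mix of useful and useless pages could land below the cutoff and escape, which would be consistent with what happened to eHow.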
Ads to Content Ratio Seems to be a Factor
One of the things we’re pretty sure is affecting this new big Panda Google algorithm is the number of ads on a page, or possibly the content-to-ad ratio. And let’s be honest: that makes perfect sense. When you go to a site that is plastered with ads, you have to scroll down just to find the first paragraph of content, and then there are more ads between each paragraph, you and I do not consider that a high-quality site. In fact, intuitively, we all know that’s the first sign of total garbage on the internet.
This is not, as some people contend, an issue of Google punishing Adsense users. It is instead Google punishing sites that are low quality and offer a poor user experience, and Google is willing to punish those sites even when they bring in Adsense revenue.
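The two symptoms described above, too much of the page given over to ads and too much scrolling before the first paragraph, are easy to measure. Here is a rough sketch that assumes a simplified, made-up page model: a top-to-bottom list of blocks, each marked as ad or content with a height in pixels. How Google actually measures this, if it does, is not public.

```python
def ad_metrics(blocks):
    """Return (ad_share, ad_pixels_before_first_content) for a page.

    blocks is a top-to-bottom list of (kind, height_px) pairs, where kind is
    "ad" or "content". This page model is a simplification invented for the example.
    """
    ad_px = sum(height for kind, height in blocks if kind == "ad")
    total_px = sum(height for _, height in blocks)

    # Add up the ad height a visitor scrolls past before reaching any content.
    ads_before_content = 0
    for kind, height in blocks:
        if kind == "content":
            break
        ads_before_content += height

    ad_share = ad_px / total_px if total_px else 0.0
    return ad_share, ads_before_content


# Two made-up layouts: one plastered with ads, one mostly content.
cluttered = [("ad", 600), ("ad", 250), ("content", 200), ("ad", 250), ("content", 200)]
clean = [("content", 900), ("ad", 250), ("content", 700)]

for name, page in [("cluttered", cluttered), ("clean", clean)]:
    share, before = ad_metrics(page)
    print(f"{name}: {share:.0%} ads, {before}px of ads before the first paragraph")
```

On these sample layouts the cluttered page is roughly three-quarters ads, while the clean one puts content first and keeps ads to a small slice of the page.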
These are sites that seem to see the Google Adsense heatmap and completely misunderstand it:
This map indicates how often ads are clicked, statistically, based on ad position. It suggests that an ad above your content will probably be clicked more often than an ad to the right of your content.
This is not suggesting that Google Adsense users should put an ad in every one of those positions! It is not recommending that they put an ad in every orange box. But this is exactly what you see low-quality content farm sites do.
There may be a few sites with excellent original content that lay out their pages and ads this way, but I think we can all agree that we’d rather see fewer sites like this on the internet. I for one will be happy to see fewer of them when I’m clicking on Google search results.