Last week, Google News’s Krishna Bharat spoke at Columbia University about what makes his search engine so helpful and efficient for journalists. Reporters and editors don’t need to spend time thinking about marketing the news they produce to the whole world’s audience, he argued—they can concentrate on the production side, and Google’s algorithm will take care of delivering readers to them. Nevertheless, most online writers and editors do take that extra, buzzword-y step of “search engine optimization” before they publish. Thoughtful, specific headlines help; so do tags and keywords.
But what about when that very effective system gets taken advantage of? Especially in the past few months, there has been an explosion of sites that exist for the sole purpose of waylaying a curious Googler for just a second, enough to win a few page views and flash a few ads in her face. “Scraper sites” or “mirror sites” that copy and paste whole articles from reputable sites into ad-happy spam sites are one problem. Content farms like Demand Media and Associated Content are another. Some Google users have noticed a difference in search quality lately, and they say that smart spammers are learning to game the system, making it harder for the rest of us to find what we want online.
“Google is being infiltrated on a vast scale by content farms,” wrote ReadWriteWeb. “Google has become a jungle: a tropical paradise for spammers and marketers,” wrote TechCrunch. “Google has become a snake that too readily consumes its own keyword tail,” wrote blogger Paul Kedrosky.
Jeff Atwood of programming help site Stack Overflow writes about an annoying phenomenon wherein readers searching for the site’s content on Google would be directed to “scraper sites” or “mirror sites” that had copied and pasted the relevant pages onto an ad-happy spam site. Worse yet, the original content wouldn’t show up in Google searches at all, or it would be so far down the list that people would give up. For a similar story (with pictures!), check out this shoe blog post, in which the author searches Google for a specific blog post, with title and source, but doesn’t find it until scrolling through an entire page of irrelevant results.
On Friday, Google’s principal engineer Matt Cutts wrote on the company blog that he and his team are aware of the criticism and are working to respond to the challenge. For instance:
The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words—the sort of phrases you tend to see in junky, automated, self-promoting blog comments. We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.
Cutts previously told CNET that over 200 factors determine where a website will rank in Google’s search results, and that tweaks and changes to the algorithm happen every day. Google has also launched a Google Chrome extension that allows users to quickly provide feedback about spam they encounter online, similar to the “Report Spam” button within the Gmail interface. (Download the extension, learn how to use it, and chuckle at the many spam comments below the post here.) Search Engine Land reports that Cutts has also hinted at giving users the power to block or “blacklist” entire domain names from their Google searches. Would you block Answers.com or eHow.com from your Google searches if you could?
Joseph Tartakoff at paidContent makes a great point:
Any move by Google that could give less prominence to results from sites like the Yahoo Contributor Network or Demand Media would be a major blow to those companies’ business models, which in large part depend on being ranked highly on the search engine. The timing of the announcement could not be worse for Demand Media, which is expected to go public next week.
I've been reading the press comments on Google search spam. Even Paul Krugman has weighed in. I have yet to find one story that indicated that the "reporter" had made so much as an outgoing phone call to gather information. Other than one article in Fortune by Seth Weintraub, the "analysis" consists of regurgitating press releases.
The blog post above does not improve on this standard.
#1 Posted by John Nagle, CJR on Tue 25 Jan 2011 at 01:33 PM
..I did the same search, and the answers.com (fourth best result) answer was actually the *only one* on the first page of SERPS that identified the bird as a "Tick Bird" -- which was the bird you were thinking of.
With that information, you could then look up "Tick Bird" on wikipedia and get a lot more information.
So ... to my mind, the answers.com snippet actually provided value, by creating a link between your search query and the information you were seeking. It is actually the only result on the first page that gets you closer to the answer you were looking for.
#2 Posted by Andrew Boer, CJR on Tue 25 Jan 2011 at 03:46 PM
My husband runs a site that helps teach people how to convert xml to pdf, etc. He says that he gets a lot of spam comments, but ever since he hired a moderator things have gotten better. Google would need about a billion moderators to help their problem.
#3 Posted by Lyla Burns, CJR on Tue 29 May 2012 at 11:31 AM