If you boil Blodget’s 4,000 words down to a single idea, it’s basically this: over-aggregation.
Now the concept of over-aggregation is not well defined, and means different things to different people. To Ryan McCarthy, who used to work at the Huffington Post and is acutely attuned to such things, over-aggregation is what happens when Outlet A writes a story and then Outlet B basically rewrites or copies the story so that there’s no reason to click through to A any more. HuffPo and Business Insider have both been accused of this, as have sites like Newser.
But that’s clearly not what was happening with Marco’s posts, so let’s put that kind of over-aggregation to one side for the moment. The dispute between Marco and Business Insider relates to something different — which is what happens when TBI links out directly to other people’s blog posts.
Now I’m a great believer in linking out directly to other people’s blog posts: I’ve built an entire website which does nothing else. And Counterparties.com doesn’t just have external links, either: each link also comes with a dedicated permalink, like this one.
But here’s the thing: we build Counterparties.com by hand, we write every headline on the site, we add a tag to it, and so on. What you see on Counterparties is our unique content. It links to other sites, but it doesn’t copy anything from those sites. And we link out maybe 20 or 30 times a day, tops. This is not some kind of copy-and-linking robot algorithm; it’s a hand-built list of artfully curated links.
At TBI, by contrast, the areas of the site with nothing but external links work very differently. There are two such areas: one’s a column called “Read Me” which appears on the right-hand side of the page if you scroll down a bit, and the other is a dedicated section called “The Tape”. For readers navigating the site, both of them work as they should: you see the headline, you click on the link, you go straight to the other website.
But behind each of those links is a huge CMS (content management system) architecture, whereby every external link is generated from a dedicated permalink page which people navigating the website are never supposed to see.
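To make that architecture concrete, here is a rough sketch in Python of how a stub-backed link box might work. It is only an illustration, not TBI's actual code: the `ExternalLink` record, its fields, and the function names are all hypothetical, and the example URL is a placeholder.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ExternalLink:
    """One stored record per outbound link (hypothetical data model)."""
    slug: str        # path of the hidden stub page on the aggregator's own site
    headline: str
    author: str
    source_url: str  # the third-party page the reader actually lands on
    excerpt: str     # text copied from the third-party page


def render_box_link(link: ExternalLink) -> str:
    """What a reader sees in the link box: a link straight to the source."""
    return f'<a href="{escape(link.source_url)}">{escape(link.headline)}</a>'


def render_stub_page(link: ExternalLink) -> str:
    """The permalink page behind the link, which readers never navigate to."""
    return (
        f"<h1>{escape(link.headline)}</h1>\n"
        f"<p>By {escape(link.author)}</p>\n"
        f"<p>{escape(link.excerpt)}</p>\n"
        f'<a href="{escape(link.source_url)}">Read the original</a>'
    )


example = ExternalLink(
    slug="some-headline",
    headline="Some Headline",
    author="A. Writer",
    source_url="https://example.com/post",
    excerpt="First few lines of the post.",
)
print(render_box_link(example))
```

The point of the sketch is that one stored record produces two outputs: the visible link, which goes to the other site, and a stub page on the aggregator's own domain, which a search engine can still crawl.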
If you go to Yahoo Site Explorer, it’ll tell you that TBI has — get this — 465,825 separate pages. Now the likes of Henry Blodget and Joe Weisenthal are undeniably prolific, but there’s no way you get to 465,825 pages manually. TBI is about four years old, if you go back to its first incarnation as Silicon Alley Insider; 465,825 stories over four years works out at well over 300 stories per day.
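For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope calculation in Python. It assumes exactly four 365-day years, which is only an approximation of the site's age.

```python
# Page count as reported by Yahoo Site Explorer in the post.
TOTAL_PAGES = 465_825
YEARS = 4             # approximate age of the site
DAYS_PER_YEAR = 365   # ignoring leap days

pages_per_day = TOTAL_PAGES / (YEARS * DAYS_PER_YEAR)
print(f"{pages_per_day:.0f} pages per day")  # prints "319 pages per day"
```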
So most of those pages, it turns out, were generated by robots without any human input at all: they look like this, or like this, and they’re just pages which copy-and-paste the headline, the author, and some of the content from third-party websites.
According to Blodget, this huge mass of robo-pages at TBI has an entirely innocent explanation. “To put something into the ReadMe box,” says Blodget, “we need to have a page with the headline and sub-head and author on our site, even if the page will never be seen by our readers.” It’s just a technical necessity! Nothing nefarious about it!
To be honest, it’s not a technical necessity. Other sites which link out a lot — Drudge, say — don’t have millions of hidden permalink pages generating every link on the home page. And Blodget protests a bit too much, I think, when he says he gets no googlejuice from these pages:
“In the past, these pages have been indexed by Google, but because they include a link back to the originating site and page, they do not generate much (if any) SEO value for us. They exist only because it was easier for our developers to use the existing post-headline-author metaphor in our publishing system than to create the Tape entirely from scratch.
“We always include a link to the original post on this stub page, so Google won’t conclude that we produced the original story.”
I don’t think that Blodget is trying to get Google to link prominently to his stub permalink pages; nor is he trying to fool Google into thinking that those pages constitute original TBI content.