Henry Blodget has a long and detailed response to Marco Arment, which is fascinating to anybody interested in the nuts and bolts behind a modern for-profit blog.

If you boil Blodget’s 4,000 words down to a single idea, it’s basically this: over-aggregation.

Now the concept of over-aggregation is not well defined, and means different things to different people. To Ryan McCarthy, who used to work at the Huffington Post and is acutely attuned to such things, over-aggregation is what happens when Outlet A writes a story and then Outlet B basically rewrites or copies the story so that there’s no reason to click through to A any more. HuffPo and Business Insider have both been accused of this, as have sites like Newser.

But that’s clearly not what was happening with Marco’s posts, so let’s put that kind of over-aggregation to one side for the moment. The dispute between Marco and Business Insider relates to something different — which is what happens when TBI links out directly to other people’s blog posts.

Now I’m a great believer in linking out directly to other people’s blog posts: I’ve built an entire website which does nothing else. And Counterparties.com doesn’t just have external links, either: each link also comes with a dedicated permalink, like this one.

But here’s the thing: we build Counterparties.com by hand, we write every headline on the site, we add a tag to it, and so on. What you see on Counterparties is our unique content. It links to other sites, but it doesn’t copy anything from those sites. And we link out maybe 20 or 30 times a day, tops. This is not some kind of copy-and-linking robot algorithm; it’s a hand-built list of artfully curated links.

At TBI, by contrast, the areas of the site with nothing but external links work very differently. There are two such areas: one’s a column called “Read Me” which appears on the right-hand side of the page if you scroll down a bit, and the other is a dedicated section called “The Tape”. For readers navigating the site, both of them work as they should: you see the headline, you click on the link, you go straight to the other website.

But behind each of those links is a huge CMS (content management system) architecture, whereby every external link is generated from a dedicated permalink page which people navigating the website are never supposed to see.

If you go to Yahoo Site Explorer, it’ll tell you that TBI has — get this — 465,825 separate pages. Now the likes of Henry Blodget and Joe Weisenthal are undeniably prolific, but there’s no way you get to 465,825 pages manually. TBI is about four years old, if you go back to its first incarnation as Silicon Alley Insider; 465,825 stories over four years works out at well over 300 stories per day.
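For anybody who wants to check the back-of-the-envelope math, here is a minimal sketch in Python. The page count and the four-year lifespan come straight from the paragraph above; the 365-day year (no leap days, no days off) and the name `pages_per_day` are my own assumptions, chosen purely for illustration.

```python
# Figures taken from the text above.
TOTAL_PAGES = 465_825   # page count reported by Yahoo Site Explorer
YEARS = 4               # approximate age of the site

# Assumption: publishing every day of a 365-day year, leap days ignored.
DAYS_PER_YEAR = 365


def pages_per_day(total_pages, years, days_per_year=DAYS_PER_YEAR):
    """Average number of pages created per day over the site's lifetime."""
    return total_pages / (years * days_per_year)


rate = pages_per_day(TOTAL_PAGES, YEARS)
print(f"{rate:.0f} pages per day")  # prints "319 pages per day"
```

That is an average over every single day of those four years, weekends and holidays included. Count only working days and the daily figure gets higher still.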

So most of those pages, it turns out, were generated by robots without any human input at all: they look like this, or like this, and they’re just pages which copy-and-paste the headline, the author, and some of the content from third-party websites.

According to Blodget, this huge mass of robo-pages at TBI has an entirely innocent explanation. “To put something into the ReadMe box,” says Blodget, “we need to have a page with the headline and sub-head and author on our site, even if the page will never be seen by our readers.” It’s just a technical necessity! Nothing nefarious about it!

