If you hopped into a time machine that spat you out sometime between 1996 and now, you could almost pinpoint the year by the words used to describe an organization’s Web traffic. Hits? That would be 1998 or so. Page views? 2003-2005. Unique visitors? 2006-2007. Odds are that 2008-2009 is going to be the year of “time spent,” as in, “an average user spends four minutes and thirty-five seconds on our site.”
It’s reasonable to assume that the migration to online news would have given organizations an easy and precise way to calculate their Web readership. But the truth is we don’t even know what to count. And as online advertising grows, getting a credible traffic measure on which to base ad rates is becoming ever more important.
The statistics most publishers are familiar with are census data pulled from Web servers. These numbers, which count how many times a site is called into action through the Internet, are like red meat to a tiger—publishers love to pounce on them and proclaim they “know” their traffic. When Daily Green, the Hearst Corporation’s first online-only publication, launched in November 2007, its creator and editor-in-chief, Deborah Jones Barrow, was constantly tracking these census numbers, looking for validation. “We can put a story up and can tell whether it’s getting traction by our Omniture numbers within a half hour,” Barrow says, referring to a Web analytics tool that is based on server data.
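Census counting is conceptually simple. Here is a minimal sketch in Python of what it amounts to, tallying everything a server's access log records; the file name and the common (NCSA-style) log format are assumptions for illustration, and a real analytics product like Omniture is vastly more sophisticated:

```python
import re
from collections import Counter

# Census-style tally from a Web server access log. The file name and the
# common (NCSA) log format below are assumptions for illustration only.
LOG_LINE = re.compile(r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)')

hits = 0              # every request the server answers, images and scripts included
page_views = 0        # requests for pages only
visitors = Counter()  # naive "unique visitors," keyed by IP address

with open("access.log") as log:
    for line in log:
        m = LOG_LINE.match(line)
        if not m:
            continue
        hits += 1
        if m["path"].endswith((".html", "/")):
            page_views += 1
        visitors[m["ip"]] += 1

print(f"{hits} hits, {page_views} page views, {len(visitors)} 'unique' visitors")
```

Even in this toy version, "hits," "page views," and "unique visitors" diverge from one another immediately, which is part of why the vocabulary keeps changing.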
While they seem accurate enough, these server data include the “bots” and “spiders” that search engines send out by the hundreds to crawl the Web and index content for searches. These “readers,” as they are collectively known, give us the power to search the Web effectively, but they also inflate the number of visitors counted by internal servers. And the biggest inflation factor in these data is “cookie deletion.” “Cookies” are small pieces of data that a Web site stores in your browser so that the site will recognize you when you revisit it. That way, the server hosting the page won’t count you as a new visitor. But cookie deletion by users has become so common that it can inflate a site’s unique-visitor count by 150 percent, counting the same person as three or four different visitors in a single month, according to Andrew Lipsman, a senior analyst at comScore Inc., a global Internet information provider. If you think you have ten million visitors, it’s probably closer to four million, he says.
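Lipsman’s arithmetic is easy to verify: a count inflated by 150 percent is 2.5 times the true figure, so ten million reported visitors works out to ten million divided by 2.5, or four million actual people. A toy simulation makes the mechanism concrete; the weekly visit schedule and deletion habit below are invented for illustration, not comScore’s methodology:

```python
import uuid

# Toy model of cookie-based visitor counting. The weekly visit schedule and
# the deletion habit are invented for illustration, not comScore's data.
class Browser:
    def __init__(self):
        self.cookie = None

    def visit(self, site_visitor_ids):
        if self.cookie is None:            # no cookie? the site sets a fresh one
            self.cookie = uuid.uuid4().hex
        site_visitor_ids.add(self.cookie)  # the server counts this ID as a visitor

    def clear_cookies(self):
        self.cookie = None

visitor_ids = set()
person = Browser()
for week in range(4):           # one month of weekly visits
    person.visit(visitor_ids)
    person.clear_cookies()      # the habit that skews the census

print(f"1 actual reader counted as {len(visitor_ids)} 'unique visitors'")
# Lipsman's ratio: inflated by 150 percent means 2.5x reality,
# so 10,000,000 reported visitors is about 4,000,000 actual people.
print(f"{10_000_000:,} reported -> {10_000_000 / 2.5:,.0f} actual")
```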
With such broad discrepancies, there’s little chance that a savvy advertiser will agree to rates based solely on server data.
These server numbers are usually balanced against numbers from organizations like comScore and Nielsen/NetRatings, which recruit panelists through random-digit dialing; the panelists agree to have their every Web move monitored, just like the “Nielsen Families” used to determine TV ratings. For a fee, these companies will share traffic numbers with a site within a margin of error of plus or minus 2 percent, according to Mainak Mazumdar, the vice president of measurement science and panels at Nielsen/NetRatings.
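That figure is standard survey math: for a simple random sample of n people, the 95 percent margin of error on a measured share p is roughly 1.96 × √(p(1−p)/n). A quick check shows that a panel of about 2,500 yields roughly plus or minus 2 percent even in the worst case; the panel size here is an assumption for illustration, not Nielsen’s actual number:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95 percent margin of error for a sample proportion, simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# Assumed for illustration: the worst case (p = 0.5) on a 2,500-person panel.
print(f"+/- {margin_of_error(0.5, 2500):.1%}")   # prints: +/- 2.0%
```

The catch is that the formula assumes the panel is a truly random sample of the audience, which is exactly where the next objection lands.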
Plus or minus 2 percent sounds pretty definite, but if server data are guilty of overestimating traffic, these sample-based numbers are vulnerable to the charge of underreporting. For one thing, their panelists are drawn from people who have land lines, but at least 25 percent of young people (ages eighteen to twenty-nine) today use only cell phones, according to a 2007 federal survey, and these cell-only folks are arguably more representative of the Web-surfing public. Furthermore, these panels don’t account for international traffic or traffic from work computers, which generate a large percentage of hits for sites like cnn.com, according to Jack Wakshlag, the chief research officer for Turner Broadcasting.
“This may be the most measurable medium in history, but the measurements all suck,” says Steve Yelvington, an Internet strategist for Morris DigitalWorks, which manages the Web sites for more than sixty newspapers and thirty radio stations.