On the Job

Think You Know Your Web Traffic?

Think again. The scramble for online measures
April 3, 2008

If you hopped into a time machine that spat you out sometime between 1996 and now, you could almost pinpoint the year by the words used to describe an organization’s Web traffic. Hits? That would be 1998 or so. Page views? 2003-2005. Unique visitors? 2006-2007. Odds are that 2008-2009 is going to be the year of “time spent,” as in, “an average user spends four minutes and thirty-five seconds on our site.”

It’s reasonable to assume that the migration to online news would have given organizations an easy and precise way to calculate their Web readership. But the truth is we don’t even know what to count. And as online advertising grows, getting a credible traffic measure on which to base ad rates is becoming ever more important.

The statistics most publishers are familiar with are census data pulled from Web servers. These numbers, which count how many times a site is called into action through the Internet, are like red meat to a tiger—publishers love to pounce on them and proclaim they “know” their traffic. When The Daily Green, the Hearst Corporation’s first online-only publication, launched in November 2007, its creator and editor-in-chief, Deborah Jones Barrow, was constantly tracking these census numbers, looking for validation. “We can put a story up and can tell whether it’s getting traction by our Omniture numbers within a half hour,” Barrow says, referring to a Web analytics tool that is based on server data.

While they seem accurate enough, these server data include the “bots” and “spiders” that search engines send out by the hundreds to crawl the Web and index content for searches. These crawlers give us the power to search the Web effectively, but they also inflate the number of visitors counted by internal servers. And the biggest inflation factor in these data is “cookie deletion.” “Cookies” are small bits of data that a Web site stores in your browser so that the site will recognize you when you revisit it. This way, the server hosting the Web page won’t count you as a new visitor. But cookie deletion by users has become so common that it inflates a site’s traffic by 150 percent, counting the same visitor as three or four different visitors in a single month, according to Andrew Lipsman, a senior analyst at comScore Inc., a global Internet information provider. If you think you have ten million visitors, it’s probably closer to four million, he says.
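The arithmetic behind cookie deletion is easy to sketch. The toy simulation below (all numbers and names are hypothetical, not comScore’s actual model) shows how one person who clears cookies twice during a month gets handed a fresh ID each time and is logged as three “unique visitors”:

```python
import uuid

class Server:
    """Toy server-side counter: a browser arriving without a cookie
    is handed a fresh ID and counted as a brand-new visitor."""
    def __init__(self):
        self.seen_ids = set()

    def handle_visit(self, cookie):
        if cookie is None:                 # no cookie: looks like a new visitor
            cookie = str(uuid.uuid4())
        self.seen_ids.add(cookie)
        return cookie                      # the browser stores this cookie

server = Server()
cookie = None
for day in range(30):                      # one visit per day for a month
    if day > 0 and day % 10 == 0:          # user clears cookies every ten days
        cookie = None
    cookie = server.handle_visit(cookie)

print(len(server.seen_ids))   # server log: 3 "unique visitors", all one person
```

One person, three counted visitors: exactly the kind of multiplier that turns ten million reported visitors into something closer to four million real ones.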

With such broad discrepancies, there’s little chance that a savvy advertiser will agree to rates based solely on server data.

These server numbers are usually balanced with numbers from organizations like comScore and Nielsen/NetRatings, which recruit panelists through a random digit-dial process; the panelists agree to have their every Web move monitored, just like the “Nielsen Families” that are used to determine TV ratings. For a fee, these companies will share traffic numbers with a site within a plus-or-minus 2 percent error margin, according to Mainak Mazumdar, the vice president of measurement science and panels at Nielsen/NetRatings.


Plus or minus 2 percent sounds pretty definite, but if server data are guilty of overestimating traffic, these sample-based numbers are vulnerable to the charge of underreporting. For one thing, their panelists are drawn from people who have land lines, but at least 25 percent of young people (ages eighteen to twenty-nine) today use only cell phones, according to a 2007 federal survey, and these cell-only folks are arguably more representative of the Web-surfing public. Furthermore, these panels don’t account for international traffic or traffic from work computers, which generate a large percentage of hits for sites like cnn.com, according to Jack Wakshlag, the chief research officer for Turner Broadcasting.

“This may be the most measurable medium in history, but the measurements all suck,” says Steve Yelvington, an Internet strategist for Morris DigitalWorks, which manages the Web sites for more than sixty newspapers and thirty radio stations.

As online advertising balloons into a $20 billion market, according to the Interactive Advertising Bureau, it’s unclear how ad sellers and buyers can ever agree on rates. “My frustration is that I continue to see undigested B.S. numbers thrown around without any kind of critical examination,” says Yelvington. “When they’re used in sales presentations, they set us up for failure by creating expectations that aren’t realistic.”

Advertising is the main revenue source for online news organizations, but without a reliable and agreed-upon means to count eyeballs, why would any national advertiser opt for a news organization over a search giant like Google, a destination they know people swarm to by the hundreds of thousands?

For now, no Web-site operator really knows what her true traffic is. Neither the panel nor the census system is perfect, but by going back and forth between the two, publishers and advertisers can make relative comparisons within their market. That’s how the industry has continued to push forward.

“Sites do come to us with their own internal information, but we are still going to use the syndicated [panel] tools available,” says Julian Zilberbrand, the associate director of digital ad operations at MediaVest, a major online advertising buyer. “The truth will lie somewhere in between.”

But there is some hope for reconciliation between the two standards of counting traffic.

That’s where George Ivie, executive director and CEO of the Media Ratings Council (MRC), comes in. The council is a not-for-profit trade association formed in the wake of the 1960 Harris Committee Hearings. The committee investigated the business practices of radio and television and concluded that the broadcast media would self-regulate, including performing independent audits to determine the size of its audience. The ratings council was created to set standards for measuring broadcast audiences and to accredit organizations, like Nielsen, that measure those audiences. “We haven’t changed that much since the 1960s in that we have a focused mission—to try to improve for the industry the quality of media measurements,” says Ivie.

In 2002, the council turned its attention to the Internet, and has found it to be an entirely different kind of beast from TV and radio. The first thing it did was write standards for audience census data, which include page views, clicks on a page, and time spent on a site—all the new types of audience measurement born with the Internet.

News organizations that want their traffic counts to be accredited by the ratings council must first adopt the council’s standards, which include, for example, counting only content that is accessed through an end browser. (Some organizations would count URLs sent through e-mail as a page impression, without actually verifying that the site was ever visited through a browser.)
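The browser-only counting rule can be pictured as a log filter. The sketch below is purely illustrative (the user-agent strings and the `rendered` flag are invented stand-ins, and a real MRC audit is far more involved): a hit is countable only if it did not come from a known crawler and the page actually rendered in a browser.

```python
# Illustrative-only impression filter: count a hit only when the content was
# fetched by a real browser and actually rendered on screen. The log entries
# below are hypothetical stand-ins for a server's access records.
BOT_MARKERS = ("bot", "spider", "crawler")

log = [
    {"ua": "Mozilla/5.0 (Windows NT 5.1)", "rendered": True},   # real reader
    {"ua": "Googlebot/2.1 (+http://www.google.com/bot.html)",
     "rendered": True},                                         # search crawler
    {"ua": "Mozilla/5.0 (Macintosh)", "rendered": False},       # e-mailed URL, never opened
]

def countable(hit):
    ua = hit["ua"].lower()
    if any(marker in ua for marker in BOT_MARKERS):
        return False                  # strip bots and spiders
    return hit["rendered"]            # require an actual browser render

print(sum(countable(h) for h in log))   # 1 countable impression, not 3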

To date, the council’s standards have been agreed to only by its 108 affiliate news and advertising organizations. And only a handful of these affiliates—including Yahoo, MSN, Weather.com, Univision, AOL, and advertisers like Atlas and DoubleClick—have been audited by the MRC, giving these operations an advantage when negotiating advertising rates.

More important, though, the MRC began auditing comScore and Nielsen/NetRatings in 2006, a five-phase process that won’t be complete until later this year. Ivie says this will allow a reconciliation between panel- and census-based traffic estimates. “When we are done with the auditing, I’ll tell you how I feel about their random-digit panels,” says Ivie. “How do we know they are representative of the U.S.? Right now, I don’t.”

One concern, however, is that before the auditing of comScore and Nielsen/NetRatings is complete, the volatile online world will produce a new silver bullet in traffic metrics.

Last July, for instance, Nielsen/NetRatings decided that time spent on a site was more important than total Web page views, an acknowledgment of changing online habits that include watching video (which can keep readers on a site longer) and Ajax, a Web programming technique that allows a visitor to interact with a Web page—voting, tagging, or moving parts of the page around—without having to reload an entirely new page. At the same time, having a window open on a site for hours doesn’t mean the visitor is “engaged” the whole time—he could have simply gone to get a cup of coffee—which makes “time spent” an incredibly tough metric to measure accurately.
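The idle-tab problem is easy to see in a toy calculation. Assuming (hypothetically) that an analytics tool infers time spent from the gaps between page events, a naive sum treats a forgotten browser window as engagement, while a capped estimate has to guess where attention stopped; the session data and the five-minute cap below are both invented for illustration:

```python
# Hypothetical session: minutes into the visit at which page events fired.
# The visitor left the tab open from minute 3 to minute 48 (coffee break).
events = [0, 1, 3, 48, 50]

def time_spent(events, idle_cap=None):
    """Sum the gaps between consecutive page events; optionally cap long
    gaps on the assumption that the visitor had walked away."""
    total = 0
    for prev, cur in zip(events, events[1:]):
        gap = cur - prev
        if idle_cap is not None:
            gap = min(gap, idle_cap)   # treat anything longer as idle time
        total += gap
    return total

print(time_spent(events))              # naive: 50 minutes "spent"
print(time_spent(events, idle_cap=5))  # capped: 10 minutes
```

The two answers differ by a factor of five, and neither is verifiably right: the cap is an arbitrary judgment about when an open window stops meaning attention, which is exactly why “time spent” is so hard to standardize.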

In a few years there could be an entirely new way to describe how news organizations measure their online relevance. If that does turn out to be the case, the Media Ratings Council and the news industry are, once again, going to have more catching up to do. But that’s the nature of an evolving medium whose future is unpredictable. 

David Cohn is a student at Columbia’s Graduate School of Journalism.