In late December of 2023, when the news emerged that the New York Times had filed suit against OpenAI and Microsoft for copyright infringement, Bill Gross was at home in Pasadena, feeling somewhat indignant. Gross, who is sixty-six, is a granddaddy of the internet: in 1998, he founded GoTo.com, which introduced the idea of allowing advertisers to bid for placement in search engine results; an auto company, for instance, could bid to appear high up in the results for “best car.” GoTo provided an effective, lasting strategy for monetization, one that made advertising through online search engines, rather than news publishers, highly appealing. Gross later left search technology to seek solutions to climate change, “my true passion,” he told me. But now something big was happening, and he couldn’t look away. “This is the first firing shot of a serious organization that has enough money to fight back,” he remembers thinking. “But how are all the individual, small creators going to win?”
OpenAI had debuted ChatGPT to the public about a year earlier, and as its popularity surged, along with that of other generative AI products, Gross recognized that the problem for online publishers, including news outlets, went beyond the fact that their materials were being used to train large language models. The introduction of tools such as ChatGPT, which generate concise answers after unceremoniously scraping the internet for material, rather than producing a series of blue links to websites, meant that traditional search engines were bound to decline. “You can type in your search and get your answer directly, which is a superior experience to getting breadcrumbs,” he said. “Who wouldn’t want to get their answer directly?” That would, of course, divert precious Web traffic from the original publishers of ransacked content, and, as Gross knew as well as anyone, fewer clicks would mean less advertising revenue. “It’s infringing on people’s creativity and on their copyright,” he said. For the news industry, this could spell doom.
So Gross, a problem-solver by nature, began mulling an idea. What if there were a way for publishers to receive attribution, and ultimately compensation, every time a chatbot surfaced material from their sites? “I thought, ‘What can we do to make a fair business model in this new era of better answers?’” Gross said. Then a solution came to him: large language models, the source of the problem, could also be the answer. On January 4, eight days after the Times case got off the ground, he walked into his attorney’s office and filed a patent for the technology that would become his latest company: ProRata. The need, he felt, was urgent: “Creativity and democracy are at stake,” he said.
ProRata, Gross told me, analyzes the entirety of an AI-generated answer and weighs how much of it came from various sources. It does so by comparing each individual statement a chatbot has made against a dataset of source material, used with the permission of a network of publishers, on which he trained his model. ProRata then produces a breakdown of the percent of each answer that came from a given website or document, factoring in what source reported the information first: 20 percent might come from The Atlantic, 5 percent from the Financial Times, and so on. The technology mirrors anti-plagiarism software used by college professors. “And that whole thing can be done in a hundred milliseconds,” he said.
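To make the idea of a per-answer breakdown concrete, here is a toy sketch in Python. It is not ProRata's method, which the article does not describe in detail: the word-overlap scoring, the function names `tokens` and `attribute`, and the sample texts are all assumptions made for illustration, and the sketch ignores the question of which source reported something first.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase a text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def attribute(answer_sentences, sources):
    """Return {source_name: share} with shares summing to 1.0.

    Hypothetical scoring: each sentence hands out one unit of credit,
    divided among sources in proportion to shared words. Returns {}
    when no sentence overlaps with any source.
    """
    credit = Counter()
    for sentence in answer_sentences:
        words = tokens(sentence)
        scores = {name: len(words & tokens(body)) for name, body in sources.items()}
        total = sum(scores.values())
        if total == 0:
            continue  # nothing in the sources matches this sentence
        for name, score in scores.items():
            credit[name] += score / total
    grand_total = sum(credit.values())
    if not grand_total:
        return {}
    return {name: value / grand_total for name, value in credit.items()}


# Made-up sources and answer, purely for illustration.
sources = {
    "Publisher A": "solar panels convert sunlight into electricity",
    "Publisher B": "wind turbines convert wind into electricity",
}
answer = ["Solar panels convert sunlight.", "Turbines use wind."]
shares = attribute(answer, sources)
```

A production system would presumably compare meaning rather than exact words, but the output has the same shape as the breakdown Gross describes: a share of each answer assigned to each source.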
Key to ProRata, and true to Gross’s background, is monetization. The system is designed to run targeted ads, which are clearly differentiated from the answers themselves, within its interface. Half of the revenue generated by these ads goes back to ProRata, and the other half is distributed among the sources from which the answer drew. Currently, tools such as ChatGPT operate at a loss, earning money through subscription fees. But Gross believes that when they do start running ads (it’s only a matter of time), advertisers will be willing to pay a premium because AI-generated queries are more specific and detailed than old-fashioned search engine fodder, and that allows for more direct targeting.
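The arithmetic of that split can be sketched in a few lines of Python. The function name `split_revenue`, the one-dollar ad, and the "Other sources" bucket are hypothetical; the article says only that half goes to ProRata and half to the sources, so dividing the publishers' half in proportion to each source's share of the answer is an assumption.

```python
def split_revenue(ad_revenue, shares, platform_cut=0.5):
    """Split ad revenue between the platform and the cited sources.

    ad_revenue:   revenue earned on one answer, in dollars
    shares:       {source_name: fraction of the answer attributed to it}
    platform_cut: fraction kept by the platform (half, per the article)

    Assumption: the remaining pool is divided in proportion to shares.
    """
    platform = ad_revenue * platform_cut
    pool = ad_revenue - platform
    total = sum(shares.values())
    if total == 0:
        return platform, {}  # no attributed sources, so nothing is paid out
    payouts = {name: pool * share / total for name, share in shares.items()}
    return platform, payouts


# Illustrative shares echoing the article's 20 percent / 5 percent example.
example_shares = {"The Atlantic": 0.20, "Financial Times": 0.05, "Other sources": 0.75}
platform, payouts = split_revenue(1.00, example_shares)
```

Under these assumptions, a source credited with 20 percent of an answer would receive 20 percent of the publishers' half, or 10 percent of the ad revenue overall.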
Gross had no trouble persuading publishers to get on board. One of the first was Nicholas Thompson, the CEO of The Atlantic. Last year, The Atlantic signed a licensing deal with OpenAI for use of its archives, because that, Thompson told me, was better than nothing. Of the five prominent large language model companies that have used The Atlantic’s data, he pointed out, only OpenAI had agreed to compensate it in any way. But when Gross approached him with his idea for an alternative, Thompson was intrigued. “My second-favorite solution is to have the LLM providers compensate the media industry for what they did and for the value we’ve provided,” as OpenAI is doing, Thompson said. “But my favorite solution to this problem is to have something like ProRata technology and architecture underlying all of LLM solutions,” he added. “They’re solving this at the root level.”
He quickly agreed to partner with ProRata, sharing The Atlantic’s archive and volunteering to serve on the company’s board. Today, The Atlantic is one of more than five hundred publishers (Fortune magazine and the Financial Times among them) that have agreed to allow the company to train its model on their archives.
That’s given ProRata a busy start, though Gross doesn’t expect to transform the system overnight. He remembers that when he first introduced his idea for paid search, Yahoo immediately came out to say that the company would never adopt it. Two years later, Yahoo struck a hundred-million-dollar deal with him. Two years after that, Yahoo acquired GoTo. Which is to say: in his experience, even the kind of change that seems most unlikely can happen with time. “I’m in this for a decade-long effort to shift people’s views,” Gross said. “I think that long term, the right side of history will be protecting creative rights.”
Depending on the outcome of the Times suit, companies may be compelled by law to adopt a system like the one Gross has developed. In the meantime, ProRata is operating a chatbot, called Gist Search, that compensates publishers (including, Gross noted, Wikipedia) when their content is surfaced. But for ProRata to really make an impact, it would need to be integrated by the dominant large language models. Gross can see a world where publishers increasingly band together to block the Web crawlers that LLMs rely on to harvest data. If they did this, he said, AI companies wouldn’t be left with enough material to work with, and that might lead, say, OpenAI to ProRata voluntarily. Whether that’s possible, though, is uncertain, since many media companies have already struck deals like The Atlantic’s, and there is evidence that Web scrapers don’t always respect the line of code meant to block them. “There’s no question it’s a battle,” Gross said.
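The “line of code” is presumably a rule in a site's robots.txt file, though the article does not name the mechanism. Assuming that is what is meant, the short Python sketch below shows why such a block depends on the crawler's cooperation. The rules and the example.com URL are made up for illustration; GPTBot is the user-agent name OpenAI publishes for its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a publisher that blocks one AI crawler
# and allows everyone else.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/2024/some-article"

# A well-behaved crawler runs this check itself before fetching a page.
gptbot_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)

print("GPTBot allowed:", gptbot_allowed)
print("Other crawler allowed:", other_allowed)
```

Nothing on the publisher's server enforces the rule. The check happens on the crawler's side, so a scraper that skips it can still request the page.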
I asked him why, after pivoting to saving the Earth, he was willing to come back to digital publishing. “Creators need to have a reason for doing their creative work,” Gross said. “If that goes away, if there’s no incentive, then everything will all turn to AI slop.” He paused for a moment. “We’ll all be AI slop.”