In recent months, surveys have confirmed that, for a rapidly expanding user base around the world, artificial intelligence (AI) summaries of web content, whether in Google’s AI Overviews or on Perplexity, have emerged as AI’s primary use case: its killer app.
Adoption of these tools has surpassed 50 percent of users in the United States and 60 percent in the United Kingdom, and trust in AI overviews ranges from 49 percent in the United States to 60 percent in Japan. OpenAI reports that roughly half of the queries people run on ChatGPT now involve search.
But the revolution in search that AI summaries have brought about is making a substantial dent in traffic to sites on the web. Although the stats are contested and the precise picture is unclear, the general trend is undeniable. With AI Overviews becoming a fixture on Google, and a new generation of browsers with built-in chatbots rolling out (Comet, ChatGPT Atlas), more of our online information gathering is beginning and ending with AI. And web publishers are up in arms about it.
Courtroom Battles Over AI Search
One front of the battle is litigation in the United States and the European Union by, among others, the Independent Publishers Alliance, educational content provider Chegg and Penske Media, owner of Rolling Stone, Billboard and Variety. They allege that Google is compelling them to provide content for its AI Overviews in exchange for search visibility, abusing its dominant position as a search engine. Other actions against Google and Perplexity by Encyclopaedia Britannica and the New York Post allege copyright infringement for the use of “full or partial verbatim reproductions” of the plaintiffs’ content in summaries.
The lawsuits are likely to fail in their aim of putting the genie of AI search back in the bottle because they rest on a mistaken premise. Summaries are not inherently a form of infringement, and sites that provide information sourced in an AI Overview are not entitled to traffic for having produced it. Summaries that do not substantially copy original content will likely qualify as “fair use” under most copyright regimes because their primary purpose is research, reporting or commentary.
The Battle at the Internet Engineering Task Force
With the lawsuits likely failing, publishers are waging a battle on another front, one that has garnered less attention in mainstream media. Over the past year, content creators have been attempting to establish a new set of standards and protocols to be incorporated in the underlying code of websites. The theatre of this battle is the Internet Engineering Task Force (IETF), an international body that sets standards voluntarily adopted by sites and platforms large and small. The aim is to agree on a new protocol similar in nature to robots.txt, the long-standing convention that lets sites tell crawlers which pages not to crawl and, through related directives, keep excerpts out of Google search results. The new protocol would signal a preference to bots crawling the web not to scrape a site for AI model training or use in AI search overviews.
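For context, the existing mechanism is simple and blunt. robots.txt is a plain-text file at a site’s root that addresses crawlers by their user-agent tokens, and several AI operators already publish tokens it can name, such as OpenAI’s GPTBot and Google’s Google-Extended. A minimal file refusing those crawlers while leaving ordinary search crawling untouched looks like this:

    # Refuse OpenAI's training crawler
    User-agent: GPTBot
    Disallow: /

    # Refuse use of this site's content for Google's Gemini models
    User-agent: Google-Extended
    Disallow: /

    # All other crawlers, including ordinary search bots, remain free
    User-agent: *
    Disallow:

The limitation is evident: a site can only refuse named bots wholesale. It cannot say “index me for search, but keep me out of AI summaries,” which is precisely the distinction publishers now want to draw.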
But an extensive recent study by researchers at Duke University has shown that, over the years, Google and other platforms have only partially complied with the robots.txt protocol. It also found that “certain categories of bots, including AI search crawlers, rarely check robots.txt at all,” raising doubt about whether any new protocols will be that effective.
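The compliance gap is worth making concrete, because honouring the protocol costs a crawler almost nothing. A well-behaved bot fetches robots.txt once and consults it before each request, a few lines of code in any language. This Python sketch uses only the standard library (the site and article URLs are placeholders):

    import urllib.robotparser

    # A compliant crawler fetches the site's robots.txt once...
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # ...and checks its own user-agent token before every page request.
    # GPTBot is OpenAI's published crawler token.
    url = "https://example.com/some-article"
    if rp.can_fetch("GPTBot", url):
        print("allowed to fetch", url)
    else:
        print("site has opted out; skipping", url)

The Duke finding is that many AI crawlers simply never perform this check.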
Despite this, publishers are making their case at the IETF, and companies like Cloudflare and Mozilla — which provide backend services to content producers and thus have a stake in web traffic — have taken their side. They emphasize how much harder it has become to generate traffic to sites since AI Overviews were introduced. They support the adoption of more nuanced protocols that would signal to Google and other AI providers a preference to be scanned for inclusion in search results but not summaries. Google, Microsoft and others argue that it is impossible to separate these purposes at this point, since, for years, AI has played an integral role in how search indexing and ranking work.
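What would such a nuanced protocol look like in practice? The syntax is still being negotiated, but the general shape under discussion attaches a vocabulary of usage preferences to the familiar robots.txt format or to an HTTP header. The following is a hypothetical sketch, loosely modelled on those discussions; neither the category names nor the syntax shown here is final:

    # Hypothetical per-use preference rules (illustrative only)
    User-agent: *
    Content-Usage: search=y, ai-input=n, train-ai=n
    Disallow:

Read as: crawl and index this site for search results (search=y), but do not feed its pages into AI answers or summaries (ai-input=n), and do not train models on them (train-ai=n). Google’s and Microsoft’s objection is that the first two categories cannot be cleanly separated in systems where AI already drives indexing and ranking.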
An engineer at Mozilla captures the publishers’ position: “If you’re producing a summary that is not intended to direct people to the original source of the content, then that’s off limits. You’re not providing a search application, you’re providing something else.”
Why This Too Is a Lost Battle
Setting aside the question of what practical effect these protocols might have if the IETF were to adopt them, the preference that publishers seek to express through them is premised on a mistaken sense of entitlement to control mere information. There is no property right in facts, and regardless of how much traffic AI summaries divert from original sites, nothing entitles content creators to prevent AI platforms from summarizing facts to be found on the open web.
The confusion on the part of publishers lies in the notion that because they enjoyed a reciprocal relationship with Google before, they should continue to do so as search evolves — or they’ll walk away and take their content with them. Previously, in exchange for giving Google snippets or links, they received steady traffic. Today, they’re giving up valuable data not only without getting anything in return but also while losing traffic.
Publishers now have little leverage, and they’re trying to create it by forging a new power to opt out of summaries through a new protocol. But just because they benefited from cooperating with Google and other large platforms in the past doesn’t mean they’re entitled to the same arrangement today.
Even if Google’s AI Overviews or ChatGPT’s summaries rely on information that publishers have created and divert traffic in doing so, this doesn’t give publishers a moral or commercial right to limit what Google or OpenAI can say. A person who offers a summary that doesn’t substantially reproduce the original text — one that merely paraphrases information to be found at a given source online — is not doing something others can prevent. At the end of the day, that is what summaries are: a form of platform speech. Research, commentary in some cases, but not theft.
The initiative to formulate new preference protocols at the IETF is fundamentally misguided. It’s an effort to impede an important innovation that offers a massive net benefit to countless users around the world, without doing anything inherently wrongful. AI summaries are forcing publishers to adapt, as the shift to digital media did 30 years ago — a shift that allowed millions of creators in new media to thrive.
There is no returning to a world without AI summaries in some form. If you put content online, others are free to paraphrase or quote it. Overviews will be permissible to at least some extent. Change for publishers may be costly and unpleasant, but some are beginning to adapt — by pivoting away from search engine optimization and referral traffic toward building direct relationships with their audiences. And though only time will tell, some of them may be better off as a result.