Reddit filed a lawsuit on Wednesday against artificial intelligence startup Perplexity, escalating the battle over who controls, and profits from, user-generated content in the AI era. Perplexity, a San Francisco-based company valued at approximately $20 billion, stands accused of orchestrating an elaborate scheme to bypass Reddit's digital protections and scrape millions of user comments without permission.

The Allegations

The complaint, filed in Manhattan federal court, doesn't just target Perplexity. Reddit also named three data-scraping intermediaries: Lithuanian firm Oxylabs UAB, Texas-based startup SerpApi, and AWMProxy, which Reddit characterizes as a "former Russian botnet." According to the lawsuit, these companies created what Reddit's chief legal officer Ben Lee calls an "industrial-scale 'data laundering' economy," in which scrapers bypass technological protections, steal data, and sell it to AI companies desperate for training material.

The most damning evidence? Reddit claims it sent Perplexity a cease-and-desist letter in May 2024, demanding the company stop scraping its data. Perplexity reportedly responded by promising to respect Reddit's robots.txt file and insisting it wasn't using Reddit content to train AI models. But the lawsuit alleges that citations to Reddit in Perplexity's search results increased "forty-fold" after that warning, a contradiction that Reddit says proves the AI company's deception.

"Rather than respect Reddit and its users' rights, what Perplexity has done in response is simply come up with increasingly devious schemes to circumvent Reddit's security systems and policies," the lawsuit states. Reddit alleges Perplexity used third-party scrapers to obtain content indirectly through Google's search engine results, effectively laundering the data through multiple intermediaries.

Why This Matters

Reddit's content represents one of the internet's most valuable repositories of authentic human conversation, which is exactly what AI companies need to train sophisticated language models. While tech giants like Google and OpenAI have signed licensing agreements with Reddit to access this data legally, Perplexity allegedly chose a different path.

This lawsuit is Reddit's second major legal action against an AI company this year, following a similar complaint against Anthropic in June. But this case breaks new ground by targeting the ecosystem of data-scraping services that AI companies allegedly rely on to circumvent protections.

"AI companies are locked in an arms race for quality human content, and that pressure has fueled an industrial-scale 'data laundering' economy," Lee said. The lawsuit seeks unspecified monetary damages and a court order preventing Perplexity from further exploiting Reddit's data.

The Bigger Picture

As artificial intelligence transforms how we search for and consume information online, questions about data ownership and fair compensation have become increasingly urgent. Reddit's aggressive legal strategy signals that content platforms won't stand by while AI companies build billion-dollar businesses on the backs of user-generated content.

For American businesses and consumers alike, this case could establish crucial precedents about digital property rights in the AI age, determining whether platforms can protect their communities' contributions or whether anything publicly visible online is fair game for AI training.