Perplexity Accused of Stealth AI Crawling, Cloudflare Warns of “Undeclared Bots” Circumventing Website Blocks

Perplexity AI, the rising artificial intelligence search startup often touted as a challenger to Google, is once again under fire for allegedly harvesting content from websites without consent — this time drawing sharp criticism from Cloudflare, one of the largest web infrastructure providers in the world.

According to a report published by Cloudflare, Perplexity’s web crawlers have allegedly continued to access and scrape content from websites that have explicitly opted out of such activity via tools like robots.txt files or firewall rules. The company claims Perplexity’s bots “intentionally obfuscate their identity” and engage in stealth tactics to bypass restrictions, including by masking themselves as popular web browsers like Google Chrome on macOS.
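To illustrate why a robots.txt opt-out depends on a crawler identifying itself honestly, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, the example URL, and the user-agent strings are assumptions made for illustration; they are not taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to opt out of one crawler.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative browser-style user-agent string (an assumption, not a captured value).
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots.txt permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that declares itself is covered by the Disallow rule.
declared = is_allowed("PerplexityBot", "https://example.com/article")

# The same request sent under a browser-style user-agent falls through
# to the wildcard rule, so the opt-out no longer applies to it.
disguised = is_allowed(BROWSER_UA, "https://example.com/article")
```

In this sketch `declared` is `False` and `disguised` is `True`: the file only expresses the publisher's wishes, and nothing in it can stop a client that presents a different identity.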

“When we blocked access to our test domains via common methods, Perplexity’s crawlers responded by changing their user-agent and IP address to continue scraping,” Cloudflare said in the report.

Cloudflare further alleges that the AI firm is rotating IP addresses and switching between Autonomous System Numbers (ASNs), the unique identifiers assigned to networks, to circumvent blocks and avoid detection. This stealth activity, according to Cloudflare, spanned tens of thousands of websites and millions of requests daily.
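A rough sketch of why network-based blocking struggles against this kind of rotation, using Python's standard `ipaddress` module. The ASN numbers, prefixes, and helper names below are hypothetical (they use ranges reserved for documentation), and this is not a description of Cloudflare's actual detection method.

```python
import ipaddress

# Hypothetical ASN-to-prefix table built from documentation-only ranges.
ASN_PREFIXES = {
    64496: [ipaddress.ip_network("192.0.2.0/24")],
    64497: [ipaddress.ip_network("198.51.100.0/24")],
}

# The site operator has blocked only the network the crawler was first seen on.
BLOCKED_ASNS = {64496}


def asn_for(ip: str):
    """Look up the (hypothetical) ASN that announces `ip`, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for asn, prefixes in ASN_PREFIXES.items():
        if any(addr in net for net in prefixes):
            return asn
    return None


def is_blocked(ip: str) -> bool:
    """Return True if `ip` belongs to a blocked ASN."""
    return asn_for(ip) in BLOCKED_ASNS


first_request = is_blocked("192.0.2.10")     # from the blocked network
after_rotation = is_blocked("198.51.100.7")  # same client, different network
```

Here `first_request` is `True` and `after_rotation` is `False`: a rule keyed to one network stops matching as soon as traffic arrives from another one.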

This is not the first time Perplexity has been accused of bypassing digital boundaries. In mid-2024, the startup was accused of indexing content from subscription-based and paywalled media outlets without permission. At the time, Perplexity CEO Aravind Srinivas deflected the criticism, blaming the issue on third-party scrapers operating on the company’s behalf. Cloudflare’s claims have now intensified scrutiny of the company’s data-gathering practices.

In response to the latest report, Perplexity spokesperson Jesse Dwyer dismissed Cloudflare’s findings as a “publicity stunt.” He told The Verge that the blog post contained “a lot of misunderstandings.” Still, the report has prompted Cloudflare to delist Perplexity as a “verified bot” and roll out additional protections that block its scrapers by default.

Cloudflare CEO Matthew Prince, a vocal critic of unregulated AI content harvesting, recently warned of what he described as an “existential threat” to content creators and publishers from AI companies. In June, Cloudflare launched new controls allowing websites to demand payment from AI firms for data access, effectively tightening the screws on those trying to extract information without consent.

“AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators while still helping AI companies innovate.

“This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone,” the CEO stated.

The escalating clash between AI startups and infrastructure firms like Cloudflare comes at a time when legal questions surrounding data scraping, copyright, and consent remain unresolved. The core tension revolves around the very fuel of modern AI systems: data. With large language models hungry for ever-expanding datasets to improve performance, some companies have been accused of cutting corners in how they obtain that information.

Perplexity, founded by Srinivas and backed by Jeff Bezos and Nvidia, has positioned itself as a real-time, citation-focused search engine designed to counterbalance the dominance of Google and Bing. But its reliance on web-sourced content — including journalism — has made it a target for media companies, which have grown increasingly wary of their work being used to train AI tools without compensation.

With more publishers adding AI-blocking rules to their sites and companies like Cloudflare rolling out enforcement tools, Perplexity and its peers face growing pressure to justify how they obtain the data powering their products — and whether their methods can survive legal and reputational scrutiny.
