Perplexity, an AI search startup, has allegedly been circumventing restrictions designed to keep its AI web crawlers out of certain websites, according to Cloudflare. The report claims that when Perplexity's crawlers are blocked, the company disguises their identity in an effort to sidestep the website's preferences, 24brussels reports.
The allegation heightens concerns about Perplexity’s activities, as the company has previously faced accusations of bypassing paywalls and disregarding robots.txt files. At the time, CEO Aravind Srinivas said responsibility lay with third-party crawlers used by the platform.
Cloudflare, a leading internet infrastructure provider, stated that customers had complained that Perplexity’s bots could still access their sites even after they had set restrictions in robots.txt and created Web Application Firewall (WAF) rules to limit these AI bots’ access.
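To illustrate the kind of robots.txt restriction described above, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the example.com URL, and the is_allowed helper are assumptions made for illustration; they are not taken from Cloudflare's report or from any affected site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish to opt out of
# the two crawler names mentioned in the report (assumed content).
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(agent, url))
```

Note that robots.txt is advisory: the rules only have an effect if the crawler reads the file and honestly identifies itself by the name the rules target.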
To investigate further, Cloudflare set up new domains with similar restrictions against Perplexity’s AI scrapers. It found that the startup initially attempts to access sites while identifying itself with names such as “PerplexityBot” or “Perplexity-User.”
However, should a website restrict AI scraping, Cloudflare alleges that Perplexity modifies its user agent, the string a client sends to identify its browser and operating system, to mimic Google Chrome on macOS. The company also reportedly uses “rotating” IP addresses that do not appear in the list of IPs it has disclosed for its bots.
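The following sketch shows why a block keyed only on the declared crawler name or on a published IP list would miss the behavior Cloudflare alleges. Everything in it is an assumption for illustration: the user-agent strings are generic examples, the network range is a reserved documentation block rather than any real Perplexity address, and the two helper functions are hypothetical, not Cloudflare's detection logic.

```python
import ipaddress

# Crawler names a site might block (taken from the names cited in the report).
BLOCKED_UA_TOKENS = ("perplexitybot", "perplexity-user")

# Placeholder range from the RFC 5737 documentation block, standing in for
# a bot operator's published IP list. NOT real Perplexity addresses.
PUBLISHED_BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

# Illustrative user-agent strings (assumed, not quoted from the report).
DECLARED_UA = "PerplexityBot/1.0"
CHROME_MAC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def blocked_by_user_agent(user_agent: str) -> bool:
    """Block a request only if its user agent contains a listed crawler name."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)


def in_published_ranges(ip: str) -> bool:
    """Check whether ip falls inside the operator's published bot ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PUBLISHED_BOT_NETWORKS)


if __name__ == "__main__":
    # A declared crawler from a published address is caught by both checks.
    print(blocked_by_user_agent(DECLARED_UA), in_published_ranges("192.0.2.10"))
    # A browser-like user agent from an unlisted address passes both checks.
    print(blocked_by_user_agent(CHROME_MAC_UA), in_published_ranges("198.51.100.7"))
```

Because both signals are supplied or chosen by the client, a request that presents a browser-like user agent from an unlisted address looks like ordinary traffic to these checks, which is why such rules alone cannot enforce a site's preferences.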
Moreover, Cloudflare asserts that Perplexity switches between autonomous system numbers (ASNs), the unique identifiers assigned to groups of IP networks managed by a single operator, to get around blocks. Cloudflare says it observed this activity across tens of thousands of domains and millions of requests daily.
In response to Cloudflare’s findings, Perplexity spokesperson Jesse Dwyer labeled the report a “publicity stunt,” insisting there are significant misconceptions in the blog post. Cloudflare, for its part, has removed Perplexity’s verification as a bot and implemented measures to block the startup’s alleged “stealth crawling.”
Cloudflare CEO Matthew Prince has been vocal about AI posing an “existential threat” to publishers. Last month, the firm began allowing websites to request payments from AI companies for content crawling, while also initiating default blocks on AI crawlers.