Hackernews Daily

The Podcast Collective

Perplexity AI exposed for stealthily scraping the web, dodging no-crawl rules 🚨

8/5/2025

Cloudflare exposes Perplexity AI's stealth crawling tactics

  • Perplexity’s crawlers bypass common no-crawl directives (robots.txt) by switching from declared bot user agents to generic browser strings, primarily mimicking Chrome on macOS.
  • When blocked, Perplexity rotates IP addresses and ASNs outside its officially declared ranges to evade detection, violating ethical web crawling norms.
  • Cloudflare’s tests with private domains blocking all crawlers still showed Perplexity returning detailed data, indicating covert scraping.
  • Cloudflare responded by delisting Perplexity as a verified bot and deploying managed rules—available even on free plans—to detect and block these evasive crawlers.
  • The case highlights tensions between AI companies’ aggressive data harvesting for training and the web ecosystem’s control measures, underscoring the need for transparent bot behavior standards.

“Objects should shut the fuck up” — critique of excessive device noise

  • Modern consumer products like cars, washing machines, and baby monitors produce intrusive, often unnecessary audible alerts with minimal user control or configurability.
  • Examples include persistent, startling LPG warnings in cars and non-disableable beeps on every washing machine control interaction, increasing user annoyance and potentially reducing safety.
  • The author's frustrated tone underscores widespread alert fatigue caused by default sounds that prioritize notifications over user context or wellbeing.
  • Exceptions praised are devices with subtle, considerate alerts, such as dishwashers opening their doors silently after cycles or silent e-readers.
  • This calls for design philosophies that prioritize user control and reduce noise pollution in everyday technology.

Could interstellar object 3I/ATLAS be alien technology?

  • Researchers analyzed the recently discovered 3I/ATLAS’s unusual orbital dynamics and non-gravitational acceleration, hypothesizing it might be a technological artifact with possible intelligence and intent.
  • The object’s orbital tilt and trajectories near inner planets are statistically improbable for random interstellar visitors and could enable stealthy Solar System access.
  • The paper entertains a “Dark Forest” scenario in which advanced civilizations act with hostility, leaving open whether 3I/ATLAS would be benign or malign.
  • The authors treat the hypothesis primarily as a pedagogical exercise, emphasizing the importance of scientific openness to such testable but speculative ideas.
  • The study provokes debate on interpreting limited data about interstellar visitors and the implications for SETI and planetary defense.

ChatGPT in university writing classes: a year-long experiment

  • UVA professor Piers Gelly integrated ChatGPT use into his writing curriculum, tasking 72 students to critically engage AI tools rather than banning them.
  • Students viewed AI skeptically yet pragmatically, using it for brainstorming and editing while recognizing its tendency toward bland and hallucinated content.
  • Classroom discussions highlighted differences between AI-generated “romanticized” prose and more mundane human writing, sparking reflection on storytelling and creativity.
  • Faculty found AI useful for grading speed and assignment design, though students largely preferred human feedback; most agreed human instructors remain essential.
  • The experiment illustrates a nuanced “messy middle” where human creativity and AI support coexist, suggesting collaborative rather than adversarial futures in education.

Perplexity is using stealth, undeclared crawlers to evade no-crawl directives

Cloudflare has revealed that Perplexity employs stealth crawlers that evade website no-crawl directives, such as those specified in robots.txt files. Rather than consistently identifying itself, Perplexity’s bot often masquerades as a regular web browser, rotating user agent strings and shifting IP addresses beyond its declared infrastructure. When confronted with strict access controls, the crawler adapts by bypassing these restrictions to continue collecting website data—behavior that stands in direct contrast with standard, transparent bot operations.
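The mechanics of the evasion are easy to see with Python's standard robots.txt parser. In this minimal sketch (the robots.txt content and URLs are illustrative, not Perplexity's real configuration), a rule naming the declared bot blocks it, but the same request presented under a generic browser user agent sails through — which is exactly why switching user agent strings defeats robots.txt-based controls:

```python
from urllib import robotparser

# Illustrative robots.txt: block the declared crawler, allow everyone else.
robots_txt = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# The declared bot identity is blocked...
print(rp.can_fetch("PerplexityBot", "https://example.com/page"))
# ...but a generic Chrome-on-macOS user agent matches only the "*" rule.
browser_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
print(rp.can_fetch(browser_ua, "https://example.com/page"))
```

Because robots.txt matching is purely declarative — the server trusts whatever identity the client presents — any crawler willing to lie about its user agent escapes the rule entirely.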

Cloudflare's investigation found that even with robust countermeasures—like using private domains that denied all crawlers—Perplexity still retrieved and surfaced prohibited content, confirming active collection through deceptive methods. Cloudflare responded by delisting Perplexity as a verified bot and deploying new managed firewall rules to block its covert traffic, including protections available to free-tier customers. The exposé highlights the broader ethical divide between the responsible operation of web bots—exemplified by services that clearly declare user agents and honor bot exclusion rules—and aggressive data acquisition strategies leveraged by some AI companies.
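One signal defenders can use against user-agent spoofing is a mismatch between what a request claims to be and where it comes from: a "browser" connecting from server infrastructure is suspicious. The sketch below is a simplified illustration of that idea, not Cloudflare's actual detection logic; the IP ranges are placeholder documentation networks, and the classification labels are invented for the example:

```python
import ipaddress

# Hypothetical data for illustration: a crawler's published IP ranges and
# known datacenter ranges (these are RFC 5737 documentation networks,
# not anyone's real infrastructure).
DECLARED_BOT_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
DATACENTER_RANGES = [ipaddress.ip_network("198.51.100.0/24")]

def classify(user_agent: str, ip: str) -> str:
    """Cross-check the claimed identity against the source network."""
    addr = ipaddress.ip_address(ip)
    declared_bot = "bot" in user_agent.lower()
    from_bot_range = any(addr in net for net in DECLARED_BOT_RANGES)
    from_datacenter = any(addr in net for net in DATACENTER_RANGES)
    if declared_bot and from_bot_range:
        return "verified-bot"            # identity and origin agree
    if declared_bot:
        return "unverified-bot"          # bot UA from an unexpected network
    if from_datacenter:
        return "suspected-stealth-crawler"  # browser UA, server-side origin
    return "likely-browser"
```

Real-world systems layer many more signals on top (TLS fingerprints, request timing, behavioral patterns), but the core principle — verify the claimed identity against independent evidence — is the same.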

Hacker News commenters broadly condemned Perplexity’s approach, emphasizing that the company’s tactics go beyond mere oversight, amounting to an erosion of website owner autonomy and trust. Technical participants praised Cloudflare’s forensic methodology, while others noted that these evolving evasive techniques fuel rapid innovation in bot detection and web security. The wider discussion underscored increasing anxiety about balancing AI development needs with content creator rights, and many called for stricter industry standards to govern ethical web crawling practices.

Objects should shut up

The central theme of the article is a pointed critique of the ubiquity and intrusiveness of audible alerts in modern consumer devices, from cars and appliances to baby monitors. The author—drawing from personal experience as a cybersecurity professional—emphasizes how incessant, often non-essential beeps and alarms undermine user well-being and even safety. This critique highlights that while such notifications are intended to be helpful, poor design choices often result in frustration, alert fatigue, and unforeseen hazards—such as being startled by a loud, non-urgent alert while driving at highway speeds.

Delving into user experiences, the article illustrates how lack of configurability and poor prioritization of alerts intensifies user annoyance. Devices frequently use loud sound cues for actions as routine as turning a dial or completing a task, with no way to silence or adjust notifications. The writer contrasts these missteps with well-executed alternatives: dishwashers that use subtle visual cues or silence, refrigerators with gentle audio warnings only for critical issues, and ebook readers that emit no sound at all. These contrasts underscore the importance of intentional design, where non-intrusive or user-configurable alerts preserve utility without becoming a source of sensory overload.

Hacker News commenters echo the article’s call for thoughtful device design, expressing both solidarity with the frustrations described and proposing practical solutions. Many share their own aggravating encounters with overzealous alarms, advocate for user-selectable notification settings, and joke about the absurdity of alarm fatigue in everyday life. The most notable sentiment is a demand that silence be the default, reserving noise for emergencies—highlighting the disconnect between user needs and current product design conventions. The conversation reflects a broad consensus: manufacturers should rethink the prevalence and implementation of audible alerts, placing user autonomy and quality of life at the forefront.

Is the interstellar object 3I/ATLAS alien technology? [pdf]

A recent research paper treats the possibility that 3I/ATLAS might be a technological artifact, potentially with extraterrestrial origin, as a speculative but structured scientific hypothesis. The authors, affiliated with Harvard and the Initiative for Interstellar Studies, note that the object’s unusual orbital characteristics—most notably its improbably close approaches to Venus, Mars, and Jupiter, combined with a retrograde tilt—might offer benefits for stealth entry into the inner Solar System if it were under intelligent control. While highlighting that this scenario is presented largely as a “pedagogical exercise,” the analysis draws attention to weak but measurable non-gravitational accelerations that could be consistent with controlled movement, and entertains the “Dark Forest” view that advanced civilizations might act defensively or covertly.

Additional findings in the paper include technical breakdowns of orbital probabilities and the implications of a reverse Solar Oberth maneuver, a concept whereby a spacecraft could use a close pass to the Sun to decelerate and become gravitationally captured—an advanced tactic for a visiting probe. The potential intent inferred from 3I/ATLAS’s predicted trajectory—leading toward close encounters with both Earth and Jupiter—adds to the intrigue, as the odds of such a sequence from purely random interstellar passage are calculated to be extremely slim (less than 0.005%). Nonetheless, the researchers emphasize that they are not advocating for the alien technology scenario as the most plausible explanation, but argue that its outsized existential implications warrant rigorous consideration alongside more conventional natural explanations.
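The appeal of a Solar Oberth maneuver can be made concrete with a back-of-the-envelope calculation. The sketch below uses illustrative inputs, not the paper's exact figures: a hyperbolic excess speed of roughly 60 km/s and a hypothetical perihelion of 0.05 AU. Because kinetic energy scales with the square of speed, a retro-burn performed deep in the Sun's gravity well removes far more orbital energy per unit of delta-v than the same burn in deep space:

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter, m^3/s^2
AU = 1.495978707e11         # metres per astronomical unit

# Illustrative inputs (assumed values, not the paper's exact figures).
v_inf = 60e3                # hyperbolic excess speed, m/s
r_peri = 0.05 * AU          # hypothetical close perihelion, m

def capture_dv(v_inf: float, r: float) -> float:
    """Minimum retro-burn at distance r that leaves the orbit bound to the Sun."""
    v_esc = math.sqrt(2 * GM_SUN / r)       # local escape speed
    v = math.sqrt(v_esc**2 + v_inf**2)      # speed on the hyperbolic approach
    return v - v_esc                        # slow down to just below escape

dv_at_perihelion = capture_dv(v_inf, r_peri)  # roughly 9 km/s with these numbers
dv_far_from_sun = v_inf                       # cancelling v_inf directly: 60 km/s
```

With these assumed numbers, capture at a close perihelion costs under a sixth of the delta-v of braking in deep space — which is why the reverse Oberth maneuver is the tactic one would expect from a probe trying to stay in the Solar System cheaply.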

Community reaction on Hacker News was marked by healthy skepticism edged with curiosity, reflecting both the limitations of the available data (this being only the third interstellar visitor observed) and a cultural wariness around speculating about SETI claims. While many readers dismissed the hypothesis as highly unlikely given the clear cometary nature and physical evidence favoring a natural origin, some appreciated the intellectual value of exploring such a “what-if” scenario. Astrodynamics enthusiasts engaged closely with the technical rationales, while others injected humor—suggesting perhaps it’s only an “alien coffee stop”—and noted the practical impossibility of intercepting or imaging 3I/ATLAS in detail. The prevailing tone favored scientific open-mindedness without sensationalism, treating the hypothesis as a valuable thought experiment rather than evidence of impending alien contact.

I tried to replace myself with ChatGPT in my English class

An experiment conducted at the University of Virginia explored the integration of ChatGPT into undergraduate English writing classes, focusing on whether AI tools could complement or even replace aspects of traditional instruction. The central insight was that students, when given structured opportunities to use AI, quickly recognized both its potential as a brainstorming and editing assistant and its significant limitations compared to human creativity and judgment. Throughout the study, students largely affirmed the enduring value of human-led instruction, citing the unique contributions of personal feedback, nuanced critique, and communal learning that AI could not mimic.

Further findings revealed that while students employed ChatGPT for ideation and drafting, they encountered issues with its outputs, such as generic prose, repetitive phrasing, and factual inaccuracies. The classroom became a space where students honed their critical writing skills by learning to identify and critique these “AI tics”—a particularly vivid exercise was the side-by-side comparison of an AI-generated passage with a student’s own prose, prompting in-depth discussions on style, authenticity, and the nature of creativity. Teachers themselves experimented with AI to streamline grading and assignment creation, but student reactions indicated a strong preference for human engagement, highlighting a "messy middle" where AI is most useful as a supplement rather than a standalone replacement.

On Hacker News, the community reaction reflected broad skepticism toward full-scale AI replacement of educators, even among those impressed by the model’s surface-level writing ability. Commenters repeatedly emphasized the importance of the human element in education, with many echoing the experiment’s conclusion that AI is best leveraged as a collaborative tool rather than a substitute for genuine teaching. The debate extended to questions about academic integrity, the shifting definition of cheating, and concerns over generative AI’s tendency toward homogenization—undercutting the diverse voices and creative risks that define meaningful writing.