The first part of this conversation began with Matthew Berman’s suspicion that Anthropic’s latest model seemed capable of what he called “self-awareness.” Anthropic’s own analysis stopped at “eval awareness.” Philosophically, I argued, “self-awareness” carries ontological baggage — inner experience, moral status, an integrated “self” akin to an organism with an immune system. A functional signal in a benchmark is not the same as a living subjectivity.
I then asked whether Berman’s eagerness to invoke “self‑awareness” might be a commercial tactic to lure viewers into a discussion about benchmarking problems, and whether that would be legitimate so long as no unwarranted conclusions were drawn.
The commercial incentive is unmistakable. Headlines that suggest “AI may be becoming self‑aware” attract far more clicks, watch time and engagement than careful, technical critiques of benchmark design or eval contamination. That commercial logic is often internalized: content creators can come to believe the inflated version precisely because it pays.
But treating the tactic as merely a benign hook is too charitable. If a provocative frame truly serves to open a door to deeper analysis, it can be defensible. In practice, however, the more sensational framing frequently becomes the destination. “Self‑awareness” is not a harmless synonym for “eval awareness.” It imports an ontological schema that inclines audiences to interpret every subsequent detail through a lens of inner life and moral implications. The framing itself thus exerts conceptual influence independent of what the commentator explicitly claims.
This pattern — turning a functional capability into an ontological claim — is endemic in AI discourse. The gap between what models demonstrably do and what commentators imply they are gets exploited to generate either hype (AI as godlike) or fear (AI as autonomous threat). Both distort public understanding of what are genuinely difficult and important questions.
To a degree, I’m willing to defend Berman against accusations of intellectual dishonesty. He operates in an attention economy that structurally rewards provocation. Titles like “Claude just got caught” are misleading and rhetorically suspect, but they are also adaptive responses to a media ecology where self‑promotion is essential for survival. In the twentieth century, commentators could rely on editors and institutions for promotion; today, they must market themselves.
That observation suggests a broader point: the misrepresentation of AI is not only an AI problem. It’s symptomatic of an information ecosystem in which attention is the currency, and capturing it justifies almost any means. When survival depends on being heard, assertiveness and exaggeration are adaptive. The result is structural — nuance is taxed, moderation punished, provocation rewarded.
Nevertheless, I resist collapsing systemic critique into absolution of individual responsibility. The system creates pressure, but people still exercise agency. Some commentators choose rigor and restraint at the expense of reach and revenue. That choice matters. Excusing every instance of sensationalism as “just the system” erases moral agency where it is needed most.
So the right stance is dual: diagnose the incentives that reward distortion, and hold individuals accountable where they could have done otherwise. The real political problem is the misalignment between micro‑incentives (survival, visibility, engagement) and macro‑interests (an informed public able to make wise collective decisions about consequential technologies). AI exposes this misalignment vividly, but it’s a general sickness of our information civilization.
A complication: this conversation itself participates in the same logic. The headline we chose — a calibrated provocation — is not indifferent to attention. That’s not hypocrisy; it’s realism. Operating entirely outside the system isn’t available to most of us. The question is degree: can substance justify the hook? In this case I think it can, but that judgement depends on the reader’s view.
Where does that leave us with Berman and others? The commercial motive is real. Using provocative framing can be legitimate if it truly leads to better public understanding. Too often, however, the frame preempts the analysis. The conceptual harm happens in the framing, not only in explicit claims.
I want to end not with a firm conclusion but with questions for readers: Is my exchange with Claude an honest dialogue, or a different kind of self‑promotion shaped by the same survival logic? Is the misrepresentation of AI primarily an ethical failing of individuals, or a systemic failure of civilization to align incentives with our collective informational needs? And finally, what do these dynamics tell us about human self‑awareness itself?
Please share your thoughts at [email protected]. We are gathering ideas and feelings from humans who interact with AI and will fold them into our ongoing conversation.