AI Radar #27
Three visions of the future
“When we look back at this time, I think we will realize that we were standing in the foothills of the singularity.” - Demis Hassabis
This week brings us three very different visions of the future. Pope Leo presents AI as an important but ultimately normal technology and argues for a normal, human-centered future. Far from normal, Jack Clark and MIRI pursue different visions of AI to their logical conclusions. Jack urges us to prepare for the coming age of confusing miracles, while MIRI brings us a draconian plan to avert an AI apocalypse.
Why such different futures? The crux is factual more than ideological. The pope sees a normal future because he (apparently) doesn’t expect AGI. Jack and MIRI agree that AI would be transformative, but disagree about the probability of it going disastrously wrong—and therefore, whether we should lean into the AI future, or fight to avert it.
To my mind, the events of the last five years are most consistent with Jack’s view of AI, but the uncertainty is uncomfortably large. Were I in charge, we would double down on alignment and prepare for the upheaval ahead, while simultaneously laying the groundwork for a stop should it become necessary.
Top pick
Jack Clark: Reckoning with the future
Tell me how the world stays normal, based on this technology and how it is showing up in the world?
Jack Clark’s latest piece, adapted from his 2026 Cosmos HAI Lab Lecture, is an overview of where we stand and a plea to engage seriously with the future that is clearly coming.
Jack explores the trajectory of AI both at an abstract level and a personal one: he shows us capability charts that go straight up, but also talks about when AI got good enough to convince him to go back to therapy. It’s a great piece that is simultaneously convincing and moving.
Following current trends to their obvious conclusion, he argues that profound transformation is all but inevitable and our ability to steer it depends on accepting the reality of what is about to happen:
In short, the rapid advance in AI technology presents all of us with a choice: explore the future, or retreat from the present.
Exploring the future requires us to reckon with the fact of continued AI progress, and ask ourselves what we want to do with this technology as it becomes more powerful. Retreating from the present is when we ignore the implications of the technology and dismiss it. Retreating from the present forces us as individuals and as society into states of reactivity or passivity in the face of AIs continued advance.
At this point, if you don’t expect transformative AI in the near future you need to explain how that belief is plausibly consistent with the plainly observable facts.
News
Magnifica Humanitas
Pope Leo’s first encyclical, Magnifica Humanitas, deals with “safeguarding the human person in the time of artificial intelligence”. It’s fine but inconsequential. You probably don’t need to read it, but if you do I recommend this reformatted version.
Anthropic’s Chris Olah spoke at the presentation. The New York Times shares the key takeaways. Zvi does a deep dive. Dean Ball isn’t a fan. Simon Willison shares his highlights.
There are few surprises here—it’s high-level moral guidance about AI from a Catholic perspective. Three things strike me:
The most important shortcoming is that while he clearly understands the mundane aspects of AI, he isn’t AGI pilled. AGI and superintelligence will soon raise profound moral questions, which this encyclical doesn’t foresee and doesn’t begin to address.
Item 99 takes a strong stand against AI having moral agency: “Nor do they have a moral conscience, since they do not judge good and evil, grasp the ultimate meaning of situations, or bear responsibility for consequences”. I’m not a fan and I don’t think this will age well, but—you know the pope is Catholic, right?
Zvi and Dean Ball both observe that the whole document feels strangely Eurocratic. I concur—it’s superficially unobjectionable, but saturated with misguided good intentions.
What is China saying about collaboration on AI?
Trump’s recent visit to China included some discussion of collaboration on AI safety, although it’s unclear whether anything substantive will result. Bravely attempting to read the tea leaves, Concordia reviews Chinese coverage of the discussion.
Capabilities and forecasts
The Commonwealth Short Story Prize
The Commonwealth Short Story Prize is a well-known literary contest, with five annual winners from different regions of the Commonwealth. Three of this year’s winning stories appear to have been substantially written by AI.
Writing for The Argument, Kelsey Piper explains what happened, teaches a crash course in identifying AI writing, and unpacks the Commonwealth Foundation’s incoherent response to the situation.
Two things can simultaneously be true:
Ignoring questions of authorship, the short stories in question have significant shortcomings.
Nonetheless, they won a notable literary competition.
We long ago passed the milestone of AI being a better writer than most humans. The best human writers are still much better than AI—for now. Write faster, John Henry, write faster.
Alignment and interpretability
Meta-gaming in evaluations
It’s that the frontier models are good at telling when they’re being evaluated. I hadn’t realized, however, just how evaluation-aware they’ve become. Drawing from METR’s recent Frontier Risk Report, Beth Barnes shares some impressive examples of models not merely knowing that they’re being evaluated, but trying to guess how the evaluations will be scored:
Are we building conscious AI servants?
Dan Williams and Henry Shevlin discuss AI consciousness on Conspicuous Cognition. I particularly appreciate the discussion of how much we actually understand about consciousness in humans and animals (spoiler: not much at all). This is perfect:
I’m a philosophical naturalist, which is jargon for the idea that philosophy should be continuous with, highly constrained by, the scientific project. Whenever people are trying to settle an argument by trading intuitions, I start to think this is probably not a legitimate contribution to knowledge.
Dean Ball on AI consciousness
Dean Ball offers the best short explanation of how I (and many people in this corner of the world) think about AI consciousness:
This is all very weird, very outside the Overton, and very confusing. I don’t really know what to say, beyond that we should take this stuff seriously, have an open mind, and do rigorous science. Anyone who speaks with confidence about this in either direction is just fooling themselves.
Strategy and politics
A summary of MIRI’s proposed international agreement
MIRI’s An International Agreement to Prevent the Premature Creation of Artificial Superintelligence is 55 pages long, so I was glad to see their new summary of it.
I disagree with MIRI about some critical points, but I appreciate the principled way they follow their beliefs to their logical conclusions. If you believe that developing superintelligence in the foreseeable future is almost certain to cause human extinction, then you should be willing to take drastic steps to prevent that outcome, no matter the cost. The paper is a thoughtful examination of what those drastic steps might look like.
This is a proposed draft of an international agreement to prevent the development of superintelligence. There’s a lot here, but five parts seem most important to me:
1. Limits on AI training runs. Training runs above 1.0e22 FLOP (less than GPT-3 scale) would require monitoring and runs above 1.0e24 FLOP (roughly GPT-3.5 scale, covering all current frontier models) would be prohibited. Training runs between those two thresholds would be required to disclose all code and training data.
2. Limits on compute clusters. Compute clusters above 16 H100 equivalents would only be permitted inside centralized monitored facilities and subject to intensive monitoring.
3. Limits on research. Research on certain topics would be “controlled” or “prohibited”, including research on general AI capabilities, AI chip manufacturing techniques, and improved AI chips.
4. Restrictions on researchers. Researchers who are working on relevant or adjacent topics, or who have previously worked on them, would be subject to mandatory interviews conducted by both the US and China, as well as ongoing monitoring of their employment status.
5. Military action. The agreement gives signatories the right to take action—including military action—against any actor they deem to be pursuing ASI. It further requires co-signatories to agree not to consider those strikes as acts of aggression.
I agree that if your intent is to ensure that nobody develops superintelligence for several decades, you need to take extreme measures. But let’s be clear: this is a draconian proposal with severe costs.
The economic costs are obvious, as are the opportunity costs of not merely pausing AI development but effectively rolling it back several years—and staying at that level for several decades. The political costs, arguably, are even greater. This agreement would create a massive US/China government entity with far-reaching powers to control and monitor the most important technology of our time and the people working on it.
If MIRI is right about the near-certainty of calamity on our current course, an agreement like this might be humanity’s only chance. If they’re wrong, this proposal is an abomination.
Predicting AI job exposure
It’s become popular to try to measure “AI exposure”—the likelihood that any given profession will be adversely impacted by AI. Benedict Evans argues that this is a futile endeavor: past experience shows that job impacts are strange and unpredictable.
To take one simple example: pre-internet, it would have been possible to analyze all the tasks performed by a newspaper copy editor and estimate an “internet exposure” number for that profession. But that analysis would completely miss the fact that the internet would devastate the entire newspaper industry.
It’s a great article, and full of the deep knowledge of tech history that Benedict excels at. The standard caveat applies, of course: if capabilities proceed at the rate I expect, AI will take approximately all the jobs, approximately everywhere, approximately all at once.
Risks
We are not on top of it
METR’s Beth Barnes would like to remind you that things are not OK:
Sometimes people outside the field say things like “The AI situation can’t be that bad, there must be experts who are on top of it”. As “an expert”, I would like to be clear that we are not on top of it.
ExploitBench
ExploitBench is a new benchmark that measures performance on complex, multi-step cybersecurity tasks.
Mythos significantly outperforms GPT on this benchmark: it scores 68% without auto-nudge (a harness feature that nudges the model to keep trying), while GPT 5.5 (Codex) scored 41% with auto-nudge and 33% without it. For cyber, my intuition is that Mythos and GPT 5.5 are comparable at some tasks, while Mythos is considerably more capable at others.
AI for criminals
AI Policy Perspectives talks with Ardi Janjeva about criminal use of AI. We are already seeing early signs of AI increasing criminal productivity—that trend will only accelerate. Three points stand out to me:
AI makes sophisticated attacks possible at scale, including mass deployment of highly targeted attacks.
It’s becoming possible for individuals to conduct sophisticated crimes that previously would have required an organized criminal gang. This obviously complicates life for law enforcement.
Law enforcement is facing a tidal wave of AI-generated child sexual abuse material. That’s enormously time-consuming to deal with, since each item needs to be investigated to see whether it depicts a real child in danger.
People and data
Inside the British lab hunting for dangers lurking in AI
The New York Times profiles the UK’s AI Security Institute, one of the world’s leading AI safety organizations.
Jaan Tallinn discusses existential risk
As part of their upcoming documentary project, Manifold talks with Jaan Tallinn. Jaan’s been active in AI safety for a long time: he co-founded the Future of Life Institute and the Center for the Study of Existential Risk. It’s a good interview with a thought leader who doesn’t get as much attention as he perhaps deserves. I’m really enjoying the individual interviews as they come out, and am excited for the final documentary.
Will we really put data centers in space?
Does it really make sense to put data centers in space? Surprisingly, Forethought’s latest research suggests that it might be economically viable within three to five years:
All of this is only feasible if SpaceX’s Starship goes well: there are no alternative launch options that make economic sense.
While orbital data centers might have some cost advantages, they’re perhaps most interesting as a potential end-run around regulatory and power generation limitations that might severely constrain terrestrial data centers.
Frontier labs don’t use most AI compute (yet)
I would not have guessed this. Epoch estimates that the frontier labs are using less than half of the world’s AI compute:






