My work at Open Philanthropy focuses on making sure that the development of advanced AI systems does not lead to existential catastrophe. I've written a long report ("Is Power-Seeking AI an Existential Risk?") on what I see as the most important risk here -- namely, that misaligned AI systems end up disempowering humanity. A video summary of that report is available here; slides here; and a set of reviews here. Before that, I wrote a report on the computational capacity of the human brain (blog post summary here), as part of a broader investigation at Open Philanthropy into when advanced AI systems might be developed (see Holden Karnofsky's "Most Important Century" series for a summary of that investigation).
Extra content includes: AI collusion; the nature of intelligence; concrete takeover scenarios; flawed training signals; tribalism and mistake theory; more on what good outcomes look like.
Extra content includes: regretting alignment; predictable updating; dealing with potentially game-changing uncertainties; intersections between meditation and AI alignment; moral patienthood without consciousness; p(God).
Garden, campfire, healing water.
Examining a certain kind of meaning-laden receptivity to the world.
An intro to my work on scheming/“deceptive alignment.”
Examining a philosophical vibe that I think contrasts in interesting ways with “deep atheism.”
What does it take to avoid tyranny towards the future?
Let’s be the sort of species that aliens wouldn’t fear the way we fear paperclippers.
Who isn’t a paperclipper?
Examining Robin Hanson’s critique of the AI risk discourse.
On the connection between deep atheism and seeking control.
On a certain kind of fundamental mistrust towards Nature.
AIs as fellow creatures. And on getting eaten.
Introduction and summary for a series of essays about how agents with different values should relate to one another, and about the ethics of seeking and sharing power.
My report examining the probability of a behavior often called “deceptive alignment.”
Superforecasters weigh in on the argument for AI risk given in my report on the topic.
How worried about AI risk will we feel once we can see advanced machine intelligence up close? We should worry accordingly now.
Building a second advanced species is playing with fire.
Report for Open Philanthropy examining what I see as the core argument for concern about existential risk from misaligned artificial intelligence.
Video and transcript of a presentation I gave on existential risk from power-seeking AI, summarizing my report on the topic.
Report for Open Philanthropy on the computational power sufficient to match the human brain’s task-performance. I examine four different methods of estimating this quantity.