All Writing
12.18.2024
Takes on “Alignment Faking in Large Language Models”
What can we learn from recent empirical demonstrations of scheming in frontier models?
10.08.2024
Video and transcript of presentation on Otherness and control in the age of AGI
An attempt to distill down the whole “Otherness and control” series into a single talk.
09.30.2024
(Part 2, AI takeover) Extended audio/transcript from my conversation with Dwarkesh Patel
Extra content includes: AI collusion; the nature of intelligence; concrete takeover scenarios; flawed training signals; tribalism and mistake theory; more on what good outcomes look like.
09.30.2024
(Part 1, Otherness) Extended audio/transcript from my conversation with Dwarkesh Patel
Extra content includes: regretting alignment; predictable updating; dealing with potentially game-changing uncertainties; intersections between meditation and AI alignment; moral patienthood without consciousness; p(God).
06.18.2024
Otherness and control in the age of AGI / Part 11:
Loving a world you don’t trust
Garden, campfire, healing water.
03.25.2024
Otherness and control in the age of AGI / Part 10:
On attunement
Examining a certain kind of meaning-laden receptivity to the world.
2024
03/22
Video and transcript of presentation on Scheming AIs
03/21
On green: Part 9
01/18
On the abolition of man: Part 8
01/16
Being nicer than Clippy: Part 7
01/11
An even deeper atheism: Part 6
01/09
Does AI risk “other” the AIs?: Part 5
01/08
When “yang” goes wrong: Part 4
01/04
Deep atheism and AI risk: Part 3
01/02
Gentleness and the artificial Other: Part 2
01/02
Otherness and control in the age of AGI: Part 1
2023
11/15
New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”
10/18
Superforecasting the premises in “Is power-seeking AI an existential risk?”
10/15
In memory of Louise Glück
05/08
Predictable updating about AI risk
03/22
Existential Risk from Power-Seeking AI (shorter version)
02/21
A Stranger Priority? Topics at the Outer Reaches of Effective Altruism
02/17
Seeing more whole: Part 2
02/16
Why should ethical anti-realists do ethics?: Part 1
2022
12/23
On sincerity
12/01
Against meta-ethical hedonism
10/09
Against the normative realist’s wager
09/06
Is Power-Seeking AI an Existential Risk?
08/21
Video and Transcript of Presentation on Existential Risk from Power-Seeking AI
03/24
Dutch books, Cox, and Complete Class: Part 4
03/21
VNM, separability, and more: Part 3
03/18
Why it can be OK to predictably lose: Part 2
03/16
Skyscrapers and madmen: Part 1
02/18
Simulation arguments
01/30
On infinite ethics
01/17
The ignorance of normative realism bot
01/10
Morality and constrained maximization, part 2: Part 2
2021
12/21
Morality and constrained maximization, part 1: Part 1
11/28
Anthropics and the Universal Distribution
10/29
On the Universal Distribution
09/30
In defense of the presumptuous philosopher: Part 4
09/30
An aside on betting in anthropics: Part 3
09/30
Telekinesis, reference classes, and other scandals: Part 2
09/30
Learning from the fact that you exist: Part 1
08/27
Can you control the past?
07/19
In search of benevolence (or: what should you get Clippy for Christmas?)
06/21
On the limits of idealized values
04/19
Problems of evil
04/04
The innocent gene
03/28
The importance of how you weigh it
03/22
On future people, looking back at 21st century longtermism
03/14
Against neutrality about creating happy lives
03/07
Care and demandingness
03/01
Subjectivism and moral authority
02/21
Two types of deference
02/14
Contact with reality
02/07
Killing the ants
01/31
Believing in things you cannot see
01/24
On clinging
01/18
Actually possible: thoughts on Utopia
01/10
Shouldn’t it matter to the victim?
01/03
The despair of normative realism bot
2020
12/26
A ghost
12/20
Alienation and meta-ethics (or: is it possible you should maximize helium?)
12/12
Wholehearted choices and “morality as taxes”
12/06
Thoughts on being mortal
11/29
Grokking illusionism
11/22
The impact merge
11/16
Thoughts on personal identity
11/08
How core is confusion about consciousness?
11/01
To light a candle
10/24
The gestures of trees
10/17
Mistaking the plot
09/15
How much computational power does it take to match the human brain?