Joe Carlsmith

All Writing

05.22.2025
Video and transcript of talk on AI welfare

An overview of my take on AI welfare as of May 2025, from a talk I gave at Anthropic.

05.21.2025
On the moral status of AIs / Part 1:
The stakes of AI moral status

On seeing and not seeing souls.

04.30.2025
Video and transcript of talk on automating alignment research

From a talk at Anthropic in April 2025.

04.30.2025
How do we solve the alignment problem? / Part 6:
Can we safely automate alignment research?

It’s really important; we have a real shot; there are a lot of ways we can fail.

03.14.2025
How do we solve the alignment problem? / Part 5:
AI for AI safety

We should try extremely hard to use AI labor to help address the alignment problem.

03.11.2025
How do we solve the alignment problem? / Part 4:
Paths and waystations in AI safety

On the structure of the path to safe superintelligence, and some possible milestones along the way.


2025

02/19
When should we worry about AI power-seeking?: Part 3
02/13
How do we solve the alignment problem?: Part 1
02/13
What is it to solve the alignment problem?: Part 2
01/28
Fake thinking and real thinking

2024

12/18
Takes on “Alignment Faking in Large Language Models”
10/08
Video and transcript of presentation on Otherness and control in the age of AGI
09/30
(Part 2, AI takeover) Extended audio/transcript from my conversation with Dwarkesh Patel
09/30
(Part 1, Otherness) Extended audio/transcript from my conversation with Dwarkesh Patel
06/18
Loving a world you don’t trust: Part 11
03/25
On attunement: Part 10
03/22
Video and transcript of presentation on Scheming AIs
03/21
On green: Part 9
01/18
On the abolition of man: Part 8
01/16
Being nicer than Clippy: Part 7
01/11
An even deeper atheism: Part 6
01/09
Does AI risk “other” the AIs?: Part 5
01/08
When “yang” goes wrong: Part 4
01/04
Deep atheism and AI risk: Part 3
01/02
Gentleness and the artificial Other: Part 2
01/02
Otherness and control in the age of AGI: Part 1

2023

11/15
New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”
10/18
Superforecasting the premises in “Is power-seeking AI an existential risk?”
10/15
In memory of Louise Glück
05/08
Predictable updating about AI risk
03/22
Existential Risk from Power-Seeking AI (shorter version)
02/21
A Stranger Priority? Topics at the Outer Reaches of Effective Altruism
02/17
Seeing more whole: Part 2
02/16
Why should ethical anti-realists do ethics?: Part 1

2022

12/23
On sincerity
12/01
Against meta-ethical hedonism
10/09
Against the normative realist’s wager
09/06
Is Power-Seeking AI an Existential Risk?
08/21
Video and Transcript of Presentation on Existential Risk from Power-Seeking AI
03/24
Dutch books, Cox, and Complete Class: Part 4
03/21
VNM, separability, and more: Part 3
03/18
Why it can be OK to predictably lose: Part 2
03/16
Skyscrapers and madmen: Part 1
02/18
Simulation arguments
01/30
On infinite ethics
01/17
The ignorance of normative realism bot
01/10
Morality and constrained maximization, part 2: Part 2

2021

12/21
Morality and constrained maximization, part 1: Part 1
11/28
Anthropics and the Universal Distribution
10/29
On the Universal Distribution
09/30
In defense of the presumptuous philosopher: Part 4
09/30
An aside on betting in anthropics: Part 3
09/30
Telekinesis, reference classes, and other scandals: Part 2
09/30
Learning from the fact that you exist: Part 1
08/27
Can you control the past?
07/19
In search of benevolence (or: what should you get Clippy for Christmas?)
06/21
On the limits of idealized values
04/19
Problems of evil
04/04
The innocent gene
03/28
The importance of how you weigh it
03/22
On future people, looking back at 21st century longtermism
03/14
Against neutrality about creating happy lives
03/07
Care and demandingness
03/01
Subjectivism and moral authority
02/21
Two types of deference
02/14
Contact with reality
02/07
Killing the ants
01/31
Believing in things you cannot see
01/24
On clinging
01/18
Actually possible: thoughts on Utopia
01/10
Shouldn’t it matter to the victim?
01/03
The despair of normative realism bot

2020

12/26
A ghost
12/20
Alienation and meta-ethics (or: is it possible you should maximize helium?)
12/12
Wholehearted choices and “morality as taxes”
12/06
Thoughts on being mortal
11/29
Grokking illusionism
11/22
The impact merge
11/16
Thoughts on personal identity
11/08
How core is confusion about consciousness?
11/01
To light a candle
10/24
The gestures of trees
10/17
Mistaking the plot
09/15
How much computational power does it take to match the human brain?
© Joe Carlsmith, 2025