Favorites
Report for Open Philanthropy examining what I see as the core argument for concern about existential risk from misaligned artificial intelligence.
Introduction and summary for a series of essays about how agents with different values should relate to one another, and about the ethics of seeking and sharing power.
There are oceans we have barely dipped a toe into. There are drums and symphonies we can barely hear. There are suns whose heat we can barely feel on our skin.
Infinities puncture the dream of a simple, bullet-biting utilitarianism. But they’re everyone’s problem.
How worried about AI risk will we be when we can see advanced machine intelligence up close? We should worry accordingly now.
My report examining the likelihood that advanced AI systems will exhibit a behavior often called "deceptive alignment."
Making happy people is good. Just ask the golden rule.
I think that you can “control” events you have no causal interaction with, including events in the past, and that this is a wild and disorienting fact, with uncertain but possibly significant implications. This essay attempts to impart such disorientation.
If you kill something, look it in the eyes as you do.
On looking out of your own eyes.
You can’t keep any of it. The only thing to do is to give it away on purpose.
Nearby is the country they call life.
An intuition pump for a certain kind of “holy sh**” reaction to existential risk, and to the possible size and quality of the future at stake.
How can “non-attachment” be compatible with care? We need to distinguish between caring and clinging.