Thoughts on AI pause

Whether to push for an AI pause is a hotly debated question. This post contains some of my thoughts on the issue and on the discourse that surrounds it.

Contents

  1. The motivation for an AI pause
  2. My thoughts on AI pause, in brief
  3. My thoughts on AI pause discourse
  4. Massive moral urgency: Yes, in both categories of worst-case risks

The motivation for an AI pause

Generally speaking, it seems that the primary motivation behind pushing for an AI pause is that work on AI safety is far from where it needs to be for humanity to maintain control of future AI progress. Therefore, a pause is needed so that work on AI safety — and other related work, such as AI governance — can catch up with the pace of progress in AI capabilities.

My thoughts on AI pause, in brief

Whether it is worth pushing for an AI pause obviously depends on various factors. For one, it depends on the opportunity cost: what could we be doing otherwise? After all, even if one thinks that an AI pause is desirable, one might still have reservations about its tractability compared to other aims. And even if one thinks that an AI pause is both desirable and tractable, there might still be other aims and activities that are even more beneficial (in expectation), such as working on worst-case AI safety (Gloor, 2016; Yudkowsky, 2017; Baumann, 2018), or increasing the priority that people devote to reducing risks of astronomical suffering (s-risks) (Althaus & Gloor, 2016; Baumann, 2017, 2022; DiGiovanni, 2021).

Furthermore, there is the question of whether an AI pause would even be beneficial in the first place. This is a complicated question, and I will not explore it in detail here. (For a critical take, see “AI Pause Will Likely Backfire” by Nora Belrose.) Suffice it to say that, in my view, it seems highly uncertain whether any realistic AI pause would be beneficial overall — not just from a suffering-focused perspective, but from the perspective of virtually all impartial value systems. It seems to me that most advocates for AI pause are quite overconfident on this issue.

To clarify, I am by no means opposed to advocating for an AI pause. It strikes me as something that one can reasonably conclude is helpful and worth doing (depending on one’s values and empirical judgement calls). My current assessment is simply that it is unlikely to be among the best ways to reduce future suffering, mainly because I view the alternative activities outlined above as more promising, and because I suspect that most realistic AI pauses are unlikely to be clearly beneficial overall.

My thoughts on AI pause discourse

A related critical observation about much of the discourse around AI pause is that it tends toward a simplistic “doom vs. non-doom” dichotomy. That is, the picture that is conveyed seems to be that either humanity loses control of AI and goes extinct, which is bad; or humanity maintains control, which is good. And your probability of the former is your “p(doom)”.

Of course, one may argue that for strategic and communication purposes, it makes sense to simplify things and speak in such dichotomous terms. Yet the problem, in my view, is that this kind of picture is not accurate even to a first approximation. From an altruistic perspective, it is not remotely the case that “loss of control to AI” = “bad”, while “humans maintaining control” = “good”.

For example, if we are concerned with the reduction of s-risks (which is important by the lights of virtually all impartial value systems), we must compare the relative risks of “loss of control to AI” with the risks of “humans maintaining control” — however we define these rough categories. And sadly, it is not the case that “humans maintaining control” is associated with a negligible or trivial risk of worst-case outcomes. Indeed, it is not clear whether “humans maintaining control” is generally associated with better or worse prospects than “loss of control to AI” when it comes to s-risks.

In general, the question of whether a “human-controlled future” is better or worse with respect to reducing future suffering is a difficult one that has been discussed and debated at some length, and no clear consensus has emerged. As a case in point, Brian Tomasik places a 52 percent subjective probability on the claim that “Human-controlled AGI in expectation would result in less suffering than uncontrolled”.

This near-50/50 view stands in stark contrast to what often seems to be assumed as a core premise in much of the discourse surrounding AI pause, namely that a human-controlled future would obviously be far better (in expectation).

(Some reasons why one might be pessimistic regarding human-controlled futures can be found in the literature on human moral failings; see e.g. Cooper, 2018; Huemer, 2019; Kidd, 2020; Svoboda, 2022. Other reasons include basic competitive aims and dynamics that are likely to be found in a wide range of futures, including human-controlled ones; see e.g. Tomasik, 2013; Knutsson, 2023, sec. 3. See also Vinding, 2022.)

Massive moral urgency: Yes, in both categories of worst-case risks

There is a key point on which I agree strongly with advocates for an AI pause: there is a massive moral urgency in ensuring that we do not end up with horrific AI-controlled outcomes. Too few people appreciate this insight, and even fewer seem to be deeply moved by it.

At the same time, I think there is a similarly massive urgency in ensuring that we do not end up with horrific human-controlled outcomes. And humanity’s current trajectory is unfortunately not all that reassuring with respect to either of these broad classes of risks. (To be clear, this is not to say that an s-risk outcome is the most likely outcome in either of these two classes of future scenarios, but merely that the current trajectory looks highly suboptimal and concerning with respect to both of them.)

The upshot for me is that there is a roughly equal moral urgency in avoiding each of these categories of worst-case risks, and as hinted earlier, it seems doubtful to me that pushing for an AI pause is the best way to reduce these risks overall.
