Distrusting salience: Keeping unseen urgencies in mind

The psychological appeal of salient events and risks can be a major hurdle to optimal altruistic priorities and impact. My aim in this post is to outline a few reasons to approach our intuitive fascination with salient events and risks with a fair bit of skepticism, and to actively focus on that which is important yet unseen, hiding in the shadows of the salient.


Contents

  1. General reasons for caution: Availability bias and related biases
  2. The news: A common driver of salience-related distortions
  3. The narrow urgency delusion
  4. Massive problems that always face us: Ongoing moral disasters and future risks
  5. Salience-driven distortions in efforts to reduce s-risks
  6. Reducing salience-driven distortions

General reasons for caution: Availability bias and related biases

The human mind is subject to various biases that involve an overemphasis on the salient, i.e. that which readily stands out and captures our attention.

In general terms, there is the availability bias, also known as the availability heuristic, namely the common tendency to base our beliefs and judgments on information that we can readily recall. For example, we tend to overestimate the frequency of events when examples of these events easily come to mind.

Closely related is what is known as the salience bias, which is the tendency to overestimate salient features and events when making decisions. For instance, when deciding to buy a given product, the salience bias may lead us to give undue importance to a particularly salient feature of that product — e.g. some fancy packaging — while neglecting less salient yet perhaps more relevant features.

A similar bias is the recency bias: our tendency to give disproportionate weight to recent events in our belief-formation and decision-making. This bias is in some sense predicted by the availability bias, since recent events tend to be more readily available to our memory. Indeed, the availability bias and the recency bias are sometimes considered equivalent, even though it seems more accurate to view the recency bias as a consequence or a subset of the availability bias; after all, readily remembered information does not always pertain to recent events.

Finally, there is the phenomenon of belief digitization, which is the tendency to give undue weight to (what we consider) the single most plausible hypothesis in our inferences and decisions, even when other hypotheses also deserve significant weight. For example, if we are considering hypotheses A, B, and C, and we assign them the probabilities 50 percent, 30 percent, and 20 percent, respectively, belief digitization will push us toward simply accepting A as though it were true. In other words, belief digitization pushes us toward altogether discarding B and C, even though B and C collectively have the same probability as A. (See also related studies on Salience Theory and on the overestimation of salient causes and hypotheses in predictive reasoning.)
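The arithmetic of the example above can be made concrete with a minimal sketch. The probabilities are the ones from the example (written as whole percentages); the function name `digitize` is hypothetical and only serves this illustration.

```python
# Illustrative sketch only: `digitize` is a made-up helper, not an
# established model of belief digitization.

def digitize(beliefs):
    """Collapse a belief distribution onto its single most plausible hypothesis."""
    top = max(beliefs, key=beliefs.get)
    return {h: (100 if h == top else 0) for h in beliefs}

# Probabilities from the example above, in percent.
beliefs = {"A": 50, "B": 30, "C": 20}
digitized = digitize(beliefs)

# Probability mass that digitization discards (here, B and C together).
discarded = sum(p for h, p in beliefs.items() if digitized[h] == 0)

print("digitized beliefs:", digitized)
print("discarded probability mass (percent):", discarded)
```

In this example the discarded mass equals the probability of A itself, which is why treating A as simply true throws away half of what we actually believe.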

All of the biases mentioned above can be considered different instances of a broader cluster of availability/salience biases, and they each give us reason to be cautious of the influence that salient information has on our beliefs and our priorities.

The news: A common driver of salience-related distortions

One way in which our attention can become preoccupied with salient (though not necessarily crucial) information is through the news. Much has been written against spending a lot of time on the news, and the reasons against it are probably even stronger for those who are trying to spend their time and resources in ways that help sentient beings most effectively.

For even if we grant that there is substantial value in following the news, it seems plausible that the opportunity costs are generally too high, in terms of what one could instead spend one’s limited time learning about or advocating for. Moreover, there is a real risk that a preoccupation with the news has outright harmful effects overall, such as by gradually pulling one’s focus away from the most important problems and toward less important and less neglected problems. After all, the prevailing news criteria or news values decidedly do not reflect the problems that are most important from an impartial perspective concerned with the suffering of all sentient beings.

I believe the same issue exists in academia: A certain issue becomes fashionable, there are calls for abstracts, and there is a strong pull to write and talk about that given issue. And while it may indeed be important to talk and write about those topics for the purpose of getting ahead — or not falling behind — in academia, it seems more doubtful whether such topical talk is at all well-adapted for the purpose of making a difference in the world. In other words, the “news values” of academia are not necessarily much better than the news values of mainstream journalism.

The narrow urgency delusion

A salience-related pitfall that we can easily succumb to when following the news is what we may call the “narrow urgency delusion”. This is when the news covers some specific tragedy and we come to feel, at a visceral level, that this tragedy is the most urgent problem that is currently taking place. Such a perception is, in a very important sense, an illusion.

The reality is that tragedy on an unfathomable scale is always occurring, and the tragedies conveyed by the news are sadly but a tiny fraction of the horrors that are constantly taking place around us. Yet the tragedies that are always occurring, such as children who suffer and die from undernutrition and chickens who are boiled alive, are so common and so underreported that they all too readily fade from our moral perception. To our intuitions, these horrors seemingly register as mere baseline horror — as unsalient abstractions that carry little felt urgency — even though the horrors in question are every bit as urgent as the narrow sliver of salient horrors conveyed in the news (Vinding, 2020, sec. 7.6).

We should thus be clear that the delusion involved in the narrow urgency delusion is not the “urgency” part — there is indeed unspeakable horror and urgency involved in the tragedies reported by the news. The delusion rather lies in the “narrow” part; we find ourselves in a condition that contains extensive horror and torment, all of which merits compassion and concern.

So it is not that the salient victims are less important than what we intuitively feel, but rather that the countless victims whom we effectively overlook are far more important than what we (do not) feel.

Massive problems that always face us: Ongoing moral disasters and future risks

The following are some of the urgent problems that always face us, yet which are often less salient to us than the individual tragedies that are reported in the news:

These common and ever-present problems are, by definition, not news, which hints at the inherent ineffectiveness of news when it comes to giving us a clear picture of the reality we inhabit and the problems that confront us.

As the final entry on the list above suggests, the problems that face us are not limited to ongoing moral disasters. We also face risks of future atrocities, potentially involving horrors on an unprecedented scale. Such risks will plausibly tend to feel even less salient and less urgent than do the ongoing moral disasters we are facing, even though our influence on these future risks — and future suffering in general — could well be more consequential given the vast scope of the long-term future.

So while salience-driven biases may blind us to ongoing large-scale atrocities, they probably blind us even more to future suffering and risks of future atrocities.

Salience-driven distortions in efforts to reduce s-risks

There are many salience-related hurdles that may prevent us from giving significant priority to the reduction of future suffering. Yet even if we do grant a strong priority to the reduction of future suffering, including s-risks in particular, there are reasons to think that salience-driven distortions still pose a serious challenge in our prioritization efforts.

Our general availability bias gives us some reason to believe that we will overemphasize salient ideas and hypotheses in efforts to reduce future suffering. Yet perhaps more compelling are the studies on how we tend to greatly overestimate salient hypotheses when we engage in predictive and multi-stage reasoning in particular. (Multi-stage reasoning is when we make inferences in successive steps, such that the output of one step provides the input for the next one.)

After all, when we are trying to predict the main sources of future suffering, including specific scenarios in which s-risks materialize, we are very much engaging in predictive and multi-stage reasoning. Therefore, we should arguably expect our reasoning about future causes of suffering to be too narrow by default, with a tendency to give too much weight to a relatively small set of salient risks at the expense of a broader class of less salient (yet still significant) risks that we are prone to dismiss in our multi-stage inferences and predictions.

This effect can be further reinforced through other mechanisms. For example, if we have described and explored — or even just imagined — a certain class of risks in greater detail than other risks, then this alone may lead us to regard those more elaborately described risks as being more likely than less elaborately explored scenarios. Moreover, if we find ourselves in a group of people who focus disproportionally on a certain class of future scenarios, this may further increase the salience and perceived likelihood of these scenarios, compared to alternative scenarios that may be more salient in other groups and communities.

Reducing salience-driven distortions

The pitfalls mentioned above seem to suggest some concrete ways in which we might reduce salience-driven distortions in efforts to reduce future suffering.

First, they recommend caution about the danger of neglecting less salient hypotheses when engaging in predictive and multi-stage reasoning. Specifically, when thinking about future risks, we should be careful not to simply focus on what appears to be the single greatest risk, and to effectively neglect all others. After all, even if the risk we regard as the single greatest risk indeed is the single greatest risk, that risk might still be fairly modest compared to the totality of future risks, and we might still do better by deliberately working to reduce a relatively broad class of risks.
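A toy calculation may help illustrate the first point. All numbers below are hypothetical and chosen purely for illustration; they are not estimates of any actual risks.

```python
# Hypothetical weights (arbitrary units) for one salient risk and
# seventeen less salient ones. These are assumptions, not estimates.
risk_weights = {"most_salient_risk": 15}
risk_weights.update({f"less_salient_risk_{i}": 5 for i in range(1, 18)})

total = sum(risk_weights.values())
largest_name = max(risk_weights, key=risk_weights.get)
largest_share = risk_weights[largest_name] / total
remaining_share = 1 - largest_share

print("largest single risk:", largest_name)
print("share of total from the largest risk:", round(largest_share, 2))
print("share of total from all other risks:", round(remaining_share, 2))
```

Under these assumed weights, the single greatest risk is three times larger than any other risk, yet it still makes up only a small fraction of the total.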

Second, the tendency to judge scenarios to be more likely when we have thought about them in detail would seem to recommend that we avoid exploring future risks in starkly unbalanced ways. For instance, if we have explored one class of risks in elaborate detail while largely neglecting another, it seems worth trying to outline concrete scenarios that exemplify the more neglected class of risks, so as to correct any potentially unjustified disregard of their importance and likelihood.

Third, the possibility that certain ideas can become highly salient in part for sociological reasons may recommend a strategy of exchanging ideas with, and actively seeking critiques from, people who do not fully share the outlook that has come to prevail in one’s own group.

In general, it seems that we are likely to underestimate our empirical uncertainty (Vinding, 2020, sec. 9.1-9.2). The space of possible future outcomes is vast, and any specific risk that we may envision is but a tiny subset of the risks we are facing. Hence, our most salient ideas regarding future risks should ideally be held up against a big question mark that represents the many (currently) unsalient risks that confront us.

Put briefly, we need to cultivate a firm awareness of the limited reliability of salience, and a corresponding awareness of the immense importance of the unsalient. We need to make an active effort to keep unseen urgencies in mind.

What does a future dominated by AI imply?

Among altruists working to reduce risks of bad outcomes due to AI, I sometimes get the impression that there is a rather quick step from the premise “the future will be dominated by AI” to a practical position that roughly holds that “technical AI safety research aimed at reducing risks associated with fast takeoff scenarios is the best way to prevent bad AI outcomes”.

I am not saying that this is the most common view among those who work to prevent bad outcomes due to AI. Nor am I saying that the practical position outlined above is necessarily an unreasonable one. But I think I have seen (something like) this sentiment assumed often enough for it to be worthy of a critique. My aim in this post is to argue that there are many other practical positions that one could reasonably adopt based on that same starting premise.


Contents

  1. “A future dominated by AI” can mean many things
    1. “AI” can mean many things
    2. “Dominated by” can mean many things
    3. Combinations of many things
  2. Future AI dominance does not imply fast AI development
  3. Fast AI development does not imply concentrated AI development
  4. “A future dominated by AI” does not mean that either “technical AI safety” or “AI governance” is most promising
  5. Concluding clarification

“A future dominated by AI” can mean many things

“AI” can mean many things

It is worth noting that the premise that “the future will be dominated by AI” covers a wide range of scenarios. After all, it covers scenarios in which advanced machine learning software is in power; scenarios in which brain emulations are in power; as well as scenarios in which humans stay in power while gradually updating their brains with gene technologies, brain implants, nanobots, etc., such that their intelligence would eventually be considered (mostly) artificial intelligence by our standards. And there are surely more categories of AI than just the three broad ones outlined above.

“Dominated by” can mean many things

The words “in power” and “dominated by” can likewise mean many different things. For example, they could mean anything from “mostly in power” and “mostly dominated by” to “absolutely in power” and “absolutely dominated by”. And these respective terms cover a surprisingly wide spectrum.

After all, a government in a democratic society could reasonably be claimed to be “mostly in power” in that society, and a future AI system that is given similar levels of power could likewise be said to be “mostly in power” in the society it governs. By contrast, even the government of North Korea falls considerably short of being “absolutely in power” on a strong definition of that term, which hints at the wide spectrum of meanings covered by the general term “in power”.

Note that the contrast above actually hints at two distinct (though related) dimensions on which different meanings of “in power” can vary. One has to do with the level of power — i.e. whether one has more or less of it — while the other has to do with how the power is exercised, e.g. whether it is democratic or totalitarian in nature.

Thus, “a future society with AI in power” could mean a future in which AI possesses most of the power in a democratically elected government, or it could mean a future in which AI possesses total power with no bounds except the limits of physics.

Combinations of many things

Lastly, we can make a combinatorial extension of the points made above. That is, we should be aware that “a future dominated by AI” could — and is perhaps likely to — combine different kinds of AI. For instance, one could imagine futures that contain significant numbers of AIs from each of the three broad categories of AI mentioned above.

Additionally, these AIs could exercise power in distinct ways and in varying degrees across different parts of the world. For example, some parts of the world might make decisions in ways that resemble modern democratic processes, with power distributed among many actors, while other parts of the world might make decisions in ways that resemble autocratic decision procedures.

Such a diversity of power structures and decision procedures may be especially likely in scenarios that involve large-scale space expansion, since different parts of the world would then eventually be causally disconnected, and since a larger volume of AI systems presumably renders greater variation more likely in general.

These points hint at the truly vast space of possible futures covered by a term such as “a future dominated by AI”.

Future AI dominance does not imply fast AI development

Another conceptual point is that “a future dominated by AI” does not imply that technological or social progress toward such a future will happen soon or that it will occur suddenly. Furthermore, I think one could reasonably argue that such an imminent or sudden change is quite unlikely (though it obviously becomes more likely the broader our conception of “a future dominated by AI” is).

An elaborate justification for my low credence in such sudden change is beyond the scope of this post, though I can at least note that part of the reason for my skepticism is that I think trends and projections in both computer hardware and economic growth speak against such rapid future change. (For more reasons to be skeptical, see Reflections on Intelligence and “A Contra AI FOOM Reading List”.)

A future dominated by AI could emerge through a very gradual process that occurs over many decades or even hundreds of years (conditional on it ever happening). And AI scenarios involving such gradual development could well be both highly likely and highly consequential.

An objection against focusing on such slow-growth scenarios might be that scenarios involving rapid change have higher stakes, and hence they are more worth prioritizing. But it is not clear to me why this should be the case. As I have noted elsewhere, a so-called value lock-in could also happen in a slow-growth scenario, and the probability of success — and of avoiding accidental harm — may well be higher in slow-growth scenarios (cf. “Which World Gets Saved”).

The upshot could thus be the very opposite, namely that it is ultimately more promising to focus on scenarios with relatively steady growth in AI capabilities and power. (I am not claiming that this focus is in fact more promising; my point is simply that it is not obvious and that there are good reasons to question a strong focus on fast-growth scenarios.)

Fast AI development does not imply concentrated AI development

Likewise, even if we grant that the pace of AI development will increase rapidly, it does not follow that this growth will be concentrated in a single (or a few) AI system(s), as opposed to being widely distributed, akin to an entire economy of machines that grow fast together. This issue of centralized versus distributed growth was in fact the main point of contention in the Hanson-Yudkowsky FOOM debate; and I agree with Hanson that distributed growth is considerably more likely.

Similar to the argument outlined in the previous section, one could argue that there is a wager to focus on scenarios that entail highly concentrated growth over those that involve highly distributed growth, even if the latter may be more likely. Perhaps the main argument in favor of this view is that it seems that our impact can be much greater if we manage to influence a single system that will eventually gain power compared to if our influence is dispersed across countless systems.

Yet I think there are good reasons to doubt that argument. One reason is that the strategy of influencing such a single AI system may require us to identify that system in advance, which might be a difficult bet that we could easily get wrong. In other words, our expected influence may be greatly reduced by the risk that we are wrong about which systems are most likely to gain power. Moreover, there might be similar and ultimately more promising levers for “concentrated influence” in scenarios that involve more distributed growth and power. Such levers may include formal institutions and societal values, both of which could exert a significant influence on the decisions of a large number of agents simultaneously — by affecting the norms, laws, and social equilibria under which they interact.

“A future dominated by AI” does not mean that either “technical AI safety” or “AI governance” is most promising

Another impression I have is that we sometimes tacitly assume that work on “avoiding bad AI outcomes” will fall into either the category of “technical AI safety” or that of “AI governance”, or at least that it will mostly fall within these categories. But I do not think that this is the case, partly for the reasons alluded to above.

In particular, it seems to me that we sometimes assume that the aim of influencing “AI outcomes” is necessarily best pursued in ways that pertain quite directly to AI today. Yet why should we assume this to be the case? After all, it seems that there are many plausible alternatives.

For example, one could think that it is generally better to pursue broad investments so as to build flexible resources that make us better able to tackle these problems down the line — e.g. investments toward general movement building and toward increasing the amount of money that we will be able to spend later, when we might be better informed and have better opportunities to pursue direct work.

A complementary option is to focus on the broader contextual factors hinted at in the previous section. That is, rather than focusing primarily on the design of the AI systems themselves, or on the laws that directly govern their development, one may focus on influencing the wider context in which they will be developed and deployed — e.g. general values, institutions, diplomatic relations, collective knowledge and wisdom, etc. After all, the broader context in which AI systems will be developed and put into action could well prove critical to the outcomes that future AI systems will eventually create.

Note that I am by no means saying that work on technical AI safety or AI governance is not worth pursuing. My point is merely that these other strategies focused on building flexible resources and influencing broader contextual factors should not be overlooked as ways to influence “a future dominated by AI”. Indeed, I believe that these strategies are among the most promising ways in which we can have such a beneficial influence at this point.

Concluding clarification

On a final note, I should clarify that the main conceptual points I have been trying to make in this post likely do not contradict the explicitly endorsed views of anyone who works to reduce risks from AI. The objects of my concern are more (what I perceive to be) certain implicit models and commonly employed terminologies that I worry may distort how we think and talk about these issues.

Specifically, it seems to me that there might be a sort of collective availability heuristic at work, through which we continually boost the salience of a particular AI narrative — or a certain class of AI scenarios — along with a certain terminology that has come to be associated with that narrative (e.g. ‘AI takeoff’, ‘transformative AI’, etc). Yet if we change our assumptions a bit, or replace the most salient narrative with another plausible one, we might find that this terminology does not necessarily make a lot of sense anymore. We might find that our typical ways of thinking about AI outcomes may be resting on a lot of implicit assumptions that are more questionable and more narrow than we tend to realize.

Some reasons not to expect a growth explosion

Many people expect global economic growth to accelerate in the future, with growth rates that are not just significantly higher than those of today, but orders of magnitude higher.

The following are some of the main reasons I do not consider a growth explosion to be the most likely future outcome.


Contents

  1. Most economists do not expect a growth explosion
  2. The history of economic growth does not support a growth explosion
  3. Rates of innovation and progress in science have slowed down
  4. Moore’s law is coming to an end
  5. The growth of supercomputers has been slowing down for years
  6. Many of our technologies cannot get orders of magnitude more efficient
  7. Three objections in brief

Most economists do not expect a growth explosion

Estimates of the future of economic growth from economists themselves generally predict a continual decline in growth rates. For instance, one “review of publicly available projections of GDP per capita over long time horizons” concluded that growth will most likely continue to decline in most countries in the coming decades. A report from PwC arrived at similar projections.

Some accessible books that explore economic growth in the past and explain why it is reasonable to expect stagnant growth rates in the future include Robert J. Gordon’s The Rise and Fall of American Growth (short version) and Tyler Cowen’s The Great Stagnation (synopsis).

It is true that there are some economists who expect growth rates to be several orders of magnitude higher in the future, but these are generally outliers. Robin Hanson suggests that such a growth explosion is likely in his book The Age of Em, which, to give some context, fellow economist Bryan Caplan calls “the single craziest claim” of the book. Caplan further writes that Hanson’s arguments for such growth expectations were “astoundingly weak”.

The point here is not that the general opinion of economists is by any means a decisive reason to reject a growth explosion (as the most likely outcome). The point is merely that it represents a significant reason to doubt an imminent growth explosion, and that it is not in fact those who doubt a rapid rise in growth rates who are the consensus-defying contrarians (and in terms of imminence, it is worth noting that even Robin Hanson does not expect a growth explosion within the next couple of decades).

Rates of innovation and progress in science have slowed down

See Bloom et al.’s Are Ideas Getting Harder to Find? and Cowen & Southwood’s Is the rate of scientific progress slowing down? A couple of graphs from the latter:

Moore’s law is coming to an end

One of the main reasons to expect a growth acceleration in the future is the promise of information technology. And economists, including Gordon and Cowen mentioned above, indeed agree that information technology has been a key driver of the growth we have seen in recent decades. But the problem is that we have strong theoretical reasons to expect that the underlying trend that has been driving most progress in information technology since the 1960s (i.e. Moore’s law) will come to an end within the next few years.

And while it may be that other hardware paradigms will replace silicon chips as we know them, and continue the by now familiar growth in information technology, we must admit that it is quite unclear whether this will happen, especially since we are already lagging noticeably behind this trend line.

One may object that this is just a matter of hardware, and that the real growth in information technology lies in software. But a problem with this claim is that, empirically, growth in software seems largely determined by growth in hardware.

The growth of supercomputers has been slowing down for years

Developments in the performance of the 500 fastest supercomputers in the world conform well to the pattern we should expect given that we are nearing the end of Moore’s law:


The 500th fastest supercomputer in the world was on a clear exponential trajectory from the early 1990s to 2010, after which growth in performance has been steadily declining. Roughly the same holds true of both the fastest supercomputer and the sum of the 500 fastest supercomputers: a clear exponential trajectory from the early 1990s to around 2013, after which performance has diverged ever further from the previous trajectory. Indeed, the divergence is so large that the combined performance of the 500 fastest supercomputers is now below the performance we should expect the single fastest supercomputer to have today based on an extrapolation of the 1993-2013 trend.

Many of our technologies cannot get orders of magnitude more efficient

This point is perhaps most elaborately explored in Robert J. Gordon’s book mentioned above: it seems that we have already reaped much of the low-hanging fruit in terms of technological innovation, and in some respects it is impossible to improve things much further.

Energy efficiency is an obvious example, as many of our machines and energy harvesting technologies have already reached a significant fraction of the maximally possible efficiency. For instance, electric pumps and motors tend to have around 90 percent energy efficiency, while the efficiency of the best solar panels is above 40 percent. Many of our technologies thus cannot be made orders of magnitude more efficient, and many of them can at most be marginally improved, simply because they have reached the ceiling of hard physical limits.
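The headroom implied by these figures can be sketched with a short calculation. The efficiencies are the approximate ones cited above; the function name is hypothetical, and the ceiling of 100 percent is a deliberately generous assumption, since any tighter physical limit would only shrink the headroom further.

```python
# Illustrative sketch: `max_improvement_factor` is a made-up helper.
# The default ceiling of 1.0 (100 percent) is a generous assumption.

def max_improvement_factor(efficiency, ceiling=1.0):
    """Upper bound on the factor by which an efficiency can still improve."""
    if not 0 < efficiency <= ceiling:
        raise ValueError("efficiency must lie in (0, ceiling]")
    return ceiling / efficiency

# Approximate efficiencies cited in the text above.
examples = {
    "electric pumps and motors": 0.90,
    "best solar panels": 0.40,
}

for name, efficiency in examples.items():
    factor = max_improvement_factor(efficiency)
    print(f"{name}: at most {factor:.2f}x more efficient")
```

Even with this generous ceiling, both factors fall far short of a single order of magnitude (10x).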

Three objections in brief

#1. What about the exponential growth in the compute of the largest AI training runs from 2012 to 2018?

This is indeed a data point in the other direction. Note, however, that this growth does not appear to have continued after 2018. Moreover, much of this growth seems to have been unsustainable. For example, DeepMind lost more than a billion dollars in 2016-2018, with the loss getting greater each year: “$154 million in 2016, $341 million in 2017, $572 million in 2018”. And the loss was apparently even greater in 2019.

#2. What about the Open Philanthropy post in which David Roodman presented a diffusion model of future growth that predicted much higher growth rates?

First, I think that model overlooks most of the points made above. Second, I think the following figure from Roodman’s article is a strong indication about the fit of the model, particularly how the growth rates in 1600-1970 are virtually all in the high percentiles of the model, while the growth rates in 1980-2019 are all in the low percentiles, and generally in a lower percentile as time progresses. That is a strong sign that the model does not capture our actual trajectory, and that the fit is getting worse as time progresses.

[Figure from Roodman’s article: BernouDiffPredGWP12KDecBlog.png]

#3. We have a wager to give much more weight to high-growth scenarios.

First, I think it is questionable that scenarios with higher growth rates merit greater priority (e.g. a so-called value lock-in could also emerge in slow-growth scenarios, and it may be more feasible to influence slow-growth scenarios because they give us more time to acquire the requisite insights and resources to exert a significant and robustly positive influence). And it is less clear still that scenarios with higher growth merit much greater priority than scenarios with lower growth rates. But even if we grant that high-growth scenarios do merit greater priority, this should not change the bare epistemic credence we assign different scenarios. Our descriptive picture should not be distorted by such priority claims.

Effective altruism and common sense

Thomas Sowell once called Milton Friedman “one of those rare thinkers who had both genius and common sense”.

I am not here interested in Sowell’s claim about Friedman, but rather in his insight into the tension between abstract smarts and common sense, and particularly how it applies to the effective altruism (EA) community. For it seems to me that there sometimes is an unbalanced ratio of clever abstractions to common sense in EA discussions.

To be clear, my point is not that abstract ideas are unimportant, or even that everyday common sense should generally be favored over abstract ideas. After all, many of the core ideas of effective altruism are highly abstract in nature, such as impartiality and the importance of numbers, and I believe we are right to stand by these ideas. But my point is that common sense is underutilized as a sanity check that can prevent our abstractions from floating into the clouds. More generally, I seem to observe a tendency to make certain assumptions, and to do a lot of clever analysis and deductions based on those assumptions, but without spending anywhere near as much energy exploring the plausibility of these assumptions themselves.

Below are three examples that I think follow this pattern.

Boltzmann brains

A highly abstract idea that is admittedly intriguing to ponder is that of a Boltzmann brain: a hypothetical conscious brain that arises as the product of random quantum fluctuations. Boltzmann brains are a trivial corollary given certain assumptions: let some basic combinatorial assumptions hold for a set amount of time, and we can conclude that a lot of Boltzmann brains must exist in this span of time (at least as a matter of statistical certainty, similar to how we can derive and be certain of the second law of thermodynamics).

But this does not mean that Boltzmann brains are in fact possible, as the underlying assumptions may well be false. Beyond the obvious possibility that the lifetime of the universe could be too short, it is also conceivable that the combinatorial assumptions that allow a functioning 310 K human brain to emerge in ~ 0 K empty space do not in fact obtain, e.g. because they falsely assume a combinatorial independence concerning the fluctuations that happen in each neighboring “bit” of the universe (or for some other reason). If any such key assumption is false, it could be that the emergence of a 310 K human brain in ~ 0 K space is not in fact allowed by the laws of physics, even in principle, meaning that even an infinite amount of time would never spontaneously produce a 310 K human Boltzmann brain.

Note that I am not claiming that Boltzmann brains cannot emerge in ~ 0 K space. My claim is simply that there is a big step from abstract assumptions to actual reality, and there is considerable uncertainty about whether the starting assumptions in question can indeed survive that step.

Quantum immortality

Another example is the notion of quantum immortality — not in the sense of merely surviving an attempted quantum suicide for improbably long, but in the sense of literal immortality because a tiny fraction of Everett branches continue to support a conscious survivor indefinitely.

This is a case where I think skeptical common sense and a search for erroneous assumptions are essential. Even granting a picture in which, say, a victim of a serious accident survives for a markedly longer time in one branch than in another, there are still strong reasons to doubt that there will be any branches in which the victim will survive for long. Specifically, we have good reason to believe that the measure of branches in which the victim survives will converge rapidly toward zero.

An objection might be that the measure indeed will converge toward zero, but that it never actually reaches zero, and hence there will in fact always be a tiny fraction of branches in which the victim survives. Yet I believe this rests on a false assumption. Our understanding of physics suggests that there is only — and could only be — a finite number of distinct branches, meaning that even if the measure of branches in which the victim survives is approximated well by a continuous function that never exactly reaches zero, the critical threshold that corresponds to a zero measure of actual branches with a surviving victim will in fact be reached, and probably rather quickly.
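The threshold argument above can be illustrated with a toy calculation. The sketch below is a minimal model rather than a physical one: it assumes, purely for illustration, that all branches carry equal weight, that a fixed fraction of the surviving branches remains after each time step, and that the total number of branches is some large finite number. The function name and the figures used are hypothetical.

```python
from fractions import Fraction


def steps_until_no_survivors(total_branches: int, survival_fraction: Fraction) -> int:
    """Count the steps until fewer than one whole branch contains a survivor.

    Assumptions (illustrative only): branches are equally weighted and
    finite in number, and the same fraction of surviving branches
    remains after every step.
    """
    if total_branches < 1:
        raise ValueError("total_branches must be at least 1")
    if not (0 < survival_fraction < 1):
        raise ValueError("survival_fraction must lie strictly between 0 and 1")

    # Exact rational arithmetic avoids floating-point rounding near the threshold.
    surviving = Fraction(total_branches)
    steps = 0
    while surviving >= 1:
        surviving *= survival_fraction
        steps += 1
    return steps


if __name__ == "__main__":
    # Hypothetical figures: 10**100 branches, half of the survivors lost per step.
    print(steps_until_no_survivors(10**100, Fraction(1, 2)))
```

Under these assumed numbers, the count of surviving branches drops below a single branch after 333 steps, even though the continuous approximation 10^100 · 2^(-t) never reaches zero for any finite t.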

Of course, one may argue that we should still assign some probability to quantum immortality being possible, and that this possibility is still highly relevant in expectation. But I think there are many risks that are much less Pascalian and far more worthy of our attention.

Intelligence explosion

Unlike the two previous examples, this last example has become quite an influential idea in EA: the notion of a fast and local “intelligence explosion”.

I will not here restate my lengthy critiques of the plausibility of this notion (or the critiques advanced by others). And to be clear, I do not think the effective altruism community is at all wrong to have a strong focus on AI. But the mistake I think I do see is that there are many abstractly grounded assumptions pertaining to a hypothetical intelligence explosion that have received an insufficient amount of scrutiny from common sense and empirical data (Garfinkel, 2018 argues along similar lines).

I think part of the problem stems from the fact that Nick Bostrom’s book Superintelligence framed the future of AI in a certain way. Here, for instance, is how Bostrom frames the issue in the conclusion of his book (p. 319):

Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. … We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound. … Some little idiot is bound to press the ignite button just to see what happens.

I realize Bostrom is employing a metaphor here, and I realize that he assigns a substantial credence to many different future scenarios. But the way his book is framed is nonetheless mostly in terms of such a metaphorical bomb that could ignite an intelligence explosion (i.e. FOOM). And it seems that this kind of scenario in effect became the standard scenario many people assumed and worked on, with comparatively little effort going into the more fundamental question of how plausible this future scenario is in the first place. An abstract argument about (a rather vague notion of) “intelligence” recursively improving itself was given much weight, and much clever analysis focusing on this FOOM picture and its canonical problems followed.

Again, my claim here is not that this picture is wrong or implausible, but rather that the more fundamental questions about the nature and future of “intelligence” should be kept more alive, and that our approach to these questions should be more informed by empirical data, lest we misprioritize our resources.


In sum, our fondness for abstractions is plausibly a bias we need to control for. We can do this by applying common-sense heuristics to a greater extent, by spending more time considering how our abstract models might be wrong, and by making a greater effort to hold our assumptions up against empirical reality.

Two biases relevant to expected AI scenarios

My aim in this essay is to briefly review two plausible biases in relation to our expectations of future AI scenarios. In particular, these are biases that I think risk increasing our estimates of the probability of a local, so-called FOOM takeoff.

An important point to clarify from the outset is that these biases, if indeed real, do not in themselves represent reasons to simply dismiss FOOM scenarios. It would clearly be a mistake to think so. But they do, I submit, constitute reasons to be somewhat more skeptical of them, and to re-examine our beliefs regarding FOOM scenarios. (Stronger, more direct reasons to doubt FOOM have been reviewed elsewhere.)

Egalitarian intuitions looking for upstarts

The first putative bias has its roots in our egalitarian origins. As Christopher Boehm argues in his Hierarchy in the Forest, we humans evolved in egalitarian tribes in which we created reverse dominance hierarchies to prevent domineering individuals from taking over. Boehm thus suggests that our minds are built to be acutely aware of the potential for any individual to rise and take over, perhaps even to the extent that we have specialized modules whose main task is to be attuned to this risk.

Western “Great Man” intuitions

The second putative bias is much more culturally contingent, and should be expected to be most pronounced in Western (“WEIRD”) minds. As Joe Henrich shows in his book The WEIRDest People in the World, Western minds are uniquely focused on individuals, so much so that their entire way of thinking about the world tends to revolve around individuals and individual properties (as opposed to thinking in terms of collectives and networks, which is more common among East Asian cultures).

The problem is that this Western, individualist mode of thinking, when applied straightforwardly to the dynamics of large-scale societies, is quite wrong. For while it may be mnemonically pragmatic to recount history, including the history of ideas and technology, in terms of individual actions and decisions, the truth is usually far more complex than this individualist narrative lets on. As Henrich argues, innovation is largely the product of large-scale systemic factors (such as the degree of connectedness between people), and these factors are usually far more important than is any individual, suggesting that Westerners tend to strongly overestimate the role that single individuals play in innovation and history more generally. Henrich thus alleges that the Western way of thinking about innovation reflects an “individualism bias” of sorts, and further notes that:

thinking about individuals and focusing on them as having dispositions and kind of always evaluating everybody [in terms of which] attributes they have … leads us to what’s called “the myth of the heroic inventor”, and that’s the idea that the great advances in technology and innovation are the products of individual minds that kind of just burst forth and give us these wonderful inventions. But if you look at the history of innovation, what you’ll find time after time was that there was lucky recombinations, people often invent stuff at the same time, and each individual only makes a small increment to a much larger, longer process.

In other words, innovation is the product of numerous small and piecemeal contributions to a much greater extent than Western “Great Man” storytelling suggests. (Of course, none of this is to say that individuals are unimportant, but merely that Westerners seem likely to vastly overestimate the influence that single individuals have on history and innovation.)

Upshot

If we have mental modules specialized to look for individuals that accumulate power and take control, and if we have expectations that roughly conform to this pattern in the context of future technology, with one individual entity innovating its way to a takeover, it seems that we should at least wonder whether this expectation may derive partly from our forager-age intuitions rather than resting purely on solid epistemics. This is especially so when this view of the future seems in strong tension with our actual understanding of innovation, namely that innovation, contra Western intuition, is distributed, with increases in abilities generally the product of countless “small” insights and tools rather than a few big ones.

Both of the tendencies listed above lead us (or in the second case, mostly Westerners) to focus on individual agents rather than larger, systemic issues that may be crucial to future outcomes, yet which are less intuitively appealing for us to focus on. And there may well be more general explanations for this lack of appeal than just the two reasons listed above. The fact that there were no large-scale systemic issues of any kind for almost all of our species’ history renders it unsurprising that we are not particularly prone to focus on such issues (except for local signaling purposes).

Perhaps we need to control for this, and try to look more toward systemic issues than we are intuitively inclined to do. After all, the claim that the future will be dominated by AI systems in some form need not imply that the best way to influence that future is to focus on individual AI systems, as opposed to broader, institutional issues.

When Machines Improve Machines

The following is an excerpt from my book Reflections on Intelligence (2016/2020).

The term “Artificial General Intelligence” (AGI) refers to a machine that can perform any task at least as well as any human. This is often considered the holy grail of artificial intelligence research, and also the thing that many consider likely to give rise to an “intelligence explosion”, the reason being that machines then will be able to take over the design of smarter machines, and hence their further development will no longer be held back by the slowness of humans. Luke Muehlhauser and Anna Salamon express the idea in the following way:

Once human programmers build an AI with a better-than-human capacity for AI design, the instrumental goal for self-improvement may motivate a positive feedback loop of self-enhancement. Now when the machine intelligence improves itself, it improves the intelligence that does the improving.

(Muehlhauser & Salamon, 2012, p. 13)

This seems like a radical shift, yet is it really? As author and software engineer Ramez Naam has pointed out (Naam, 2010), not quite, since we already use our latest technology to improve on itself and build the next generation of technology. As I argued in the previous chapter, the way new tools are built and improved is by means of an enormous conglomerate of tools, and newly developed tools merely become an addition to this existing set of tools. In Naam’s words:

[A] common assertion is that the advent of greater-than-human intelligence will herald The Singularity. These super intelligences will be able to advance science and technology faster than unaugmented humans can. They’ll be able to understand things that baseline humans can’t. And perhaps most importantly, they’ll be able to use their superior intellectual powers to improve on themselves, leading to an upward spiral of self improvement with faster and faster cycles each time.

In reality, we already have greater-than-human intelligences. They’re all around us. And indeed, they drive forward the frontiers of science and technology in ways that unaugmented individual humans can’t.

These superhuman intelligences are the distributed intelligences formed of humans, collaborating with one another, often via electronic means, and almost invariably with support from software systems and vast online repositories of knowledge.

(Naam, 2010)

The design and construction of new machines is not the product of human ingenuity alone, but of a large system of advanced tools in which human ingenuity is just one component, albeit a component that plays many roles. And these roles, it must be emphasized, go way beyond mere software engineering – they include everything from finding ways to drill and transport oil more effectively, to coordinating sales and business agreements across countless industries.

Moreover, as Naam hints, superhuman intellectual abilities already play a crucial role in this design process. For example, computer programs make illustrations and calculations that no human could possibly make, and these have become indispensable components in the design of new tools in virtually all technological domains. In this way, superhuman intellectual abilities are already a significant part of the process of building superhuman intellectual abilities. This has led to continued growth, yet hardly an intelligence explosion.

Naam gives a specific example of an existing self-improving “superintelligence” (a “super” goal achiever, that is), namely Intel:

Intel employs giant teams of humans and computers to design the next generation of its microprocessors. Faster chips mean that the computers it uses in the design become more powerful. More powerful computers mean that Intel can do more sophisticated simulations, that its CAD (computer aided design) software can take more of the burden off of the many hundreds of humans working on each chip design, and so on. There’s a direct feedback loop between Intel’s output and its own capabilities. …

Self-improving superintelligences have changed our lives tremendously, of course. But they don’t seem to have spiraled into a hard takeoff towards “singularity”. On a percentage basis, Google’s growth in revenue, in employees, and in servers have all slowed over time. It’s still a rapidly growing company, but that growth rate is slowly decelerating, not accelerating. The same is true of Intel and of the bulk of tech companies that have achieved a reasonable size. Larger typically means slower growing.

My point here is that neither superintelligence nor the ability to improve or augment oneself always lead to runaway growth. Positive feedback loops are a tremendously powerful force, but in nature (and here I’m liberally including corporate structures and the worldwide market economy in general as part of ‘nature’) negative feedback loops come into play as well, and tend to put brakes on growth.

(Naam, 2010)

I quote Naam at length here because he makes this important point well, and because he is an expert with experience in the pursuit of using technology to make better technology. In addition to Naam’s point about Intel and other companies that improve themselves, I would add that although these are enormous, competent collectives, they still only constitute a tiny part of the larger collective system that is the world economy, to which they contribute modestly and upon which they are entirely dependent.
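Naam’s contrast between positive and negative feedback loops can be made concrete with a toy simulation. The sketch below is only an illustration under assumed dynamics: it compares pure positive feedback (each step adds a fixed fraction of current size) with the same loop plus a simple logistic brake that strengthens as size approaches a ceiling. The function names, the `capacity` ceiling, and all numbers are hypothetical and are not drawn from Naam’s data.

```python
from typing import List, Optional


def simulate(initial: float, rate: float, steps: int,
             capacity: Optional[float] = None) -> List[float]:
    """Simulate growth in which output feeds back into capability.

    With capacity=None the loop is purely positive (exponential growth).
    With a capacity, a negative-feedback term (1 - size / capacity)
    damps growth as size increases (discrete logistic growth).
    """
    sizes = [initial]
    for _ in range(steps):
        size = sizes[-1]
        brake = 1.0 if capacity is None else 1.0 - size / capacity
        sizes.append(size + rate * size * brake)
    return sizes


def growth_rates(sizes: List[float]) -> List[float]:
    """Fractional growth per step, e.g. 0.5 means 50% growth."""
    return [later / earlier - 1.0 for earlier, later in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    unchecked = simulate(initial=1.0, rate=0.5, steps=20)
    braked = simulate(initial=1.0, rate=0.5, steps=20, capacity=1000.0)
    print("unchecked growth rates:", [round(r, 3) for r in growth_rates(unchecked)[:5]])
    print("braked growth rates:   ", [round(r, 3) for r in growth_rates(braked)[:5]])
```

In the braked run the entity keeps growing and keeps feeding its output back into itself, yet its percentage growth rate falls at every step, which is the pattern Naam describes for large companies: still growing, but decelerating.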

“The” AI?

The discussion above hints at a deeper problem in the scenario Muehlhauser and Salamon lay out, namely the idea that we will build an AI that will be a game-changer. This idea seems widespread in modern discussions about both risks and opportunities of AI. Yet why should this be the case? Why should the most powerful software competences we develop in the future be concentrated into anything remotely like a unitary system?

The human mind is unitary and trapped inside a single skull for evolutionary reasons. The only way additional cognitive competences could be added was by lumping them onto the existing core in gradual steps. But why should the extended “mind” of software that we build to expand our capabilities be bound in such a manner? In terms of the current and past trends of the development of this “mind”, it only seems to be developing in the opposite direction: toward diversity, not unity. The pattern of distributed specialization mentioned in the previous chapter is repeating itself in this area as well. What we see is many diverse systems used by many diverse systems in a complex interplay to create ever more, increasingly diverse systems. We do not appear to be headed toward any singular super-powerful system, but instead toward an increasingly powerful society of systems (Kelly, 2010).

Greater Than Individual or Collective Human Abilities?

This also hints at another way in which our speaking of “intelligent machines” is somewhat deceptive and arbitrary. For why talk about the point at which these machines become as capable as human individuals rather than, say, an entire human society? After all, it is not at the level of individuals that accomplishments such as machine building occur, but rather at the level of the entire economy. If we talked about the latter, it would be clear to us, I think, that the capabilities that are relevant for the accomplishment of any real-world goal are many and incredibly diverse, and that they are much more than just intellectual: they also require mechanical abilities and a vast array of materials.

If we talked about “the moment” when machines can do everything a society can, we would hardly be tempted to think of these machines as being singular in kind. Instead, we would probably think of them as a society of sorts, one that must evolve and adapt gradually. And I see no reason why we should not think about the emergence of “intelligent machines” with abilities that surpass human intellectual abilities in the same way.

After all, this is exactly what we see today: we gradually build new machines – both software and hardware – that can do things better than human individuals, but these are different machines that do different things better than humans. Again, there is no trend toward the building of disproportionally powerful, unitary machines. Yes, we do see some algorithms that are impressively general in nature, but their generality and capabilities still pale in comparison to the generality and the capabilities of our larger collective of ever more diverse tools (as is also true of individual humans).

Relatedly, the idea of a “moment” or “event” at which machines surpass human abilities is deeply problematic in the first place. It ignores the many-faceted nature of the capabilities to be surpassed, both in the case of human individuals and human societies, and, by extension, the gradual nature of the surpassing of these abilities. Machines have been better than humans at many tasks for centuries, yet we continue to speak as though there will be something like a “from-nothing-to-everything” moment – e.g. “once human programmers build an AI with a better-than-human capacity for AI design”. Again, this is not congruous with the way in which we actually develop software: we already have software that is superhuman in many regards, and this software already plays a large role in the collective system that builds smarter machines.

A Familiar Dynamic

It has always been the latest, most advanced tools that, in combination with the already existing set of tools, have collaborated to build the latest, most advanced tools. The expected “machines building machines” revolution is therefore not as revolutionary as it seems at first sight. The “once machines can program AI better than humans” argument seems to assume that human software engineers are the sole bottleneck of progress in the building of more competent machines, yet this is not the case. But even if it were, and if we suddenly had a thousand times as many people working to create better software, other bottlenecks would quickly emerge – materials, hardware production, energy, etc. All of these things, indeed the whole host of tasks that maintain and grow our economy, are crucial for the building of more capable machines. Essentially, we are returned to the task of advancing our entire economy, something that pretty much all humans and machines are participating in already, knowingly or not, willingly or not.
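The bottleneck point can be sketched with a deliberately crude “weakest link” model. This is an assumption made for illustration, not a claim about how the economy actually works: it treats the rate of progress as capped by whichever required input is scarcest, so that scaling up one input merely shifts the constraint to the next one. The input names, the numbers, and both function names are hypothetical.

```python
from typing import Mapping


def progress_rate(inputs: Mapping[str, float]) -> float:
    """Toy model: progress is limited by the scarcest required input."""
    return min(inputs.values())


def bottleneck(inputs: Mapping[str, float]) -> str:
    """Name of the input that currently limits progress."""
    return min(inputs, key=lambda name: inputs[name])


if __name__ == "__main__":
    # Hypothetical relative capacities; software engineering is the initial limit.
    baseline = {
        "software_engineering": 1.0,
        "hardware_production": 1.5,
        "energy": 2.0,
        "materials": 3.0,
    }
    # A thousandfold increase in software engineering capacity.
    boosted = dict(baseline, software_engineering=1000.0)

    print(bottleneck(baseline), progress_rate(baseline))
    print(bottleneck(boosted), progress_rate(boosted))
```

In this toy setup, a thousandfold increase in software engineering capacity raises the progress rate only from 1.0 to 1.5, at which point hardware production becomes the limiting factor. Real inputs are partly substitutable, so the effect would be less stark in practice, but the qualitative point about shifting bottlenecks is the same.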

By themselves, the latest, most advanced tools do not do much. A CAD program alone is not going to build much, and the same holds true of the entire software industry. In spite of all its impressive feats, it is still just another cog in a much grander machinery.

Indeed, to say that software alone can lead to an “intelligence explosion” – i.e. a capability explosion – is akin to saying that a neuron can hold a conversation. Such statements express a fundamental misunderstanding of the level at which these accomplishments are made. The software industry, like any software program in particular, relies on the larger economy in order to produce progress of any kind, and the only way it can do so is by becoming part of – i.e. working with and contributing to – this grander system that is the entire economy. Again, individual goal-achieving ability is a function of the abilities of the collective. And it is here, in the entire economy, that the greatest goal-achieving ability is found, or rather distributed.

The question concerning whether “intelligence” can explode is therefore essentially: can the economy explode? To which we can answer that rapid increases in the growth rate of the world economy certainly have occurred in the past, and some argue that this is likely to happen again in the future (Hanson 1998/2000, 2016). However, there are reasons to be skeptical of such a future growth explosion (Murphy, 2011; Modis, 2012; Gordon, 2016; Caplan, 2016; Vinding, 2017b; Cowen & Southwood, 2019).

“Intelligence Though!” – A Bad Argument

A type of argument often made in discussions about the future of AI is that we can just never know what a “superintelligent machine” could do. “It” might be able to do virtually anything we can think of, and much more than that, given “its” vastly greater “intelligence”.

The problem with this argument is that it again rests on a vague notion of “intelligence” that this machine “has a lot of”. For what exactly is this “stuff” it has a lot of? Goal-achieving ability? If so, then, as we saw in the previous chapter, “intelligence” requires an enormous array of tools and tricks that entails much more than mere software. It cannot be condensed into anything we can identify as a single machine.

Claims of the sort that a “superintelligent machine” could just do this or that complex task are extremely vague, since the nature of this “superintelligent machine” is not accounted for, and neither are the plausible means by which “it” will accomplish the extraordinarily difficult – perhaps even impossible – task in question. Yet such claims are generally taken quite seriously nonetheless, the reason being that the vague notion of “intelligence” that they rest upon is taken seriously in the first place. This, I have tried to argue, is the cardinal mistake.

We cannot let a term like “superintelligence” provide a carte blanche to make extraordinary claims or assumptions without a bare minimum of justification. I think Bostrom’s book Superintelligence is an example of this. Bostrom worries about a rapid “intelligence explosion” initiated by “an AI” throughout the book, yet offers very little in terms of arguments for why we should believe that such a rapid explosion is plausible (Hanson, 2014), not to mention what exactly it is that is supposed to explode (Hanson, 2010; 2011a).

No Singular Thing, No Grand Control Problem

The problem is that we talk about “intelligence” as though it were a singular thing; or, in the words of brain and AI researcher Jeff Hawkins, as though it were “some sort of magic sauce” (Hawkins, 2015). This is also what gives rise to the idea that “intelligence” can explode, because one of the things that this “intelligence” can do, if you have enough of it, is to produce more “intelligence”, which can in turn produce even more “intelligence”.

This stands in stark contrast to the view that “intelligence” – whether we talk about cognitive abilities in particular or goal-achieving abilities in general – is anything but singular in nature, and is instead the product of countless clever tricks and hacks built by a long process of testing and learning. On this latter view, there is no single master problem to crack for increasing “intelligence”, but rather just many new tricks and hacks we can discover. And finding these is essentially what we have always been doing in science and engineering.

Robin Hanson makes a similar point in relation to his skepticism of a “blank-slate AI mind-design” intelligence explosion:

Sure if there were a super mind theory that allowed vast mental efficiency gains all at once, but there isn’t. Minds are vast complex structures full of parts that depend intricately on each other, much like the citizens of a city. Minds, like cities, best improve gradually, because you just never know enough to manage a vast redesign of something with such complex inter-dependent adaptations.

(Hanson, 2010)

Rather than a concentrated center of capability that faces a grand control problem, what we see is a development of tools and abilities that are distributed throughout the larger economy. And we “control” – i.e. specify the function of – these tools, including software programs, gradually as we make them and put them to use in practice. The design of the larger system is thus the result of our solutions to many, comparatively small “control problems”. I see no compelling reason to believe that the design of the future will be any different.


See also Chimps, Humans, and AI: A Deceptive Analogy.

Consciousness – Orthogonal or Crucial?

The following is an excerpt from my book Reflections on Intelligence (2016/2020).

A question often considered open, sometimes even irrelevant, when it comes to “AGIs” and “superintelligences” is whether such entities would be conscious. Here is Nick Bostrom expressing such a sentiment:

By a “superintelligence” we mean an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills. This definition leaves open how the superintelligence is implemented: it could be a digital computer, an ensemble of networked computers, cultured cortical tissue or what have you. It also leaves open whether the superintelligence is conscious and has subjective experiences.

(Bostrom, 2012, “Definition of ‘superintelligence’”)

This is false, however. On no meaningful definition of “more capable than the best human brains in practically every field, including scientific creativity, general wisdom, and social skills” can the question of consciousness be considered irrelevant. This is like defining a “superintelligence” as an entity “smarter” than any human, and then claiming that this definition leaves open whether such an entity can read natural language or perform mathematical calculations. Consciousness is integral to virtually everything we do and excel at, and thus if an entity is not conscious, it cannot possibly outperform the best humans “in practically every field”, least of all in “scientific creativity, general wisdom, and social skills”. Let us look at these three in turn.

Social Skills

Good social skills depend on an ability to understand others. And in order to understand other people, we have to simulate what it is like to be them. Fortunately, this comes quite naturally to most of us. We know what it is like to consciously experience emotions such as sadness, fear, and joy directly, and this enables us to understand where people are coming from when they report and act on these emotions.

Consider the following example: without knowing anything about a stranger you observe on the street, you can roughly know how that person would feel and react if they suddenly, by the snap of a finger, had no clothes on right there on the street. Embarrassment, distress, wanting to cover up and get away from the situation are almost certain to be the reaction of any randomly selected person. We know this, not because we have read about it, but because of our immediate simulations of the minds of others – one of the main things our big brains evolved to do. This is what enables us to understand the minds of other people, and hence without running this conscious simulation of the minds of others, one will have no chance of gaining good social skills and interpersonal understanding.

But couldn’t a computer just simulate people’s brains and then understand them without being conscious? Is the consciousness bit really relevant here?

Yes, consciousness is relevant. At the very least, it is relevant for us. Consider, for instance, the job of a therapist, or indeed the “job” of any person who attempts to listen to another person in a deep conversation. When we tell someone about our own state or situation, it matters deeply to us that the listener actually understands what we are saying. A listener who merely pretends to feel and understand would be no good. Indeed, this would be worse than no good, as such a “listener” would then essentially be lying and deceiving in a most insensitive way, in every sense of the word.

Frustrated Human: “Do you actually know the feeling I’m talking about here? Do you even know the difference between joy and hopeless despair?”

Unconscious liar: “Yes.”

Whether someone is actually feeling us when we tell them something matters to us, especially when it comes to our willingness to share our perspectives, and hence it matters for “social skills”. An unconscious entity cannot have better social skills than “the best human brains” because it would lack the very essence of social skills: truly feeling and understanding others. Without a conscious mind there is no way to understand what it is like to have such a mind.

General Wisdom

Given how relevant social skills are for general wisdom, and given the relevance of consciousness for social skills, the claim that consciousness is irrelevant to general wisdom should already stand in serious doubt at this point.

Yet rather than restricting our focus to “general wisdom”, let us consider ethics in its entirety, which, broadly construed at least, includes any relevant sense of “general wisdom”. For in order to reason about ethics, one must be able to consider and evaluate questions like the following:

Can certain forms of suffering be outweighed by a certain amount of happiness?

Does the nature of the experience of suffering in some sense demand that reducing suffering is given greater moral priority than increasing happiness (for the already happy)?

Can realist normative claims be made on the basis of the properties of such experiences?

One has to be conscious to answer such questions. That is, one must know what such experiences are like in order to understand their experiential properties and significance. Knowing what terms like “suffering” and “happiness” refer to – i.e. knowing what the actual experiences of suffering and happiness are like – is as crucial to ethics as numbers are to mathematics.

The same point holds true about other areas of philosophy that bear on wisdom, such as the philosophy of mind: without knowing what it is like to have a conscious mind, one cannot contribute to the discussion about what it is like to have one and what the nature of consciousness is. Indeed, an unconscious entity has no idea about what the issue is even about in the first place.

So both in ethics and in the philosophy of mind, an unconscious entity would be less than clueless about the deep questions at hand. If an entity not only fails to surpass humans in this area, but fails to even have the slightest clue about what we are talking about, it hardly surpasses the best human brains in practically every field. After all, these questions are also relevant to many other fields, ranging from questions in psychology to questions concerning the core foundations of knowledge.

Experiencing and reasoning about consciousness is a most essential part of “human abilities”, and hence an entity that cannot do this cannot be claimed to surpass humans in the most important, much less all, human abilities.

Scientific Creativity

The third and final ability mentioned above that an unconscious entity can supposedly surpass humans in is scientific creativity. Yet scientific creativity must relate to all fields of knowledge, including the science of the conscious mind itself. This is also a part of the natural world, and a most relevant one at that.

Experiencing and accurately reporting what a given state of consciousness is like is essential for the science of mind, yet an unconscious entity obviously cannot do such a thing, as there is no experience it can report from. It cannot display any scientific creativity, or even produce mere observations, in this most important science. Again, the most it can do is produce lies – the very anti-matter of science.

 

Chimps, Humans, and AI: A Deceptive Analogy

The prospect of smarter-than-human artificial intelligence (AI) is often presented and thought of in terms of a simple analogy: AI will stand in relation to us the way we stand in relation to chimps. In other words, AI will be qualitatively more competent and powerful than us, and its actions will be as inscrutable to humans as current human endeavors (e.g. science and politics) are to chimps.

My aim in this essay is to show that this is in many ways a false analogy. The difference in understanding and technological competence found between modern humans and chimps is, in an important sense, a zero-to-one difference that cannot be repeated.


Contents

  1. How are humans different from chimps?
    1. I. Symbolic language
    2. II. Cumulative technological innovation
  2. The range of human abilities is surprisingly wide
  3. The cultural basis of the human capability expansion
  4. Why this is relevant

How are humans different from chimps?

A common answer to this question is that humans are smarter. Specifically, at the level of our individual cognitive abilities, humans, with our roughly three times larger brains, are just far more capable.

This claim no doubt contains a large grain of truth, as humans surely do beat chimps in a wide range of cognitive tasks. Yet it is also false in some respects. For example, chimps have superior working memory compared to humans, and apparently also beat humans in certain video games, including games involving navigation in complex mazes.

But researchers who study human uniqueness actually provide some rather different, more specific answers to this question. If we focus on individual mental differences in particular, researchers have found that, crudely speaking, humans are different from chimps in three principal ways: 1) we can learn language, 2) we have a strong orientation toward social learning, and 3) we are highly cooperative (among our ingroup, compared to chimps).

These differences have in turn resulted in two qualitative differences in the abilities of humans and chimps in today’s world.

I. Symbolic language

The first is that we humans have acquired an ability to think and communicate in terms of symbolic language that represents elaborate concepts. We can learn about the deep history of life and the universe, as well as the likely future of the universe, including the fundamental limits to future space travel and future computations. Any educated human can learn a good deal about these things whereas no chimp can.

Note how this is truly a zero-to-one difference: no symbolic language versus an elaborate symbolic language through which knowledge can be represented and continually developed (see chapter 1 in Deacon, 1997). It is the difference between having no science of physics versus having an elaborate such science with which we can predict future events and put hard limits on future possibilities.

This zero-to-one difference cannot really be repeated. Given that we already have physical models that predict, say, the future motion of planets and the solar system to a fairly high degree of accuracy, the best one can do in this respect is to (slightly) improve the accuracy of these predictions. Such further improvements cannot be compared to going from zero physics to current physics.

The same point applies to our scientific understanding more generally: we currently have theories that work decently well at explaining most of the phenomena around us. And though one can significantly improve the accuracy and sophistication of many of these theories, any such further improvement would be much less significant than the qualitative leap from absolutely no conceptual models to the entire collection of models and theories we currently have.

For example, going from no understanding of evolution by natural selection to the elaborate understanding of biology we have today cannot be matched, in terms of qualitative and revolutionary leaps, by further refinements in biology. We have already mapped out the core basics of biology (in fact a great deal more than that), and this can only be done once.

This is not an original point. Robin Hanson has made essentially the same point in response to the notion that future machines will be “as incomprehensible to us as we are to goldfish”:

This seems to me to ignore our rich multi-dimensional understanding of intelligence elaborated in our sciences of mind (computer science, AI, cognitive science, neuroscience, animal behavior, etc.).

… the ability of one mind to understand the general nature of another mind would seem mainly to depend on whether that first mind can understand abstractly at all, and on the depth and richness of its knowledge about minds in general. Goldfish do not understand us mainly because they seem incapable of any abstract comprehension. …

It seems to me that human cognition is general enough, and our sciences of mind mature enough, that we can understand much about quite a diverse zoo of possible minds, many of them much more capable than ourselves on many dimensions.

Ramez Naam has argued similarly in relation to the idea that there will be some future time or intelligence that we are fundamentally unable to understand. He argues that our understanding of the future is growing rather than shrinking as time progresses, and that AI and other future technologies will not be beyond comprehension:

All of those [future technologies] are still governed by the laws of physics. We can describe and model them through the tools of economics, game theory, evolutionary theory, and information theory. It may be that at some point humans or our descendants will have transformed the entire solar system into a living information processing entity — a Matrioshka Brain. We may have even done the same with the other hundred billion stars in our galaxy, or perhaps even spread to other galaxies.

Surely that is a scale beyond our ability to understand? Not particularly. I can use math to describe to you the limits on such an object, how much computing it would be able to do for the lifetime of the star it surrounded. I can describe the limit on the computing done by networks of multiple Matrioshka Brains by coming back to physics, and pointing out that there is a guaranteed latency in communication between stars, determined by the speed of light. I can turn to game theory and evolutionary theory to tell you that there will most likely be competition between different information patterns within such a computing entity, as its resources (however vast) are finite, and I can describe to you some of the dynamics of that competition and the existence of evolution, co-evolution, parasites, symbiotes, and other patterns we know exist.

Chimps cannot understand human politics and science to a similar extent. Thus, the truth is that there is a strong disanalogy between the understanding chimps have of humans versus the understanding that we humans — thanks to our conceptual tools — can have of any possible future intelligence (in physical and computational terms, say).

Note that the qualitative leap reviewed above was not one that happened shortly after human ancestors diverged from chimp ancestors. Instead, it was a much more recent leap that has been unfolding gradually since the first humans appeared, and which has continued to accelerate in recent centuries, as we have developed ever more advanced science and mathematics. In other words, this qualitative step has been a product of cultural evolution just as much as biological evolution. Early humans presumably had a roughly similar potential to learn modern language, science, mathematics, etc. But such conceptual tools could not be acquired in the absence of a surrounding culture able to teach these innovations.

Ramez Naam has made a similar point:

If there was ever a singularity in human history, it occurred when humans evolved complex symbolic reasoning, which enabled language and eventually mathematics and science. Homo sapiens before this point would have been totally incapable of understanding our lives today. We have a far greater ability to understand what might happen at some point 10 million years in the future than they would to understand what would happen a few tens of thousands of years in the future.

II. Cumulative technological innovation

The second zero-to-one difference between humans and chimps is that we humans build things. Not just that we build things, but that we refine our technology over time. After all, many non-human animals use tools in the form of sticks and stones, and some even shape primitive tools of their own. But only humans improve and build upon the technological inventions of their ancestors.

Consequently, humans are unique in expanding their abilities by systematically exploiting their environment, molding the things around them into ever more useful self-extensions. We have turned wildlands into crop fields; we have created technologies that can harvest energy — from oil, gas, wind, and sun — and we have built external memories far more reliable than our own, such as books and hard disks.

This is another qualitative leap that cannot be repeated: the step from having absolutely no cumulative technology to exploiting and optimizing our external environment toward our own ends. The step from having no external memory to having the current repository of stored human knowledge at our fingertips, and from harvesting absolutely no energy (other than through individual digestion) to collectively harvesting and using hundreds of quintillions of Joules every year.

To be sure, it is possible to improve on and expand these innovations. We can harvest greater amounts of energy, for example, and create even larger external memories. Yet these are merely quantitative differences, and humanity indeed continually makes such improvements each year. They are not zero-to-one differences that only a new species could bring about. And what is more, we know that the potential for making further technological improvements is, at least in many respects, quite limited.

Take energy efficiency as an example. Many of our machines and energy harvesting technologies have already reached a significant fraction of the maximally possible efficiency. For example, electric motors and pumps tend to have around 90 percent energy efficiency, and the best solar panels have an efficiency greater than 40 percent. So as a matter of hard physical limits, many of our technologies cannot be made orders of magnitude more efficient; in fact, a large number of them can at most be marginally improved.

In sum, we are unique in being the first species that systematically sculpted our surrounding environment and turned it into ever-improving tools, many of which have near-maximal efficiency. This step cannot be repeated, only expanded further.


Just like the qualitative leap in our symbolic reasoning skills, the qualitative leap in our ability to create technology and shape our environment emerged, not between chimps and early humans, but between early humans and today’s humans, as the result of a cultural process occurring over thousands of years. In fact, the two leaps have been closely related: our ability to reason and communicate symbolically has enabled us to create cumulative technological innovation. Conversely, our technologies have allowed us to refine our knowledge and conceptual tools, by enabling us to explore and experiment, which in turn made us able to build even better technologies with which we could advance our knowledge even further, and so on.

This, in a nutshell, is the story of the growth of human knowledge and technology, a story of recursive self-improvement (see Simler, 2019, “On scientific networks”). It is not really a story about the individual human brain per se. After all, the human brain does not accomplish much in isolation (nor is it the brain with the largest number of neurons; several species have more neurons in the forebrain). It is more a story about what happened between and around brains: in the exchange of information in networks of brains and in the external creations designed by them. A story made possible by the fact that the human brain is unique in being by far the most cultural brain of all, with its singular capacity to learn from and cooperate with others.

The range of human abilities is surprisingly wide

Another way in which an analogy to chimps is frequently drawn is by imagining an intelligence scale along which different species are ranked, such that, for example, we have “rats at 30, chimps at 60, the village idiot at 90, the average human at 98, and Einstein at 100”, and where future AI may in turn be ranked many hundreds of points higher than Einstein. According to this picture, it is not just that humans will stand in relation to AI the way chimps stand in relation to humans, but that AI will be far superior still. The human-chimp analogy is, on this view, a severe understatement of the difference between humans and future AI.

Such an intelligence scale may seem intuitively compelling, but how does it correspond to reality? One way to probe this question is to examine the range of human abilities in chess. The standard way to rank chess skills is with the Elo rating system, which is a good predictor of the outcomes of chess games between different players, whether human, digital, or otherwise.

A human beginner will have a rating around 300, a novice around 800, and a rating in the range 2000-2199 is ranked as “Expert”. The highest rating ever achieved by a human is 2882, by Magnus Carlsen.

How large is this range of chess skills in an absolute sense? Remarkably large, it turns out. For example, it took more than four decades from when computers were first able to beat a human chess novice (the 1950s), until a computer was able to beat the best human player (1997, officially). In other words, the span from novice to Kasparov corresponded to more than four decades of progress in hardware — i.e. a million times more computing power — and software. This alone suggests that the human range of chess skills is rather wide.

Yet the range seems even broader when we consider the upper bounds of chess performance. After all, the fact that it took computers decades to go from human novice to world champion does not mean that the best human is not still ridiculously far from the best a computer could be in theory. Surprisingly, however, this latter distance does in fact seem quite small. Estimates suggest that the best possible chess machine would have an Elo rating around 3600, which means that the relative distance between the best possible computer and the best human is only around 700 Elo points (the Elo rating is essentially a measure of relative distance; 700 Elo points corresponds to a winning percentage of around 1.5 percent for the losing player).

This implies that the distance between the best human (Carlsen) and a chess “Expert” (someone belonging to the top 5 percent of chess players) is similar to the distance between the best human and the best possible chess brain, while the distance between a human beginner and the best human is far greater (roughly 2,500 Elo points). This stands in stark contrast to the intelligence scale outlined above, which would predict the complete opposite: the distance from a human novice to the best human should be comparatively small whereas the distance from the best human to the optimal brain should be the larger one by far.
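The distances discussed above can be checked with the standard Elo expected-score formula. What follows is only a minimal sketch: the function name `expected_score` is my own, the ratings are the ones quoted in the text (with 2000 assumed as the lower edge of the “Expert” range), and the formula yields an expected score, which counts a draw as half a point, rather than a pure win rate.

```python
def expected_score(rating_gap: float) -> float:
    """Expected score of the lower-rated player under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (rating_gap / 400.0))


# Ratings quoted in the text; 2000 is assumed as the lower edge of "Expert".
BEGINNER, EXPERT, BEST_HUMAN, BEST_POSSIBLE = 300, 2000, 2882, 3600

gaps = {
    "beginner -> best human": BEST_HUMAN - BEGINNER,
    "expert -> best human": BEST_HUMAN - EXPERT,
    "best human -> best possible": BEST_POSSIBLE - BEST_HUMAN,
}

for label, gap in gaps.items():
    # Print each gap with the weaker side's expected score as a percentage.
    print(f"{label}: {gap} points, expected score {expected_score(gap):.4%}")
```

On these figures, the gap from the best human to the estimated ceiling is 718 points, giving the weaker side an expected score of roughly 1.6 percent, while the gap from beginner to best human is 2582 points.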


It may be objected that chess is a bad example, and that it does not really reflect what is meant by the intelligence scale above. But the question is then what would be a better measure. After all, a similar story seems to apply to other games, such as shogi and go: the human range of abilities is surprisingly wide and the best players are significantly closer to optimal than they are to novice players.

In fact, one can argue that the objection should go in the opposite direction, as human brains are not built for chess, and hence we should expect even the best humans to be far from optimal at it. We should expect to be much closer to “optimal” at solving problems that are more important for our survival, such as social cognition and natural language processing — skills that most people are wired to master at super-Carlsen levels.

Regardless, the truth is that humans are mastering ever more “games”, literal as well as figurative ones, at optimal or near-optimal levels. Not because evolution “just so happened to stumble upon the most efficient way to assemble matter into an intelligent system”, but rather because it created a species able to make cultural and technological progress toward ever greater levels of competence.

The cultural basis of the human capability expansion

The intelligence scale outlined above misses two key points. First, human abilities are not a constant. Whether we speak of individual abilities (e.g. the abilities of elite chess players) or humanity’s collective abilities (e.g. building laptops and sending people to the moon), it is clear that our abilities have increased dramatically as our culture and technology have expanded.

Second, because human abilities are not a constant, the range of human abilities is far wider, in an absolute sense, than the intelligence scale outlined above suggests, as it has grown and still continues to grow over time.

Chess is a good example of this. Untrained humans and chimps have the same (non-)skill level at chess. Yet thanks to culture, some people can learn to master the game. A wealthy society can allow people to specialize in chess, and makes it possible for knowledge to accumulate in books and experts. Eventually, it enables learning from super-human chess engines, whose innovations we can adopt just as we do those of other humans.

And yet we humans expand our abilities to a much greater extent than the example of increased human chess abilities suggests, as we not only expand our abilities by stimulating our brains with progressively better forms of practice and information, but also by extending ourselves directly with technology. For example, we can all use a chess engine to find great chess moves for us. Our latest technologies enable us to accomplish ever-more tasks that no human could ever accomplish unaided.

Worth noting in this regard is that this self-extension process seems to have slowed down in recent decades, likely because we have reaped most low-hanging fruits already, and in some respects because it is impossible to improve things much further (we already mentioned energy efficiency as an example where we are getting close to the upper limits in many respects).

This suggests that not only is there not a qualitative leap similar to that between chimps and modern humans ahead of us, but that even a quantitative growth explosion, with relative growth rates significantly higher than what we have seen in the past, should not be our default expectation either (for some support for this claim, see “Peak growth might lie in the past” in Vinding, 2017).

Why this is relevant

The errors of the human-chimp analogy are worth highlighting for a few reasons. First, the analogy can lead us to overestimate how much everything will change with AI. It leads us to expect qualitative leaps of sorts that cannot be repeated.

Second, the human-chimp analogy makes us underestimate how much we currently know and are able to understand. To think that intelligent systems of the future will be as incomprehensible to us today as human affairs are to chimps is to underestimate how extensive and universal our current knowledge of the world in fact is — not just when it comes to physical and computational limits, but also in relation to general economic and game-theoretic principles. We know a good deal about economic growth, for example, and this knowledge has a lot to say about how we should expect future intelligent systems to grow. In particular, it suggests that local AI-FOOM growth is unlikely.

The analogy can thus have an insidious influence by making us feel like current data and trends cannot be trusted much, because look how different humans are from chimps, and look how puny the human brain is compared to ultimate limits. I think this is exactly the wrong way to think about the future. We should base our expectations on a deep study of past trends, including the actual evolution of human competences — not simple analogies.

Relatedly, the human-chimp analogy is also relevant in that it can lead us to grossly overestimate the probability of an AI-FOOM scenario. That is, if we get the story about the evolution of human competences so wrong that we think the differences we observe today between chimps and modern humans reduce mostly to a story about changes in individual brains, then we are likely to have similarly inaccurate expectations about what comparable innovations in some individual machine are able to effect on their own.

If the human-chimp analogy leads us to (marginally) overestimate the probability of a FOOM scenario, it may nudge us toward focusing too much on some single, concentrated future thing that we expect to be all-important: the AI that suddenly becomes qualitatively more competent than humans. In effect, the human-chimp analogy can lead us to neglect broader factors, such as cultural and institutional developments.

Note that the above is by no means a case for complacency about risks from AI. It is important that we get a clear picture of such risks, and that we allocate our resources accordingly. But this requires us to rely on accurate models of the world. If we overemphasize one set of risks, we are by necessity underemphasizing others.

The future of growth: Near-zero growth rates

First written: Jul. 2017; Last update: Nov. 2022.

Exponential growth is a common pattern found throughout nature. Yet it is also a pattern that tends not to last, as growth rates tend to decline sooner or later.

In biology, this pattern of exponential growth that eventually tapers off is found in everything from the development of individual bodies (for instance, in the growth of humans, which levels off in the late teenage years) to population sizes.

One may of course be skeptical that this general trend will also apply to the growth of our technology and economy at large, as innovation seems to continually postpone our clash with the ceiling, yet it seems inescapable that it must. For in light of what we know about physics, we can conclude that exponential growth of the kinds we see today, in technology in particular and in our economy more generally, must come to an end, and do so relatively soon.

Limits to growth

Physical limits to computation and Moore’s law

One reason we can make this assertion is that there are theoretical limits to computation. As physicist Seth Lloyd’s calculations show, a continuation of Moore’s law — in its most general formulation: “the amount of information that computers are capable of processing and the rate at which they process it doubles every two years” — would imply that we hit the theoretical limits of computation within 250 years:

If, as seems highly unlikely, it is possible to extrapolate the exponential progress of Moore’s law into the future, then it will only take two hundred and fifty years to make up the forty orders of magnitude in performance between current computers that perform 10^10 operations per second on 10^10 bits and our one kilogram ultimate laptop that performs 10^51 operations per second on 10^31 bits.

Similarly, physicists Lawrence Krauss and Glenn Starkman have calculated that, even if we factor in colonization of space at the speed of light, this doubling of processing power cannot continue for more than 600 years in any civilization:

Our estimate for the total information processing capability of any system in our Universe implies an ultimate limit on the processing capability of any system in the future, independent of its physical manifestation and implies that Moore’s Law cannot continue unabated for more than 600 years for any technological civilization.

In a more recent lecture and a subsequent interview, Krauss said that the absolute limit for the continuation of Moore’s law, in our case, would be reached in less than 400 years (the discrepancy between the numbers 400 and 600 arises at least in part because Moore’s law, in its most general formulation, has already played out for more than a century in our civilization). And, as both Krauss and Lloyd have stressed, these are ultimate theoretical limits, resting on assumptions that are unlikely to be met in practice, such as expansion at the speed of light. Given both engineering and economic constraints, the time for which Moore’s law can actually continue is likely significantly shorter. Indeed, we are already approaching the physical limits of the paradigm that Moore’s law has been riding on for more than 50 years (silicon transistors, the only paradigm that Gordon Moore was talking about originally), and it is not clear whether other paradigms will be able to take over and keep the trend going.
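The arithmetic behind Lloyd’s figure can be sketched as follows. This is an illustration only: the function name is my own, and the two doubling times are assumptions (two years matches the formulation of Moore’s law quoted above; 18 months is another commonly cited figure).

```python
import math


def years_to_gain(orders_of_magnitude: float, doubling_time_years: float) -> float:
    """Years needed to grow by 10**orders_of_magnitude at a fixed doubling time."""
    doublings = orders_of_magnitude * math.log2(10)
    return doublings * doubling_time_years


# 40 orders of magnitude is the gap named in Lloyd's quote.
# The doubling times below are illustrative assumptions.
for doubling_time in (1.5, 2.0):
    print(f"doubling every {doubling_time} years: "
          f"{years_to_gain(40, doubling_time):.0f} years")
```

Forty orders of magnitude amount to about 133 doublings, which takes roughly 200 years at an 18-month doubling time and roughly 266 years at a two-year doubling time, so Lloyd’s figure of two hundred and fifty years sits in the expected range.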

Limits to the growth of energy use

Physicist Tom Murphy has calculated a similar limit for the growth of the energy consumption of our civilization. Based on the observation that the energy consumption of the United States has increased fairly consistently with an average annual growth rate of 2.9 percent over the last 350-odd years (although the growth rate appears to have slowed down in recent times and has been consistently below 2.9 percent since c. 1980), Murphy proceeds to derive the limits for the continuation of similar energy growth. He does this, however, by assuming an annual growth rate of “only” 2.3 percent, which conveniently results in an increase of the total energy consumption by a factor of ten every 100 years. If we assume that we will continue expanding our energy use at this rate by covering Earth with solar panels, this would, on Murphy’s calculations, imply that we will have to cover all of Earth’s land with solar panels in less than 350 years, and all of Earth, including the oceans, in 400 years.

Beyond that, assuming that we could capture all of the energy from the sun by surrounding it in solar panels, the 2.3 percent growth rate would come to an end within 1,350 years from now. And if we go further out still, to capture the energy emitted from all the stars in our galaxy, this growth rate must hit the ceiling and become near-zero within 2,500 years (of course, the limit of the physically possible must be hit earlier, indeed more than 500 years earlier, as we cannot traverse our 100,000-light-year-wide Milky Way in only 2,500 years).
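These timescales can be roughly reproduced with a few lines of arithmetic. The sketch below is not Murphy’s own calculation: the present-day power use, the solar output, and the number of sun-like stars are round figures that I am assuming for illustration, and the function name is hypothetical.

```python
import math

GROWTH_RATE = 0.023  # annual growth rate used in Murphy's calculation


def years_to_reach(target_watts: float, current_watts: float,
                   rate: float = GROWTH_RATE) -> float:
    """Years of compound growth needed to go from current_watts to target_watts."""
    return math.log(target_watts / current_watts) / math.log(1.0 + rate)


# Growth factor over one century at 2.3 percent per year (close to ten).
century_factor = (1.0 + GROWTH_RATE) ** 100

# Assumed round inputs, not taken from the text:
CURRENT_POWER_W = 1.2e13   # roughly 12 TW of present-day global power use
SOLAR_OUTPUT_W = 3.8e26    # approximate total power output of the sun
STARS_IN_GALAXY = 1e11     # about a hundred billion sun-like stars

years_sun = years_to_reach(SOLAR_OUTPUT_W, CURRENT_POWER_W)
years_galaxy = years_to_reach(SOLAR_OUTPUT_W * STARS_IN_GALAXY, CURRENT_POWER_W)

print(f"growth per century: x{century_factor:.2f}")
print(f"all of the sun's output: ~{years_sun:.0f} years")
print(f"all stars in the galaxy: ~{years_galaxy:.0f} years")
```

Under these assumptions the results land close to the figures cited above: a factor of just under ten per century, a little under 1,400 years to match the sun’s entire output, and a little under 2,500 years to match the output of the whole galaxy.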

One may suggest that alternative sources of energy might change this analysis significantly, yet, as Murphy notes, this does not seem to be the case:

Some readers may be bothered by the foregoing focus on solar/stellar energy. If we’re dreaming big, let’s forget the wimpy solar energy constraints and adopt fusion. The abundance of deuterium in ordinary water would allow us to have a seemingly inexhaustible source of energy right here on Earth. We won’t go into a detailed analysis of this path, because we don’t have to. The merciless growth illustrated above means that in 1400 years from now, any source of energy we harness would have to outshine the sun.

Essentially, keeping up the annual growth rate of 2.3 percent by harnessing energy from matter not found in stars would force us to make such matter hotter than stars themselves. We would have to create new stars of sorts, and, even if we assume that the energy required to create such stars is less than the energy gained, such an endeavor would quickly run into limits as well. For according to one estimate, the total mass of the Milky Way, including dark matter, is only 20 times greater than the mass of its stars. Assuming a 5:1 ratio of dark matter to ordinary matter, this implies that there is only about 3.3 times as much ordinary non-stellar matter as there is stellar matter in our galaxy. Thus, even if we could convert all this matter into stars without spending any energy and harvest the resulting energy, this would only give us about 50 years more of keeping up with the annual growth rate of 2.3 percent.1
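The “about 50 years” figure follows from the factor of roughly 3.3 in a couple of steps. This is a minimal sketch using only the ratios stated above; the variable names are my own, and the factor is simply treated as a multiplier on the energy available.

```python
import math

GROWTH_RATE = 0.023        # annual growth rate assumed throughout
TOTAL_TO_STELLAR = 20.0    # total galactic mass relative to stellar mass
DARK_TO_ORDINARY = 5.0     # assumed ratio of dark matter to ordinary matter

# Ordinary matter makes up one part in (5 + 1) of the total mass.
matter_factor = TOTAL_TO_STELLAR / (DARK_TO_ORDINARY + 1.0)

# Years of 2.3 percent growth that a one-off multiplication
# of the energy supply by this factor would buy.
extra_years = math.log(matter_factor) / math.log(1.0 + GROWTH_RATE)

print(f"factor: {matter_factor:.2f}, extra years: {extra_years:.0f}")
```

A factor of about 3.3 corresponds to roughly 53 additional years of growth at 2.3 percent, in line with the estimate of about 50 years.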

Limits derived from economic considerations

Similar conclusions to the ones drawn above for computation and energy also seem to follow from calculations of a more economic nature. For, as economist Robin Hanson has argued, projecting present economic growth rates into the future also leads to a clash against fundamental limits:

Today we have about ten billion people with an average income about twenty times subsistence level, and the world economy doubles roughly every fifteen years. If that growth rate continued for ten thousand years[,] the total growth factor would be 10^200.

There are roughly 10^57 atoms in our solar system, and about 10^70 atoms in our galaxy, which holds most of the mass within a million light years. So even if we had access to all the matter within a million light years, to grow by a factor of 10^200 each atom would on average have to support an economy equivalent to 10^140 people at today’s standard of living, or one person with a standard of living 10^140 times higher, or some mix of these.

Indeed, current growth rates would “only” have to continue for three thousand years before each atom in our galaxy would have to support an economy equivalent to a single person living at today’s living standard, which already seems rather implausible (not least because we can only access a tiny fraction of “all the matter within a million light years” in three thousand years). Hanson does not, however, expect the current growth rate to remain constant, but instead, based on the history of growth rates, expects a new growth mode where the world economy doubles within 15 days rather than 15 years:

If a new growth transition were to be similar to the last few, in terms of the number of doublings and the increase in the growth rate, then the remarkable consistency in the previous transitions allows a remarkably precise prediction. A new growth mode should arise sometime within about the next seven industry mode doublings (i.e., the next seventy years) and give a new wealth doubling time of between seven and sixteen days.

And given this more than a hundred times greater growth rate, the net growth that would take 10,000 years to accomplish given our current growth rate (cf. Hanson’s calculation above) would now take less than a century to reach, while growth otherwise requiring 3,000 years would require less than 30 years. So if Hanson is right, and we will see such a shift within the next seventy years, what seems to follow is that we will reach the limits of economic growth, or at least reach near-zero growth rates, within a century or two. Such a projection is also consistent with the physically derived limits of the continuation of Moore’s law; not that economic growth and Moore’s law are remotely the same, yet they are no doubt closely connected: economic growth is largely powered by technological progress, of which Moore’s law has been a considerable subset in recent times.
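To make the time compression concrete, the following sketch rescales the 10,000-year and 3,000-year horizons by the ratio of the old doubling time to the new one. It assumes, purely for illustration, that the new growth mode starts immediately and is sustained throughout. The function names are hypothetical.

```python
DAYS_PER_YEAR = 365.25
CURRENT_DOUBLING_YEARS = 15    # present doubling time of the world economy, per the text


def speedup(new_doubling_days):
    """How many times faster the economy doubles in the new growth mode."""
    return CURRENT_DOUBLING_YEARS * DAYS_PER_YEAR / new_doubling_days


def compressed_years(years_at_current_rate, new_doubling_days):
    """Years the new mode needs for growth that takes `years_at_current_rate` today."""
    return years_at_current_rate / speedup(new_doubling_days)


# Seven and sixteen days are the two ends of Hanson's predicted range.
for days in (7, 16):
    print(
        f"doubling every {days} days: speedup x{speedup(days):.0f}, "
        f"10,000 years -> {compressed_years(10_000, days):.1f} years, "
        f"3,000 years -> {compressed_years(3_000, days):.1f} years"
    )
```

Even at the slow end of the predicted range, the speedup is well over a hundredfold, so the "less than a century" and "less than 30 years" bounds in the text hold with a wide margin.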

The conclusion we reach by projecting past growth trends in computing power, energy, and the economy is the same: our current growth rates cannot go on forever. In fact, they will have to decline to near-zero levels very soon on a cosmic timescale. Given the physical limits to computation, and hence, ultimately, to economic growth, we can conclude that we must be close to the point where peak relative growth in our economy and our ability to process information occurs — that is, the point where this growth rate is the highest in the entire history of our civilization, past and future.

Peak growth might lie in the past

This is not, however, to say that this point of maximum relative growth necessarily lies in the future. Indeed, in light of the declining economic growth rates we have seen over the last few decades, it cannot be ruled out that we are now already past the point of “peak economic growth” in the history of our civilization, with the highest growth rates having occurred around 1960-1980, cf. these declining growth rates and this essay by physicist Theodore Modis. This is not to say that we most likely are, yet it seems that the probability that we are is non-trivial.

A relevant data point here is that the global economy has seen three doublings since 1965, when the annual growth rate was around six percent. Yet the annual growth rate today, at around 3 percent, is only a little over half of what it was those three doublings ago, and it lies stably below that earlier level. In the entire history of economic growth, this seems unprecedented, suggesting that we may already be on the other side of the highest growth rates we will ever see. For up until this point, three doublings of the economy have, rare fluctuations aside, led to an increase in the annual growth rate.

And this “past peak growth” hypothesis looks even stronger if we look at 1955, with a growth rate of a little less than six percent and a world product at 5,430 billion 1990 U.S. dollars, which doubled four times gives just under 87,000 billion, about where we should expect today’s world product to be. Yet throughout the history of our economic development, four doublings have meant a clear increase in the annual growth rate, at least in terms of the underlying trend; not a stable decrease of almost 50 percent. This tentatively suggests that we should not expect to see growth rates significantly higher than those of today sustained in the future.
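The four-doublings figure is easy to verify. The snippet below simply doubles the 1955 world product quoted above four times. The function name is hypothetical.

```python
WORLD_PRODUCT_1955 = 5_430   # billions of 1990 U.S. dollars, figure quoted in the text


def after_doublings(start, doublings):
    """Value of `start` after it has doubled `doublings` times."""
    return start * 2 ** doublings


projected = after_doublings(WORLD_PRODUCT_1955, 4)
print(projected)  # 86880, i.e. just under 87,000 billion
```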

Could we be past peak growth in science and technology?

That peak growth lies in the past may also be true of technological progress in particular, or at least many forms of technological progress, including the progress in computing power tracked by Moore’s law, where the growth rate appears to have been highest around 1990-2005, and to have been in decline since, cf. this article and the first graphs found here and here. Similarly, various sources of data and proxies tracking the number of scientific articles published and references cited over time also suggest that we could be past peak growth in science as well, at least in many fields when evaluated based on such metrics, with peak growth seeming to have been reached around 2000-2010.

Yet again, these numbers (those tracking economic, technological, and scientific progress) are of course closely connected, as growth in each of these respects contributes to, and is even part of, growth in the others. Indeed, one study found the doubling time of the total number of scientific articles in recent decades to be 15 years, corresponding to an annual growth rate of 4.7 percent, strikingly similar to the growth rate of the global economy in recent decades. Thus, declining growth rates in our economy, technology, and science cannot be considered wholly independent sources of evidence that growth rates are now declining for good. We can by no means rule out that growth rates might increase in all these areas in the future, although, as we saw above with respect to the limits of Moore’s law and economic progress, such an increase, if it is going to happen, must be imminent if current growth rates remain relatively stable.
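The conversion between a doubling time and an annual growth rate, used several times in this essay, can be sketched as follows. It assumes steady compound growth. The function names are hypothetical.

```python
import math


def annual_rate_from_doubling_time(years):
    """Annual growth rate implied by doubling once every `years` years."""
    return 2 ** (1 / years) - 1


def doubling_time_from_annual_rate(rate):
    """Years needed to double at a steady annual growth rate `rate`."""
    return math.log(2) / math.log(1 + rate)


print(f"15-year doubling time -> {annual_rate_from_doubling_time(15):.1%} per year")
print(f"6 percent per year    -> doubling every {doubling_time_from_annual_rate(0.06):.1f} years")
print(f"3 percent per year    -> doubling every {doubling_time_from_annual_rate(0.03):.1f} years")
```

Under this conversion, halving the growth rate from six to three percent roughly doubles the time each further doubling takes.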

Might recent trends make us bias-prone?

How might it be relevant that we may be past peak economic growth at this point? Could it mean that our expectations for the future are likely to be biased? Looking back toward the 1960s might be instructive in this regard. For when we look at our economic history up until the 1960s, it is not so strange that people made many unrealistic predictions about the future around this period. Not only might it have appeared natural to project the high growth rate at the time to remain constant into the future, which would have led to today’s global GDP being more than twice what it is; it might also have seemed reasonable to predict the growth rates to keep on rising even further. After all, that was what they had been doing consistently up until that point, so why should it not continue in the following decades, resulting in flying cars and conversing robots by the year 2000? Such expectations were not that unreasonable given the preceding economic trends.

The question is whether we might be similarly overoptimistic about future economic progress today given recent, possibly unique, growth trends, specifically the unprecedented increase in absolute annual growth that we have seen over the past two decades. The same may apply to the trends in scientific and technological progress cited above, where peak growth in many areas appears to have happened in the period 1990-2010, meaning that we could now be at a point where we are disposed to being overoptimistic about further progress.

Yet, again, it is highly uncertain at this point whether growth rates, of the economy in general and of progress in technology and science in particular, will increase again in the future. Future economic growth may not conform well to the model with roughly symmetric growth rates around the 1960s, although the model certainly deserves some weight. All we can say for sure is that growth rates must become near-zero relatively soon. What the path toward that point will look like remains an open question. We could well be in the midst of a temporary decline in growth rates that will be followed by growth rates significantly greater than those of the 1960s, cf. the new growth mode envisioned by Robin Hanson.2

Implications: This is an extremely special time

Applying the mediocrity principle, we should not expect to live in a highly unusual time. Yet, in light of the facts about the ultimate limits to growth seen above, it is clear that we do: we are living during the childhood of civilization where there is still rapid growth, at the pace of doublings within a couple of decades. If civilization persists with similar growth rates, it will soon become a grown-up with near-zero relative growth. And it will then look back at our time (today plus or minus a couple of centuries, most likely) as the one where growth rates were by far the highest in its entire history, which may be more than a trillion years.

It seems that a few things follow from this. First, more than just being the time where growth rates are the highest, this may also, for that very reason, be the time where individuals can influence the future of civilization more than any other time. In other words, this may be the time where the outcome of the future is most sensitive to small changes, as it seems plausible, although far from clear, that small changes in the trajectory of civilization are most significant when growth rates are highest. An apt analogy might be a psychedelic balloon with fluctuating patterns on its surface, where the fluctuations that happen to occur when we blow up the balloon will then also be blown up and leave their mark in a way that fluctuations occurring before and after this critical growth period will not (just like quantum fluctuations in the early universe got blown up during cosmic expansion, and thereby in large part determined the grosser structure of the universe today). Similarly, it seems much more difficult to cause changes across all of civilization when it spans countless star systems compared to today.

That being said, it is not obvious that small changes — in our actions, say — are more significant in this period where growth rates are many orders of magnitude higher than in any other time. It could also be that such changes are more consequential when the absolute growth is the highest. Or perhaps when it is smallest, at least as we go backwards in time, as there were far fewer people back when growth rates were orders of magnitude lower than today, and hence any given individual comprised a much greater fraction of all individuals than an individual does today.

Still, we may well find ourselves in a period where we are uniquely positioned to make irreversible changes that will echo down throughout the entire future of civilization.3 To the extent that we are, this should arguably lead us to update toward trying to influence the far future rather than the near future. More than that, if it does hold true that the time where the greatest growth rates occur is indeed the time where small changes are most consequential, this suggests that we should increase our credence in the simulation hypothesis. For if realistic sentient simulations of the past become feasible at some point, the period where the future trajectory of civilization seems the most up for grabs would seem an especially relevant one to simulate and learn more about. However, one can also argue that the sheer historical uniqueness of our current growth rates alone, regardless of whether this is a time where the fate of our civilization is especially volatile, should lead us to increase this credence, as such uniqueness may make it a more interesting time to simulate, and because being in a special time in general should lead us to increase our credence in the simulation hypothesis (see for instance this talk for a case for why being in a special time makes the simulation hypothesis more likely).4

On the other hand, one could also argue that imminent near-zero growth rates, along with the weak indications that we may now be past peak growth in many respects, provide a reason to lower our credence in the simulation hypothesis, as these observations suggest that the ceiling for what will be feasible in the future may be lower than we naively expect in light of today’s high growth rates. And thus, one could argue, it should make us more skeptical of the central premise of the simulation hypothesis: that there will be (many) ancestor simulations in the future. To me, the consideration in favor of increased credence seems stronger, although it does not significantly move my overall credence in the hypothesis, as there are countless other factors to consider.5


Appendix: Questioning our assumptions

Caspar Oesterheld pointed out to me that it might be worth meditating on how confident we can be in these conclusions given that apparently solid predictions concerning the ultimate limits to growth have been made before, yet quite a few of these turned out to be wrong. Should we not be open to the possibility that the same might be true of (at least some of) the limits we reviewed in the beginning of this essay?

Could our understanding of physics be wrong?

One crucial difference to note is that these failed predictions were based on a set of assumptions — e.g. about the amount of natural resources and food that would be available — that seem far more questionable than the assumptions that go into the physics-based predictions we have reviewed here: that our apparently well-established physical laws and measurements indeed are valid, or at least roughly so. The epistemic status of this assumption seems a lot more solid, to put it mildly. So there does seem to be a crucial difference here. This is not to say, however, that we should not maintain some degree of doubt as to whether this assumption is correct (I would argue that we always should). It just seems that this degree of doubt should be quite low.

Yet, to continue the analogy above, what went wrong with the aforementioned predictions was not so much that limits did not exist, but rather that humans found ways of circumventing them through innovation. Could the same perhaps be the case here? Could we perhaps some day find ways of deriving energy from dark energy or some other yet unknown source, even though physicists seem skeptical? Or could we, as Ray Kurzweil speculates, access more matter and energy by finding ways of travelling faster than light, or by finding ways of accessing other parts of our notional multiverse? Might we even become able to create entirely new ones? Or to eventually rewrite the laws of nature as we please? (Perhaps by manipulating our notional simulators?) Again, I do not think any of these possibilities can be ruled out completely. Indeed, some physicists argue that the creation of new pocket universes might be possible, not in spite of “known” physical principles (or rather theories that most physicists seem to believe, such as inflationary theory), but as a consequence of them. However, it is not clear that anything from our world would be able to expand into, or derive anything from, the newly created worlds on any of these models (which of course does not mean that we should not worry about the emergence of such worlds, or the fate of other “worlds” that we perhaps could access).

All in all, the speculative possibilities raised above seem unlikely, yet they cannot be ruled out for sure. The limits we have reviewed here thus represent a best estimate given our current, admittedly incomplete, understanding of the universe in which we find ourselves, not an absolute guarantee. However, it should be noted that this uncertainty cuts both ways, in that the estimates we have reviewed could also overestimate the limits to various forms of growth by countless orders of magnitude.

Might our economic reasoning be wrong?

Less speculatively, I think, one can also question the validity of our considerations about the limits of economic progress. I argued that it seems implausible that we in three thousand years could have an economy so big that each atom in our galaxy would have to support an economy equivalent to a single person living at today’s living standard. Yet could one not argue that the size of the economy need not depend on matter in this direct way, and that it might instead depend on the possible representations that can be instantiated in matter? If economic value could be mediated by the possible permutations of matter, our argument about a single atom’s need to support entire economies might not have the force it appears to have. For instance, there are far more legal positions on a Go board than there are atoms in the visible universe, and that’s just legal positions on a Go board. Perhaps we need to be more careful when thinking about how atoms might be able to create and represent economic value?
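The scale of the Go comparison can be illustrated with exact integer arithmetic. The sketch below computes only a simple upper bound, 3^361, obtained by letting each of the 361 points be empty, black, or white; the number of legal positions is smaller (the published count is about 2.08 × 10^170, roughly one percent of the bound). The figure of 10^80 atoms is a commonly cited rough estimate and is an assumption here, not a number from this essay.

```python
BOARD_POINTS = 19 * 19               # intersections on a standard Go board
ATOMS_VISIBLE_UNIVERSE = 10 ** 80    # commonly cited rough estimate; an assumption here

# Each point is empty, black, or white, so 3^361 bounds the number of positions from above.
upper_bound = 3 ** BOARD_POINTS

print(f"digits in 3^361: {len(str(upper_bound))}")
print(f"exceeds 10^80 atoms: {upper_bound > ATOMS_VISIBLE_UNIVERSE}")
```

Even after discounting the illegal configurations, the count of board positions exceeds the assumed number of atoms by about ninety orders of magnitude.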

It seems like there is a decent point here. Still, I think economic growth at current rates is doomed. First, it seems reasonable to be highly skeptical of the notion that mere potential states could have any real economic value. Today at least, what we value and pay for is not such “permutation potential”, but the actual state of things, which is as true of the digital realm as of the physical. We buy and stream digital files such as songs and movies because of the actual states of these files, while their potential states mean nothing to us. And even when we invest in something we think has great potential, like a start-up, the value we expect to be realized is still ultimately one that derives from its actual state, namely the actual state we hope it will assume, not its number of theoretically possible permutations.

It is not clear why this would change, or how it could. After all, the number of ways one can put all the atoms in the galaxy together is the same today as it will be ten thousand years from now. Organizing all these atoms into a single galactic supercomputer would only seem to increase the value of their actual state.

Second, economic growth still seems tightly constrained by the shackles of physical limitations. For it seems inescapable that economies, of any kind, are ultimately dependent on the transfer of resources, whether these take the form of information or concrete atoms. And such transfers require access to energy, the growth of which we know to be constrained, as is true of the growth of our ability to process information. As these underlying resources that constitute the lifeblood of any economy stop growing, it seems unlikely that the economy can avoid this fate as well. (Tom Murphy touches on similar questions in his analysis of the limits to economic growth.)

Again, we of course cannot exclude that something crucial might be missing from these considerations. Yet the conclusion that economic growth rates will decline to near-zero levels relatively soon, on a cosmic timescale at least, still seems a safe bet in my view.

Acknowledgments

I would like to thank Brian Tomasik, Caspar Oesterheld, Duncan Wilson, Kaj Sotala, Lukas Gloor, Magnus Dam, Max Daniel, and Tobias Baumann for valuable comments and inputs. This essay was originally published at the website of the Foundational Research Institute, now the Center on Long-Term Risk. 


Notes

1. One may wonder whether there might not be more efficient ways to derive energy from the non-stellar matter in our galaxy than to convert it into stars as we know them. I don’t know, yet a friend of mine who does research in plasma physics and fusion says that he does not think one could, especially if we, as we have done here, disregard the energy required to clump the dispersed matter together so as to “build” the star, a process that may well take more energy than the star can eventually deliver.

The aforementioned paper by Lawrence Krauss and Glenn Starkman also contains much information about the limits of energy use, and in fact uses accessible energy as the limiting factor that bounds the amount of information processing any (local) civilization could do (they assume that the energy that is harvested is beamed back to a “central observer”).

2. It should be noted, though, that Hanson by no means rules out that such a growth mode may never occur, and that we might already be past, or in the midst of, peak economic growth: “[…] it is certainly possible that the economy is approaching fundamental limits to economic growth rates or levels, so that no faster modes are possible […]”

3. The degree to which there is sensitivity to changes of course varies between different endeavors. For instance, natural science seems more convergent than moral philosophy, and thus its development is arguably less sensitive to the particular ideas of individuals working on it than the development of moral philosophy is.

4. One may then argue that this should lead us to update toward focusing more on the near future. This may be true. Yet should we update more toward focusing on the far future given our ostensibly unique position to influence it? Or should we update more toward focusing on the near future given increased credence in the simulation hypothesis? (Provided that we indeed do increase this credence, cf. the counter-consideration above.) In short, it mostly depends on the specific probabilities we assign to these possibilities. I myself happen to think the far future should dominate, as I assign the simulation hypothesis (as commonly conceived) a very small probability.

5. For instance, there are fundamental epistemological issues concerning how much one can infer about a simulating world based on impressions from a simulated one (which may only be your single mind); e.g. do notions such as “time” and “memory” correspond to anything, or even make sense, in such a “world”? There is also the fact that the past cannot be simulated realistically, since we can only have incomplete information about a given physical state in the past. This is not only because we have no way to uncover all the relevant information, but also because we could not possibly represent it all, even if we somehow could access it: for instance, we cannot faithfully represent the state of every atom in our solar system at any point in the past, as this would require too much information. A simulation of the past that contains incomplete information would depart radically from how the actual past unfolded, as all of this information has a non-negligible causal impact (even single photons, which, it appears, are detectable by the human eye), and this is especially true given that the vast majority of information would have to be excluded (due to practical constraints on both what can be recovered and what can be represented). A further issue is whether conscious minds can exist on different levels of abstraction; etc.

Is AI Alignment Possible?

The problem of AI alignment is usually defined roughly as the problem of making powerful artificial intelligence do what we humans want it to do. My aim in this essay is to argue that this problem is less well-defined than many people seem to think, and to argue that it is indeed impossible to “solve” with any precision, not merely in practice but in principle.

There are two basic problems for AI alignment as commonly conceived. The first is that human values are non-unique. Indeed, in many respects, there is more disagreement about values than people tend to realize. The second problem is that even if we were to zoom in on the preferences of a single human, there is, I will argue, no way to instantiate a person’s preferences in a machine so as to make it act as this person would have preferred.

Problem I: Human Values Are Non-Unique

The common conception of the AI alignment problem is something like the following: we have a set of human preferences, X, which we must, somehow (and this is usually considered the really hard part), map onto some machine’s goal function, Y, via a map f, let’s say, such that X and Y are in some sense isomorphic. At least, this is a way of thinking about it that roughly tracks what people are trying to do.

Speaking in these terms, much attention is being devoted to Y and f compared to X. My argument in this essay is that we are deeply confused about the nature of X, and hence confused about AI alignment.

The first point of confusion is about the values of humanity as a whole. It is usually acknowledged that human values are fuzzy, and that there are some disagreements over values among humans. Yet it is rarely acknowledged just how strong this disagreement in fact is.

For example, concerning the ideal size of the future population of sentient beings, the disagreement is near-total, as some (e.g. some defenders of the so-called Asymmetry in population ethics, as well as anti-natalists such as David Benatar) argue that the future population should ideally be zero, while others, including many classical utilitarians, argue that the future population should ideally be very large. Many similar examples could be given of strong disagreements concerning the most fundamental and consequential of ethical issues, including whether any positive good can ever outweigh extreme suffering. And on many of these crucial disagreements, a very large number of people will be found on both sides.

Different answers to ethical questions of this sort do not merely give rise to small practical disagreements. In many cases, they imply completely opposite practical implications. This is not a matter of human values being fuzzy, but a matter of them being sharply, irreconcilably inconsistent. And hence there is no way to map the totality of human preferences, “X”, onto a single, well-defined goal-function in a way that does not conflict strongly with the values of a significant fraction of humanity. This is a trivial point, and yet most talk of human-aligned AI seems to skirt this fact.

Problem II: Present Human Preferences Are Underdetermined Relative to Future Actions

The second problem and point of confusion with respect to the nature of human preferences is that, even if we focus only on the present preferences of a single human, then these in fact do not, and indeed could not, determine with much precision what kind of world this person would prefer to bring about in the future.

One way to see this point is to think in terms of the information required to represent the world around us. A perfectly precise such representation would require an enormous amount of information, indeed far more information than what can be contained in our brain. This holds true even if we only consider morally relevant entities around us — on the planet, say. There are just too many of them for us to have a precise representation of them. By extension, there are also too many of them for us to be able to have precise preferences about their individual states. Given that we have very limited information at our disposal, all we can do is express extremely coarse-grained and compressed preferences about what state the world around us should ideally have. In other words, any given human’s preferences are bound to be extremely vague about the exact ideal state of the world right now, and there will be countless moral dilemmas occurring across the world right now to which our preferences, in their present state, do not specify a unique solution.

And yet this is just considering the present state of the world. When we consider future states, the problem of specifying ideal states and resolutions to hitherto unknown moral dilemmas only explodes in complexity, and indeed explodes exponentially as time progresses. It is simply a fact, and indeed quite an obvious one at that, that no single brain could possibly contain enough information to specify unique, or indeed just qualified, solutions to all moral dilemmas that will arrive in the future. So what, then, could AI alignment relative to even a single brain possibly mean? How can we specify Y with respect to these future dilemmas when X itself does not specify solutions?

We can, of course, try to guess what a given human, or we ourselves, might say if confronted with a particular future moral dilemma and given knowledge about it, yet the problem is that our extrapolated guess is bound to be just that: a highly imperfect guess. For even a tiny bit of extra knowledge or experience can readily change a person’s view of a given moral dilemma to be the opposite of what it was prior to acquiring that knowledge (for instance, I myself switched from being a classical to a negative utilitarian based on a modest amount of information in the form of arguments I had not considered before). This high sensitivity to small changes in our brain implies that even a system with near-perfect information about some person’s present brain state would be forced to make a highly uncertain guess about what that person would actually prefer in a given moral dilemma. And the further ahead in time we go, and thus further away from our familiar circumstance and context, the greater the uncertainty will be.

By analogy, consider the task of AI alignment with respect to our ancestors ten million years ago. What would their preferences have been with respect to, say, the future of space colonization? One may object that this is underdetermined because our ancestors could not conceive of this possibility, yet the same applies to us and things we cannot presently conceive of, such as alien states of consciousness. Our current preferences say about as little about the (dis)value of such states as the preferences of our ancestors ten million years ago said about space colonization.

A more tangible analogy might be to consider the level of confidence with which we, based on knowledge of your current brain state, can determine your dinner preferences twenty years from now with respect to dishes made from ingredients not yet invented, a preference that will likely be influenced by contingent, environmental factors found between now and then. Not with great confidence, it seems safe to say. And this point pertains not only to dinner preferences but also to the most consequential of choices. Our present preferences cannot realistically determine, with any considerable precision, what we would deem ideal in as yet unknown, realistic future scenarios. Thus, by extension, there can be no such thing as value extrapolation or preservation in anything but the vaguest sense. No human mind has ever contained, or indeed ever could contain, a set of preferences that evaluatively orders more than the tiniest sliver of (highly compressed versions of) real-world states and choices an agent in our world is likely to face in the future. To think otherwise amounts to a strange Platonization of human preferences. We just do not have enough information in our heads to possess such fine-grained values.

The truth is that our preferences are not some fixed entity that determines future actions uniquely; they simply could not be that. Rather, our preferences are themselves interactive and adjustive in nature, changing in response to new experiences and new information we encounter. Thus, to say that we can “idealize” our present preferences so as to obtain answers to all realistic future moral dilemmas is rather like calling the evolution of our ancestors’ DNA toward human DNA a “DNA idealization”. In both cases, we find no hidden Deep Essences waiting to be purified; no information that points uniquely toward one particular solution in the face of all realistic future “problems”. All we find are physical systems that evolve contingently based on the inputs they receive.*

The bottom line of all this is not that it makes no sense to devote resources toward ensuring the safety of future machines. We can still meaningfully and cooperatively seek to instill rules and mechanisms in our machines and institutions that seem optimal in expectation given our respective, coarse-grained values. The conclusion here is just that 1) the rules instantiated cannot be the result of a universally shared human will or anything close; the closest thing possible would be rules that embody some compromise between people with strongly disagreeing values. And 2) such an instantiation of coarse-grained rules in fact comprises the upper bound of what we can expect to accomplish in this regard. Indeed, this is all we can expect with respect to future influence in general: rough and imprecise influence and guidance with the limited information we can possess and transmit. The idea of a future machine that will do exactly what we would want, and whose design therefore constitutes a lever for precise future control, is a pipe dream.


* Note that this account of our preferences is not inconsistent with value or moral realism. By analogy, consider human preferences and truth-seeking: humans are able to discover many truths about the universe, yet most of these truths are not hidden in, nor extrapolated from, our DNA or our preferences. Indeed, in many cases, we only discover these truths by actively transcending rather than “extrapolating” our immediate preferences (for comfortable and intuitive beliefs, say). The same could apply to the realm of value and morality.
