AI safety – Magnus Vinding

When Machines Improve Machines

The following is an excerpt from my book Reflections on Intelligence (2016/2024).

The term “Artificial General Intelligence” (AGI) refers to a machine that can perform any cognitive task at least as well as any human. This is often considered the holy grail of artificial intelligence research. It is also what many believe will give rise to an “intelligence explosion”, as machines will then be able to take over the design of smarter machines, and hence their further development will no longer be held back by the slowness of humans.

A Radical Shift?

Luke Muehlhauser and Anna Salamon describe the transition toward machines designing machines in the following way:

Once human programmers build an AI with a better-than-human capacity for AI design, the instrumental goal for self-improvement may motivate a positive feedback loop of self-enhancement. Now when the machine intelligence improves itself, it improves the intelligence that does the improving. (Muehlhauser & Salamon, 2012, p. 13)

While this might seem like a radical shift, software engineer Ramez Naam has argued that it is less radical than we might think, since we already use our latest technology to improve on itself and build the next generation of technology (Naam, 2010). As noted in the previous chapter, the way new tools are built and improved is by means of an enormous conglomerate of tools, and newly developed tools tend to become an addition to this existing set of tools. In Naam’s words:

[A] common assertion is that the advent of greater-than-human intelligence will herald The Singularity. These super intelligences will be able to advance science and technology faster than unaugmented humans can. They’ll be able to understand things that baseline humans can’t. And perhaps most importantly, they’ll be able to use their superior intellectual powers to improve on themselves, leading to an upward spiral of self improvement with faster and faster cycles each time.

In reality, we already have greater-than-human intelligences. They’re all around us. And indeed, they drive forward the frontiers of science and technology in ways that unaugmented individual humans can’t.

These superhuman intelligences are the distributed intelligences formed of humans, collaborating with one another, often via electronic means, and almost invariably with support from software systems and vast online repositories of knowledge. (Naam, 2010)

The design and construction of new machines is not the product of human ingenuity alone, but instead the product of a large system of advanced tools in which human ingenuity is just one component, albeit a component that plays many roles. Moreover, as Naam hints, superhuman intellectual abilities already play a crucial role in this design process. For example, computer programs make illustrations and calculations that no human could possibly make, and these have become indispensable components in the design of new tools in virtually all technological domains. In this way, superhuman intellectual abilities are already a significant part of the process of building superhuman intellectual abilities. This has led to continued growth, yet hardly an abrupt intelligence explosion.

Naam gives a specific example of an existing self-improving “superintelligence” (i.e. a super goal achiever), namely Intel:

Intel employs giant teams of humans and computers to design the next generation of its microprocessors. Faster chips mean that the computers it uses in the design become more powerful. More powerful computers mean that Intel can do more sophisticated simulations, that its CAD (computer aided design) software can take more of the burden off of the many hundreds of humans working on each chip design, and so on. There’s a direct feedback loop between Intel’s output and its own capabilities. …

Self-improving superintelligences have changed our lives tremendously, of course. But they don’t seem to have spiraled into a hard takeoff towards “singularity”. On a percentage basis, Google’s growth in revenue, in employees, and in servers have all slowed over time. It’s still a rapidly growing company, but that growth rate is slowly decelerating, not accelerating. The same is true of Intel and of the bulk of tech companies that have achieved a reasonable size. Larger typically means slower growing.

My point here is that neither superintelligence nor the ability to improve or augment oneself always lead to runaway growth. Positive feedback loops are a tremendously powerful force, but in nature (and here I’m liberally including corporate structures and the worldwide market economy in general as part of ‘nature’) negative feedback loops come into play as well, and tend to put brakes on growth. (Naam, 2010)

I quote Naam at length here because he makes this important point well, and because he is an expert with experience in the pursuit of using technology to make better technology. In addition to Naam’s point about Intel and other large tech companies that effectively improve themselves, I would add that although such mega-companies are highly competent collectives, they still only constitute a tiny part of the larger collective system that is the world economy, which they each contribute modestly to, and which they are entirely dependent upon.
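
Naam's contrast between positive and negative feedback can be illustrated with a toy simulation. The sketch below is only an illustration under stated assumptions: the function names, the gain of 0.2, and the capacity of 100 are hypothetical and are not taken from Naam's essay or from any data on Intel or Google. It compares a system whose growth is purely proportional to its size with one whose growth is also damped as it approaches a limit (discrete logistic growth).

```python
def simulate(steps, gain, capacity=None, start=1.0):
    """Return the size of a self-improving system over time.

    Each step the system grows in proportion to its current size
    (positive feedback). If `capacity` is given, growth is damped as the
    system approaches it (negative feedback): discrete logistic growth.
    """
    sizes = [start]
    for _ in range(steps):
        x = sizes[-1]
        brake = 1.0 if capacity is None else 1.0 - x / capacity
        sizes.append(x + gain * x * brake)
    return sizes


def growth_rates(sizes):
    """Growth per step on a percentage basis: (next - current) / current."""
    return [(b - a) / a for a, b in zip(sizes, sizes[1:])]


# Hypothetical parameters, chosen only for illustration.
runaway = simulate(steps=30, gain=0.2)
braked = simulate(steps=30, gain=0.2, capacity=100.0)

runaway_rates = growth_rates(runaway)
braked_rates = growth_rates(braked)

print("step  runaway_rate  braked_rate")
for step in (0, 10, 20, 29):
    print(f"{step:>4}  {runaway_rates[step]:>12.3f}  {braked_rates[step]:>11.3f}")
```

In this toy model both systems keep improving themselves at every step, yet only the one without a brake keeps a constant percentage growth rate. The braked system keeps growing while its growth rate steadily declines, which is the pattern Naam describes for large tech companies.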

A Familiar Dynamic

It has always been the latest, most advanced tools that, combined with the already existing set of tools, have been used to build the next generation of still more advanced tools. The expected “machines building machines” revolution is therefore not as revolutionary as it might seem at first sight. Strong versions of the “once machines can program AI better than humans” argument seem to assume that human software engineers are by far the main bottleneck to progress in the construction of more competent machines, which is a questionable premise. But even if it were true, and if we suddenly had a million times as many agents working to create better software, other bottlenecks would soon emerge, such as hardware production and energy. Essentially, we would be returned to the task of advancing our entire economy, something that pretty much all humans and machines are participating in already, knowingly or not.

The question concerning whether “intelligence” can explode is therefore basically: can the economy explode? To which we can answer that rapid increases in the growth rate of the world economy certainly have occurred in the past, and some argue that this is likely to happen again in the future (Hanson, 1998; 2016). However, recent trends in economic growth, as well as in hardware growth in particular, give us some reason to be skeptical of such a future growth explosion (see e.g. Vinding, 2021; 2022).
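
The bottleneck argument made earlier in this section can be sketched with a simple "weakest link" model, in which progress requires several inputs in fixed proportions and is capped by whichever input is scarcest. This is a deliberately crude illustration: the input names, the fixed-proportions assumption, and all the numbers are hypothetical, and real economies allow far more substitution between inputs than this model does.

```python
def progress_rate(inputs, requirements):
    """Rate of progress when every input is needed in fixed proportion.

    Progress is capped by the scarcest input relative to its requirement
    (a Leontief-style "weakest link" production function).
    """
    return min(inputs[name] / requirements[name] for name in requirements)


def bottleneck(inputs, requirements):
    """Name of the input that currently limits progress."""
    return min(requirements, key=lambda name: inputs[name] / requirements[name])


# Hypothetical units: how much of each input one unit of progress consumes.
requirements = {"software_labor": 1.0, "hardware": 1.0, "energy": 1.0}

# Assumed baseline in which software labor is the scarcest input.
baseline = {"software_labor": 1.0, "hardware": 3.0, "energy": 5.0}

# The same world with a million times as many agents writing software.
boosted = dict(baseline, software_labor=baseline["software_labor"] * 1_000_000)

for label, inputs in (("baseline", baseline), ("boosted", boosted)):
    print(label, bottleneck(inputs, requirements), progress_rate(inputs, requirements))
```

Under these assumptions, multiplying software labor by a million only triples the rate of progress, because hardware immediately becomes the limiting input. Further gains would then require expanding hardware and energy as well, which is to say advancing the wider economy.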

Consciousness: Orthogonal or Crucial?

The following is an excerpt from my book Reflections on Intelligence (2016/2024).

A question that is often considered open, sometimes even irrelevant, when it comes to “AGIs” and “superintelligences” is whether such entities would be conscious. Here is Nick Bostrom expressing such a sentiment:

By a “superintelligence” we mean an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills. This definition leaves open how the superintelligence is implemented: it could be a digital computer, an ensemble of networked computers, cultured cortical tissue or what have you. It also leaves open whether the superintelligence is conscious and has subjective experiences. (Bostrom, 2012, “Definition of ‘superintelligence’”)

Yet this sentiment is hardly warranted. If a system is “much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills”, the question of consciousness is highly relevant. Consciousness is integral to much of what we do and excel at, and thus if an entity is not conscious, it cannot outperform the best humans “in practically every field”, especially not in “general wisdom” and “scientific creativity”. Let us look at these in turn.

General Wisdom

A core aspect of “general wisdom” is to be wise about ethical issues. Yet being wise about ethical issues requires that one can consider and evaluate questions like the following in an informed manner:

Is there anything about the experience of suffering that makes its reduction a moral priority?
Does anything about the experience of suffering justify the claim that reducing suffering has greater moral priority than increasing happiness (for the already happy)?
Is there anything about states of extreme suffering that makes their reduction an overriding moral priority?

It seems that one would have to be conscious in order to explore and answer such questions in an informed way. That is, one would have to know what such experiences are like in order to understand their experiential properties and significance. Knowing what a term like “suffering” refers to — i.e. knowing what actual experiences of suffering are like — is thus crucial for informed ethical reflection.

The same point holds true of other areas of philosophy that bear on wisdom, such as the philosophy of mind: without knowing what it is like to have a conscious mind, one cannot contribute much to the discussion about what it is like to have one and to the exploration of different modes of consciousness. Indeed, an unconscious entity has no genuine understanding of what the issue of consciousness is even about in the first place (Pearce, 2012a; 2012b).

So both in ethics and in the philosophy of mind, an unconscious entity would be less than clueless about many of the deepest questions at hand. If an entity not only fails to surpass humans in these areas, but fails to even have the slightest clue about what we are talking about, it hardly surpasses the best humans in practically every field. After all, questions about the phenomenology of consciousness are also relevant to many other fields, including psychology, epistemology, and ontology.

In short, experiencing and reasoning about consciousness is a key part of “human abilities”, and hence an entity that is unable to do this cannot be claimed to outperform humans in the most important, much less all, human abilities (see also Pearce, 2012a; 2012b).

Scientific Creativity

Another ability mentioned above that an unconscious entity could supposedly outdo humans at is scientific creativity. Yet scientific creativity must relate to all fields of knowledge, including the science of the conscious mind itself. The conscious mind is also a part of the natural world, and a most relevant one at that.

Experiencing and accurately reporting what a given state of consciousness is like is essential for the science of mind, yet an unconscious entity obviously cannot do such a thing, as there is no experience it can report from. It cannot display any genuine scientific creativity, or even produce mere observations, in the direct exploration of consciousness.
