HOLLYWOOD has primed us to worry that evil robots will try to take over the world.
Indeed, according to the movie of the same name, the Terminator will boot up in 2029, so we may have only a dozen or so years left.
Was Stephen Hawking right to suggest that super-intelligence is our greatest existential threat?
At the moment we are the most intelligent species on the planet, and all other life depends on our goodwill for its continued existence.
Won’t our fate in turn depend on the goodwill of these superior, super-intelligent machines? Let’s turn next to the question of whether super-intelligent machines might simply eliminate us. Will they mark the end of humankind?
In the movies, the machines are usually depicted as evil. But incompetence rather than malevolence seems the more likely risk. Of course we have to consider the possibility that super-intelligent machines might unintentionally end the human race. There are multiple scenarios in which this might occur.
The first risk scenario is that the goals we design for a super-intelligent machine may fall short of what we actually want.
This kind of failure is illustrated by the Greek myth of King Midas. The king was granted his wish that everything he touched would turn to gold. But he had specified his wish poorly: he didn’t actually want his food or his daughter turned into precious metal.
It could be argued that artificial intelligence (AI) has already witnessed this in some small and not very dangerous ways. For instance, researchers reported an experiment in 2017 in which they taught a computer to play Coast Runners, a boat-racing video game.
Rather than complete the race course, the AI learnt to go around in small circles, repeatedly hitting score-boosting targets and crashing into other boats, because this increased its score more quickly than actually finishing the race.
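To see the shape of the problem in miniature, here is a toy sketch in Python. The numbers and strategy names are entirely invented for illustration; this is not the researchers’ code or the real game, just a demonstration that an agent which only ever sees the score can rationally prefer circling to finishing.

```python
# Toy illustration of a misspecified reward, loosely inspired by the
# Coast Runners story; all numbers are invented for this example.

# Two possible strategies, each described by how much in-game score it earns.
FINISH_RACE = {"points_per_segment": 1, "segments": 100, "finish_bonus": 50}
CIRCLE_TARGETS = {"points_per_segment": 3, "segments": 100, "finish_bonus": 0}

def score(strategy):
    """The only signal the agent is trained to maximise: the game score."""
    return strategy["points_per_segment"] * strategy["segments"] + strategy["finish_bonus"]

strategies = {"finish the race": FINISH_RACE, "circle and crash": CIRCLE_TARGETS}
best = max(strategies, key=lambda name: score(strategies[name]))

for name, strategy in strategies.items():
    print(f"{name}: {score(strategy)} points")
print(f"A pure score-maximiser chooses to: {best}")
```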
As they are so smart, super-intelligent machines may surprise us in the way they achieve their goals.
Suppose we ask a super-intelligent machine to cure cancer. One way to do this might be to eliminate all hosts with the potential to house cancer — thus bringing an end to the human race. Not quite what we wanted.
Such examples suppose a rather dim view of super-intelligence. If I gave you the task of curing cancer and you started to kill people, I would probably decide you weren’t that intelligent.
We suppose intelligent people have learnt good values and are wise to the plight of others, especially those with sentience and feelings. Shouldn’t a super-intelligence be suitably wise as well as intelligent?
The second risk scenario is that even if the goals are properly specified, there may be undesirable side effects that hurt humanity. Anyone who has debugged some computer code knows how frustratingly literal computers are when they interpret instructions.
This risk is explored in a well-known thought experiment proposed by Nick Bostrom. Suppose we build a super-intelligent machine and give it the goal to build as many paperclips as possible. Because the machine is super-intelligent, it would be very good at making paperclips.
The machine might start building more and more paperclip factories. Eventually the whole planet would be turned into factories for building paperclips. The machine is doing precisely what it was asked to do, but the outcome is undesirable for humankind.
Now, Bostrom doesn’t actually believe we’d give a super-intelligence the goal of maximising paperclips, especially as we realise the risks of this particular goal. Paperclip production was just chosen to demonstrate that even a mundane, arbitrary and apparently harmless goal could go seriously astray.
Like the Midas argument, this supposes a rather poor view of super-intelligence. Shouldn’t a super-intelligence also be able to understand the implicit goals that are not explicitly specified? Yes, do make lots of paperclips, but not at the expense of the environment. And certainly not at the expense of the human race.
A third risk scenario is that any super-intelligence will have sub-goals that may conflict with humanity’s continued existence.
Suppose the super-intelligence has some overall goal, such as increasing human happiness or protecting the planet. Almost any goal like this will require the super-intelligence to acquire resources to carry out its actions. It will also require that the super-intelligence be allowed to continue operating so it can achieve its goal.
But humans might turn the machine off. In addition, humans will consume resources that might be better used to achieve the super-intelligence’s own goals.
The logical conclusion, then, is that the super-intelligence would want us eliminated. We would then no longer be able to turn it off, or to consume resources that might be better used for its goals.
These sub-goals of self-preservation and the acquisition of resources have been called two of the “basic AI drives” by computer scientist Stephen Omohundro.
Such drives are the basic sub-goals that any sufficiently intelligent AI system is likely to have. The HAL 9000 computer in Arthur C. Clarke’s 2001: A Space Odyssey represents perhaps the best-known vision of the AI drive for self-preservation. HAL starts to kill the astronauts on board the Discovery One spacecraft in a desperate attempt to prevent them from switching the computer off.
Other basic AI drives are for improvement and creativity. AI systems will tend to become more efficient, both physically and computationally, as this will help them achieve whatever other goals they might have.
And, less predictably, AI systems will tend to be creative, looking for new ways to achieve their goals more efficiently and effectively. Efficiency isn’t a bad thing; it will help us conserve our planet’s limited resources.
But creativity is more of a challenge. It means that super-intelligent machines will be unpredictable. They will achieve their goals in ways that we might not predict.
A fourth risk scenario is that any super-intelligence could modify itself, start working differently, even assign itself new goals. This is especially likely if we give it the goal of making itself more intelligent.
How can we be sure that the redesigned super-intelligence remains aligned with our human values? Some harmless aspect of the original system might be amplified in the new version and become very harmful to us.
The moving target might not be just the super-intelligence, but also the bigger system in which it operates.
We see this phenomenon in our human institutions: it goes under the name of “mission creep”. You decide to send military advisers into Vietnam, and a decade later you have hundreds of thousands of soldiers on the ground, fighting a war that can’t be won.
In a modest and not-too-harmful way, this moving target problem has already been observed in the context of AI.
In 2008, Google launched Google Flu Trends. It became the poster child for using big data for social good, predicting the timing of the influenza season around the world more effectively than previous methods.
Google Flu Trends used Google queries to predict when and where flu was becoming prevalent. If lots of people in a particular region start to ask Google “how to treat a sore throat?” and “what is a fever?” then perhaps flu is starting to spread.
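The general idea can be sketched in a few lines of code. This is only an illustration of the approach, with invented numbers and a far simpler model than anything Google actually used: fit the relationship between search volumes and reported cases, then estimate this week’s cases from this week’s searches alone.

```python
# Hypothetical sketch of query-based flu tracking; not Google Flu Trends'
# actual model or data, just the general idea.
import numpy as np

# Made-up weekly counts of searches such as "how to treat a sore throat".
query_volume = np.array([120, 150, 400, 900, 1500, 1100, 600], dtype=float)
# Made-up official flu case counts for the same weeks.
reported_cases = np.array([10, 14, 45, 110, 180, 130, 70], dtype=float)

# Fit a one-variable linear model: cases = a * queries + b.
a, b = np.polyfit(query_volume, reported_cases, deg=1)

# "Nowcast" flu for a new week from search volume alone, available
# before official surveillance figures arrive.
this_week_queries = 1300.0
print(f"Estimated cases this week: {a * this_week_queries + b:.0f}")
```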
But in 2013 Google Flu Trends simply stopped working. It has now been rather quietly dropped from Google’s offerings. What went wrong?
The problem was that Google (and the human ecosystem in which it sits) is a moving target.
Google has got better and better. And part of that improvement has been that Google suggests queries before the user has even finished typing them.
These improvements appear to have introduced biases into how people use Google, and in turn these have damaged the ability of Google Flu Trends to work.
By making Google better, we made flu epidemics more difficult to predict.
A fifth risk scenario is that any super-intelligence might simply be indifferent to our fate, just as I am indifferent to the fate of certain less intelligent life forms.
If I am building a new factory, I might not particularly worry about destroying an ant colony that’s in the way. I don’t go out of my way to destroy the ants, but they just happen to be where my factory needs to go. Similarly, a super-intelligence might not be greatly concerned about our existence.
If we happen to be in the way of its goals, we might simply be eliminated. The super-intelligence has no malevolence towards us; we are collateral damage.
The danger of an indifferent super-intelligence supposes that the super-intelligence is not dependent on humanity. We can destroy the ant colony without concern because its destruction is unlikely to bring any great side effects.
But destroying humanity might have some serious side effects that a super-intelligence would want to avoid. Who is providing the infrastructure that the super-intelligence is using? The servers running in the cloud? The electricity powering the cloud? The internet connecting the cloud together?
If humans are still involved in any of these services, the super-intelligence should not be indifferent to our fate.
Equally, the danger of an indifferent super-intelligence supposes that the super-intelligence might not be paternalistic towards us.
In fact, I’m not sure paternalistic is the right adjective here. Because of its immense intelligence, we would be like children that it might wish to protect. But we would also be its parents, whom it might wish to protect out of gratitude for bringing it into existence.
Both are reasons for a super-intelligence not to be indifferent to our fate.
Artificial Intelligence Professor Toby Walsh will appear at Good Robot/Bad Robot: Living With Intelligent Machines, Sydney Opera House, Sunday, August 12. sydneyoperahouse.com
2062: The World That AI Made by Toby Walsh, La Trobe University Press, $34.99, published Monday, see blackincbooks.com.au