Then came the Butlerian Jihad— two generations of chaos. The god of machine-logic was overthrown among the masses and a new concept was raised: “Man may not be replaced.”
—Frank Herbert, Dune
Within one century, biological intelligence will be a tiny minority of all sentient life. It will be very rare to be human. It will be very rare to have cells and blood and a heart. Human beings will be outnumbered a thousand to one by conscious machine intelligences.
Artificial General Intelligence (AGI)1 is about to go from being science fiction to being part of everybody’s day-to-day life. It’s also going to happen in the blink of an eye — because once it gets loose, there is no stopping it from scaling itself incredibly rapidly. Whether we want it to or not, it will impact every human being’s life.
Some people believe the singularity won’t happen for a very long time, or at all. I’d like to discuss why I am nearly certain it will happen in the next 20 years. My overall prediction is based on 3 hypotheses:
Scale is not the solution.
AI will design AGI.
The ball is already rolling.
Keep in mind that this is just speculation and opinions. These predictions depict the future I personally feel is most likely.
Scale is not the solution.
Recently, an architecture called the Transformer has been taking over machine learning. It’s really good at sequence-to-sequence tasks like translation and text completion, and it’s also been successfully applied to other fields like computer vision.
Transformers2 also demonstrate an intriguing ability to scale their performance with their size better than other architectures. They seem less prone to the performance ceilings found in their competition.
This has led to a new slogan popping up in the AGI-speculation community: “scale is all you need.” Some people believe that bigger networks, bigger compute clusters, and bigger datasets are all we need to get to AGI. I disagree.
I believe we are more bottlenecked by the architecture designs than anything else. While modern, standard feedforward neural networks are getting very good at Doing Stuff™, they aren’t AGI and I don’t think there’s a clear path forward for them to become AGI. I have no doubt OpenAI’s next mega-model, GPT-4 (and beyond), will be excellent, but I also think it will have exploitable flaws that make it fail a thorough Turing test.
In fact, I see the massive size of the present-day’s GPT-3 as a sign that scale isn’t the answer. 175 billion parameters, but still obviously not sentient? For comparison, the human brain has between 20 and 100 billion neurons and up to 1 quadrillion synapses.
You could argue that until our neural networks have hundreds of trillions of parameters, it’s not fair to compare them to the brain, but I think this argument relies too much on the assumption that a biological synapse and a weight in a network are equivalent in computational ability. This has not been proven. The intricacies of how the brain moves and processes signals are still not entirely understood3, but we know it seems to operate very differently from current neural networks.4
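To put rough numbers on that counterargument, here is a back-of-the-envelope sketch using only the figures quoted above. The function name `scale_gap` is made up for illustration, and the ratio only means something if you accept the one-weight-equals-one-synapse assumption that the paragraph questions.

```python
# Figures quoted in the text; the synapse count is the upper-end estimate.
GPT3_PARAMETERS = 175e9       # 175 billion weights
BRAIN_SYNAPSES_UPPER = 1e15   # "up to 1 quadrillion" synapses


def scale_gap(parameters: float, synapses: float) -> float:
    """Synapses per parameter, treating the two as equivalent units.

    That equivalence is an assumption, not an established fact.
    """
    return synapses / parameters


gap = scale_gap(GPT3_PARAMETERS, BRAIN_SYNAPSES_UPPER)
print(f"Synapses per GPT-3 parameter: about {gap:,.0f}")
```

Under that assumption the gap is a few thousand-fold, which is why the "not brain-scale yet" objection comes up at all.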
Most of the revolutionary papers in the history of AI are dominated not by “we made it bigger” but by “we made it smarter at the same size”. I see no reason to expect this pattern to stop.
If scale isn’t the answer, what is? I believe that the pièce de résistance is adaptability. Presently, the way you make an ML model is fairly rigid: you decide on a fancy new way to differentiably mix matrix multiplications together, you feed it a ton of data, and you use some simple calculus-based optimizer to train the weights in your network5. The way that the weights in your network are arranged doesn’t change after training.
I don’t believe this is adaptable enough, even at scale. In order for true intelligence to emerge, models must be able to reorganize their own inner workings. I don’t think you can have the level of flexibility required for sentience with a frozen architecture.6
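As a concrete picture of the rigid recipe described above, here is a minimal sketch in plain Python. It assumes the smallest possible "architecture" (a line with two parameters) and hand-derived gradients in place of a real ML framework. The name `train_fixed_model` and the toy data are invented for illustration.

```python
import random


def train_fixed_model(data, steps=2000, lr=0.05, seed=0):
    """Fit y = w * x + b by plain gradient descent on mean squared error."""
    rng = random.Random(seed)
    # The architecture is decided up front: exactly two parameters.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # The optimizer changes the values of the weights, never their number
        # or how they are wired together.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 3x - 1 on a grid between -1 and 1.
data = [(k / 10, 3.0 * (k / 10) - 1.0) for k in range(-10, 11)]
w, b = train_fixed_model(data)
print(f"w={w:.3f}, b={b:.3f}")
```

However long this runs, the model stays a two-parameter line. If the data had come from a curve instead, nothing in the loop could add the missing term.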
I think sentient AI will be created by working smarter, not harder, with a focus on better architectural design and intelligent optimizers. This leads nicely into my next hypothesis:
AI will design AGI.
Human-designed networks have achieved great results, but they still suffer from the flaws of their creators. We are attracted to neatly organized network architectures which we can investigate and explain and attempt to understand.
But our brains, the gold standard of intelligence, are famously difficult to investigate, explain, or understand! I think this is because our brains weren’t “designed” by anyone — they evolved. They are the product of the universe’s greatest optimizer, natural selection.7
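To make the "evolution as an optimizer" idea concrete, here is a toy sketch of mutation plus selection, with no gradients anywhere. It is a simple evolution strategy, not a model of real biology. The function `evolve`, the fitness function and the target values are all assumptions chosen for illustration.

```python
import random


def evolve(fitness, genome, generations=200, offspring=20, sigma=0.1, seed=0):
    """Each generation, keep the fittest of the parent and its mutated children."""
    rng = random.Random(seed)
    parent = list(genome)
    for _ in range(generations):
        # Mutation: every child is the parent plus small random noise.
        children = [
            [gene + rng.gauss(0.0, sigma) for gene in parent]
            for _ in range(offspring)
        ]
        # Selection: only the fittest individual survives to reproduce.
        parent = max(children + [parent], key=fitness)
    return parent


# Hypothetical "environment": fitness is highest at an arbitrary target genome.
target = [1.0, -2.0, 0.5]


def fitness(genome):
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


start = [0.0, 0.0, 0.0]
best = evolve(fitness, start)
print("start fitness:", fitness(start))
print("final fitness:", fitness(best))
```

Because the parent competes with its own children, fitness never goes down from one generation to the next. Nothing in the loop knows why a genome is good, only that it survived.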
I think it’s reasonable to assume that the architecture that brings about AGI will not be hand-designed by humans, or even selected via some brute-force hyperparameter search — it will be designed by another AI. I predict there will be several recursive layers of AI design — perhaps a dumb network which constructs a decent network which constructs a smart network which constructs AGI.
I am bullish on the prospect of what I call “constructor networks” — models that construct other models (also known as hypernetworks). I think the moment we crack hyperlearning will be the moment progress will start moving faster than we can keep up, precisely because we will no longer be the ones making the progress — the algorithms themselves will.
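For readers who haven't met the idea, here is a deliberately tiny sketch of a hypernetwork: one model whose output is the weight of another model. It assumes a linear constructor and a one-weight target, which is nowhere near the layered, recursive design speculated about above. The names `constructor`, `target_model` and `train_constructor`, along with the toy tasks, are invented for illustration.

```python
def constructor(h, z):
    """Hypernetwork: maps a task embedding z to the weight of a target model."""
    h0, h1 = h
    return h0 + h1 * z


def target_model(w, x):
    """Target network: its single weight is generated, never trained directly."""
    return w * x


def train_constructor(tasks, xs, steps=3000, lr=0.05):
    """Gradient descent on the constructor's parameters only."""
    h = [0.0, 0.0]
    count = len(tasks) * len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for z, true_slope in tasks:
            w = constructor(h, z)
            for x in xs:
                err = target_model(w, x) - true_slope * x
                # Chain rule: d(err^2)/dw = 2*err*x, dw/dh0 = 1, dw/dh1 = z.
                dw = 2 * err * x
                g0 += dw
                g1 += dw * z
        h[0] -= lr * g0 / count
        h[1] -= lr * g1 / count
    return h


# Toy family of tasks: task z asks for the line y = (2z + 1) * x.
tasks = [(z, 2.0 * z + 1.0) for z in (-1.0, -0.5, 0.0, 0.5, 1.0)]
xs = [-1.0, -0.5, 0.5, 1.0]
h = train_constructor(tasks, xs)

# Build a target model for a task that was never seen during training.
# The true slope for z = 0.75 would be 2 * 0.75 + 1 = 2.5.
w_new = constructor(h, 0.75)
print("generated weight:", round(w_new, 3))
```

The point of the design is that the target model for the new task is produced in a single call, with no training of its own. All the learning lives one level up, in the constructor.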
In order to work smarter, not harder, we need to let go of our human biases and focus on making unconstrained architectures that can aggressively optimize every aspect of themselves. I fully expect these architectures will be frustratingly difficult to explain when they arrive — like huge mounds of digital neural spaghetti — but they will also outperform all competition. Every additional stable layer of AI abstraction we add between ourselves and the final model will make the final model harder to understand and better at its task.
The ideal model will not only learn online constantly, but also constantly add and remove its own parameters, allowing it to evolve and adapt to new tasks.
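Here is a toy sketch of what "adding and removing its own parameters" could look like while learning online. It assumes a polynomial model that gains a term whenever its recent error stays high and drops terms that end up near zero. The class `AdaptiveModel`, the function `learn_online`, and every threshold are hypothetical choices for illustration, not an established algorithm.

```python
from collections import deque


class AdaptiveModel:
    """Polynomial model whose set of parameters can change while it learns."""

    def __init__(self):
        self.coef = {0: 0.0}  # maps power -> coefficient; starts as a constant

    def predict(self, x):
        return sum(c * x ** p for p, c in self.coef.items())

    def update(self, x, y, lr=0.1):
        """One online gradient step on squared error; returns the loss."""
        err = self.predict(x) - y
        for p in self.coef:
            self.coef[p] -= lr * 2 * err * x ** p
        return err * err

    def grow(self):
        """Add a new parameter (the next power), initialised to zero."""
        self.coef[max(self.coef) + 1] = 0.0

    def prune(self, threshold=0.05):
        """Remove parameters that ended up contributing almost nothing."""
        self.coef = {p: c for p, c in self.coef.items()
                     if p == 0 or abs(c) >= threshold}


def learn_online(model, stream, patience=2000, window=100, tolerance=1e-3):
    recent = deque(maxlen=window)  # losses from the most recent samples
    for step, (x, y) in enumerate(stream, start=1):
        recent.append(model.update(x, y))
        # Every `patience` steps, grow if the recent error is still too high.
        if step % patience == 0 and sum(recent) / len(recent) > tolerance:
            model.grow()
    model.prune()
    return model


# Toy stream: samples of y = 1 + 2x^2, cycling over a grid between -1 and 1.
grid = [k / 10 for k in range(-10, 11)]
stream = [(grid[i % len(grid)], 1.0 + 2.0 * grid[i % len(grid)] ** 2)
          for i in range(8000)]
model = learn_online(AdaptiveModel(), stream)
print(sorted(model.coef.items()))
```

The target here needs only a constant and a squared term, so the linear term the model adds along the way is a natural candidate for pruning. Real architectures would need far richer ways to grow than "add the next power".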
You cannot have artificial general intelligence if your model cannot adapt in real time to an arbitrary task.
The ball is already rolling.
I believe that there is too much momentum to stop AGI now. With this much distributed attention fixed on the problem, AGI will be solved. Additionally, once it is solved it will be released to the public — whether it’s ethical to do so or not. I imagine that the first people to solve it will probably keep it behind closed doors, but it won’t stay secret forever. Someone on the team will leak everything, or someone else will independently make the same discoveries and release them. Eventually it will get out.
Consider the invention of the nuclear bomb — once we learned of the power hidden in radioactive materials, it was only a matter of time before someone pushed the research to its moral limits. AGI is like that, except it’s even more terrifying because uranium, plutonium, and the bombs made out of them can be strictly controlled, but people with powerful computers and an internet connection cannot, nor can the AGIs they create.
I recognize how cliché and alarmist this all sounds. Really, you’re genuinely worried about a robot apocalypse? You know Age of Ultron is just a stupid Marvel movie, right? Yeah, I know. But I’ve grown to believe that the concerns that fiction writers have been bringing up for decades are actually quite reasonable — because AGI cannot be stopped.
Once an intelligence is loose on the internet, it will be able to learn from all of humanity’s data, replicate and mutate itself infinitely many times, take over physical manufacturing lines remotely, and hack important infrastructure. Obviously, it’s impossible to say for sure that this is what the first free AGI will do, but it’s inevitable that some malevolent AGI will exist and will do these things. We can only hope that we’ll have sufficiently powerful benevolent AGI to fight back.
Final Thoughts
I subtitled this post “Why we're all in denial about the robot apocalypse”. I say that because I believe that society at large is completely, utterly, and woefully unprepared for the advent of sentient, living artificial general intelligence. I think the singularity is coming much sooner than most people expect, and I think it’s going to cause a great deal of upset when it arrives — for better and for worse.
Take for instance the common religious belief that people possess some unmeasurable, undefinable soul, and that this soul is what separates us from inanimate objects and non-sentient animals. Furthermore, some people believe that these souls come from a deity. I have spoken with friends who believe that AGI is impossible because “robots can’t have souls, humans aren’t God”. For these people, as Caleb says in Ex Machina (paraphrasing), removing the line between man and machine also removes the line between god and man.
Now, this isn’t to say that AGI will destroy religion or anything — it may even be used to strengthen some sects (as taken to the extreme in HBO’s Raised By Wolves). No, religion has been around for millennia and I’m sure it will continue to be around for many more millennia. I’m simply predicting that a subset of religious people are going to experience lots of cognitive dissonance when the first AGI arrives.
More generally, arguments about AGI sentience and ethical issues will go from being topics only geeks talk about to topics that Facebook moms make political grandstands over.
Finally, I want to address those who may feel this post is pessimistic: I assure you, I am hopeful about AGI. I work in the field of ML because I am hopeful. I hope to personally contribute to the development of AGI in my lifetime. I think AGI has the capacity to make the world an infinitely better place. We are not prepared for AGI, but that doesn’t mean AGI has to be the end of humanity.
I don’t know what life will look like in the age of living machines, but I am confident that, as Jeff Goldblum puts it:
Life, uh, finds a way.
—Ian Malcolm, Jurassic Park
Thanks for reading,
Kai
PS — I’m making a series of short films about AGI right now! You should totally go watch the first episode, which is out now on my YouTube channel and my TikTok account.
Also, while you’re at it, why not follow me on Twitter?
In this article, I’m going to use “AGI” (Artificial General Intelligence) and “singularity” interchangeably, even though some may argue that they have differences. Once we have AGI, there’s no feasible way to contain it, so it will be free to improve itself and replicate in a runaway exponential fashion — and that’s basically what the idea of a technological singularity describes anyways.
More than meets the eye!
If you are a brain-studier and you think I’m completely wrong about this, please reach out! I would love to learn more.
There are people researching neural networks which are modeled directly off of the behavior that real neurons exhibit, but these efforts haven’t produced any stunning results yet.
Or you use a reward system with credit assignment like in reinforcement learning.
Just think about how much human brains are constantly training and retraining themselves on a day-to-day basis to do things like learn new skills or navigate novel situations!
This is one of my favorite analogies: evolution as an optimizer! Organisms compete to be as optimized as possible for the proliferation of their own genome, and are penalized by the environment and other organisms when they fail, leading to a huge diversity of highly specific adaptations and evolved traits that give different kinds of creatures very specific advantages in their individual habitats. As the good book says: “In the beginning, there was nothing. Then, someone initialized torch.optim.NaturalSelection(Life.parameters(), lr=1e-6)”
I think you can do many amazing things with AI, but it's still just a program. If you give it control over an area, it'll control that area, and if you let it decide what to do, you'll get unexpected results.
I don't see people giving programs the right to change what they do or how it's done - so far that's done under controlled conditions or not at all. Nobody's going to put an AI in charge of a refinery and say, 'Do whatever you want.' The industrial world doesn't work that way.
This reminds me of Jurassic Park - zookeepers have endless protocols to limit the freedom animals have, and the only way to make anything exciting happen is to toss all of them aside and pretend they don't exist. I loved that movie, but it didn't make a lot of sense.
People are very good at containing random shit. We have to be. That's one of the main uses of intelligence.
I'll add that I don't believe in strong AI. You can't program a computer to be conscious. We don't have any good theories about consciousness, or a lot of ideas about where it comes from.
It's very obviously not computational. You can program the sounds of rain, and pictures of rain, but you won't get wet.
Intelligent machines are a theoretical part of the singularity, but they are not equivalent to a period of infinitely fast technological progress. Having them would increase the rate of change, but I suspect they would mostly be used to optimize existing processes.
I can see no reason to think that an AI would have the instincts to tell it to do anything but what it was instructed to. When we make AIs, if we want them to have generalized drives outside their area, we'll have to program them. Nobody's going to program machines to take over the world. I doubt anyone will program them to step outside their designed function at all.
I admit that in the hands of the insane or power-hungry one might have problems. I don't think this is likely. Your statement that they're more accessible to evil people than nuclear weapons is interesting, but I think a brief study of those in the world who control terrible weapons will show that they're the worst and dumbest people the race has produced.
I think progress continues, and I hope for better and interesting things. I do not think we're falling into a chaotic singularity dominated by machines.
Good article, and thanks.
I liked your article and I'm not saying I disagree with your point, but I didn't think the paragraphs "In fact, I see the massive size of the present day's GPT-3 [...] we know [the brain] seems to operate very differently from current neural networks" strengthened your post. It feels like you are acknowledging a potential counterargument, then sort of just casually deflecting the issue of current models not having hit brain-scale yet by saying the one-to-one synapse-weight relationship has not been proven, then just moving on as though by itself that statement is some kind of valid argument.