Earlier this month, DeepMind presented a new “generalist” AI model called Gato. The model can play Atari video games, caption images, chat, and stack blocks with a real robot arm, the Alphabet-owned AI lab announced. All in all, Gato can do 604 different tasks.
But while Gato is undeniably fascinating, in the week since its release some researchers have gotten a bit carried away.
One of DeepMind’s top researchers and a coauthor of the Gato paper, Nando de Freitas, couldn’t contain his excitement. “The game is over!” he tweeted, suggesting that there is now a clear path from Gato to artificial general intelligence, or ‘AGI’, a vague concept of human or superhuman-level AI. The way to build AGI, he claimed, is mostly a question of scale: making models such as Gato bigger and better.
Unsurprisingly, de Freitas’s announcement triggered breathless press coverage claiming that DeepMind is “on the verge” of human-level artificial intelligence. This is not the first time hype has outstripped reality. Other exciting new AI models, such as OpenAI’s text generator GPT-3 and image generator DALL-E, have generated similarly grand claims. For many in the field, this kind of feverish discourse overshadows other important research areas in AI.
That’s a shame, because Gato is an interesting step in AI. Some models have started to mix different skills, like DALL-E, which generates images from text descriptions. Others use a single training technique to learn to recognize pictures and sentences. And DeepMind’s AlphaZero learned to play Go, chess and shogi.
But here’s the crucial difference: AlphaZero could only learn one task at a time. After learning to play Go, it had to forget everything before learning to play chess, and so on. It could not learn to play both games at once. This is what Gato does: learns multiple different tasks at the same time, which means it can switch between them without having to forget one skill before learning another. It’s a small step but a significant one.
But Gato performs tasks worse than models that can only do one thing. Robots still need to learn “common sense knowledge” about how the world works from text, says Jacob Andreas, an assistant professor at MIT who specializes in artificial intelligence and natural language and speech processing.
This could come in handy in robots that could help people around the house, for example. “When you drop [a robot] into a kitchen and ask them to make a cup of tea for the first time, they know what steps are involved in making a cup of tea and in which cabinet tea bags are likely to be located in,” says Andreas.
Some external researchers were explicitly dismissive of de Freitas’s claim. “This is far from being ‘intelligent’,” says Gary Marcus, an AI researcher who has been critical of deep learning. The hype around Gato demonstrated that the field of AI is blighted by an unhelpful “triumphalist culture,” he says.
He argues that the deep learning models that often generate the most excitement about the potential to reach human-level intelligence make mistakes so basic that “if a human made these errors, you’d be like, something’s wrong with this person.”
“Nature is trying to tell us something here, which is, this doesn’t really work, but the field is so believing its own press clippings, that it just can’t see that,” he adds.
Even de Freitas’s DeepMind colleagues Jackie Kay and Scott Reed, who worked with him on Gato, were more circumspect when I asked them directly about his claims. Asked whether Gato was heading towards AGI, they wouldn’t be drawn. “I don’t actually think it’s really feasible to make predictions with these kinds of things. I try to avoid that. It’s like predicting the stock market,” said Kay.
Reed said the question was a difficult one. “I think most machine learning people will studiously avoid answering. Very hard to predict, but, you know, hopefully we get there someday.”
In a way, the fact that DeepMind called Gato a “generalist” might have made it a victim of the AI sector’s excessive hype around AGI. The AI systems of today are called “narrow” AI, meaning they can only do a specific, restricted set of tasks, such as generating text.
Some technologists, including at DeepMind, think that one day humans will develop “broader” AI systems that will be able to function as well as or even better than humans. Some call this artificial “general” intelligence. Others say it is like “belief in magic.” Many top researchers, such as Meta’s chief AI scientist Yann LeCun, question whether it is even possible at all.
Gato is a “generalist” in the sense that it can do many different things at the same time. But that is a world apart from a “general” AI that can meaningfully adapt to new tasks that are different from what the model was trained on, says MIT’s Andreas. “We’re still quite far from being able to do that.”
Making models bigger will also not address the issue that models lack “lifelong learning,” the ability to be taught something once, understand all of its implications, and use it to inform all of the other decisions they are going to make, he says.
The hype around tools like Gato is harmful for the general development of AI, argues Emmanuel Kahembwe, an AI/robotics researcher and part of the Black in AI organization co-founded by Timnit Gebru. “There are many interesting topics that are left to the side, that are underfunded, that deserve more attention, but that’s not what the big tech companies and the bulk of researchers in such tech companies are interested in,” he says.
Tech companies ought to take a step back and take stock of why they are building what they are building, says Vilas Dhar, president of the Patrick J. McGovern Foundation, a charity that funds AI projects “for good.”
“AGI speaks to something deeply human—the idea that we can become more than we are, by building tools that propel us to greatness,” he says. “And that’s really nice, except it also is a way to distract us from the fact that we have real problems that face us today that we should be trying to address using AI.”