Eight years ago, a patient lost her power of speech because of ALS, or Lou Gehrig’s disease, which causes progressive paralysis. She can still make sounds, but her words have become unintelligible, leaving her reliant on a writing board or iPad to communicate.
Now, after volunteering to receive a brain implant, the woman has been able to rapidly communicate phrases like “I don’t own my home” and “It’s just tough” at a rate approaching normal speech.
That is the claim in a paper published over the weekend on the website bioRxiv by a team at Stanford University. The study has not been formally reviewed by other researchers. The scientists say their volunteer, identified only as “subject T12,” smashed previous records by using the brain-reading implant to communicate at a rate of 62 words a minute, three times the previous best.
Philip Sabes, a researcher at the University of California, San Francisco, who was not involved in the project, called the results a “big breakthrough” and said that experimental brain-reading technology could be ready to leave the lab and become a useful product soon.
“The performance in this paper is already at a level which many people who cannot speak would want, if the device were ready,” says Sabes. “People are going to want this.”
People without speech deficits typically talk at a rate of about 160 words a minute. Even in an era of keyboards, thumb-typing, emojis, and internet abbreviations, speech remains the fastest form of human-to-human communication.
The new research was carried out at Stanford University. The preprint, published January 21, began drawing extra attention on Twitter and other social media because of the death this week of its co-lead author, Krishna Shenoy, from pancreatic cancer.
Shenoy had devoted his career to improving the speed of communication through brain interfaces, carefully maintaining a list of records on his personal website. In 2019, another volunteer Shenoy worked with managed to use his thoughts to type at a rate of 18 words a minute, a record performance at the time, as we related in MIT Technology Review’s special issue on computing.
The brain-computer interfaces that Shenoy’s team works with involve a small pad of sharp electrodes embedded in a person’s motor cortex, the brain region most involved in movement. This allows researchers to record activity from a few dozen neurons at once and find patterns that reflect what motions someone is thinking of, even if the person is paralyzed.
In previous work, paralyzed volunteers have been asked to imagine making hand movements. By “decoding” their neural signals in real time, implants have let them steer a cursor around a screen, pick out letters on a virtual keyboard, play video games, or even control a robotic arm.
In the new research, the Stanford team wanted to know if neurons in the motor cortex contained useful information about speech movements, too. That is, could they detect how “subject T12” was trying to move her mouth, tongue, and vocal cords as she attempted to talk?
These are small, subtle movements, and according to Sabes, one big discovery is that just a few neurons contained enough information to let a computer program predict, with good accuracy, what words the patient was trying to say. Shenoy’s team relayed that information to a computer screen, where the patient’s words appeared as the computer spoke them aloud.
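At its core, this kind of decoding is a classification problem: given a snapshot of firing rates from the recorded neurons, guess which speech sound the person was attempting. The sketch below is purely illustrative, with invented firing-rate patterns and a simple nearest-centroid rule; the Stanford team’s actual decoder is far more sophisticated and works on noisy, high-dimensional recordings.

```python
import numpy as np

# Illustrative only: pretend each phoneme has a characteristic mean
# firing-rate pattern (spikes/sec) across 4 recorded neurons.
# All numbers here are made up for the example.
centroids = {
    "f": np.array([10.0, 2.0, 7.0, 1.0]),
    "a": np.array([3.0, 9.0, 1.0, 6.0]),
    "m": np.array([6.0, 6.0, 6.0, 6.0]),
}

def decode(firing_rates: np.ndarray) -> str:
    """Nearest-centroid guess of the attempted phoneme."""
    return min(centroids, key=lambda p: np.linalg.norm(firing_rates - centroids[p]))

# A noisy observation near the "f" pattern still decodes to "f".
rng = np.random.default_rng(0)
obs = centroids["f"] + rng.normal(0, 0.5, size=4)
print(decode(obs))  # prints "f"
```

The point of the toy is only that distinct attempted movements leave distinct neural signatures, so a classifier can separate them; with more recorded neurons, those signatures become easier to tell apart.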
The new result builds on previous work by Edward Chang at the University of California, San Francisco, who has written that speech involves the most complicated movements people make. We push out air, add vibrations that make it audible, and form it into words with our mouth, lips, and tongue. To make the sound “f,” you put your top teeth on your lower lip and push air out—just one of dozens of mouth movements needed to speak.
A path forward
Chang previously used electrodes placed on top of the brain to permit a volunteer to speak through a computer, but in their preprint, the Stanford researchers say their system is more accurate and three to four times faster.
“Our results show a feasible path forward to restore communication to people with paralysis at conversational speeds,” wrote the researchers, who included Shenoy and neurosurgeon Jaimie Henderson.
David Moses, who works with Chang’s team at UCSF, says the current work reaches “impressive new performance benchmarks.” Yet even as records continue to be broken, he says, “it will become increasingly important to demonstrate stable and reliable performance over multi-year time scales.” Any commercial brain implant could have a difficult time getting past regulators, especially if it degrades over time or if the accuracy of the recording falls off.
The path forward is likely to include both more sophisticated implants and closer integration with artificial intelligence.
The current system already uses two kinds of machine-learning program. To improve its accuracy, the Stanford team employed software that predicts what word typically comes next in a sentence. “I” is more often followed by “am” than “ham,” even though these words sound similar and could produce similar patterns in someone’s brain.
Adding the word prediction system increased how quickly the subject could speak without mistakes.
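The idea behind such a word-prediction layer can be shown with a toy bigram model: score each candidate word by how often it follows the previous word in some body of text. The counts below are invented for the example, and the team’s actual language model is far more sophisticated.

```python
# Toy bigram model: how often does word B follow word A?
# The counts are made up; a real model is trained on large text corpora.
bigram_counts = {
    ("i", "am"): 5000,   # "I am" is very common
    ("i", "ham"): 1,     # "I ham" is almost never seen
}

def pick_word(previous: str, candidates: list[str]) -> str:
    """Choose the candidate most likely to follow `previous`."""
    return max(candidates, key=lambda w: bigram_counts.get((previous, w), 0))

# The decoder's raw output is ambiguous between "am" and "ham";
# the language model breaks the tie using context.
print(pick_word("i", ["am", "ham"]))  # prints "am"
```

This is why context helps: even when two words produce similar neural patterns, the sentence so far usually makes one of them far more plausible.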
Language models
But newer “large” language models, like GPT-3, are capable of writing entire essays and answering questions. Connecting these to brain interfaces could enable people using the system to speak even faster, just because the system will be better at guessing what they are trying to say on the basis of partial information. “The success of large language models over the last few years makes me think that a speech prosthesis is close at hand, because maybe you don’t need such an impressive input to get speech out,” says Sabes.
Shenoy’s group is part of a consortium called BrainGate that has placed electrodes into the brains of more than a dozen volunteers. They use an implant called the Utah Array, a rigid metal square with about 100 needle-like electrodes.
Some companies, including Elon Musk’s brain interface company, Neuralink, and a startup called Paradromics, say they have developed more modern interfaces that can record from thousands—even tens of thousands—of neurons at once.
While some skeptics have asked whether measuring from more neurons at one time will make any difference, the new report suggests it will, especially if the job is to brain-read complex movements such as speech.
The Stanford scientists found that the more neurons they read from at once, the fewer errors they made in understanding what “T12” was trying to say.
“This is a big deal, because it suggests efforts by companies like Neuralink to put 1,000 electrodes into the brain will make a difference, if the task is sufficiently rich,” says Sabes, who previously worked as a senior scientist at Neuralink.