Voice-activated digital assistants—from the Amazon Echo on your kitchen counter to Cortana on Windows systems and Siri on Apple’s iPhones—are intended to connect users to services through an easy-to-use voice interface. However, voice assistants are making cyber-attackers’ jobs easier as well.
At the Black Hat conference later this month, for example, four researchers will show how Cortana can be used to bypass the security on locked Windows PCs and other devices. While the group is exploiting a specific vulnerability—dubbed “Open Sesame”—the issues with voice assistants are deeper, said Tal Be’ery, an independent researcher and part of the team.
“Voice interfaces can be a good idea, but it is not relevant to all devices and all actions,” he said. “Enabling everything the PC does, and going through a voice interface on a corporate environment—this is not a very smart architecture decision.”
The research is just the latest attack to take advantage of voice assistants, which often prioritize convenience over security. Digital assistants have been added to phones and PCs as a convenient new way of interacting with the devices. Smart speakers—such as the Amazon Echo and the Google Home—have taken off, with 1 in 6 Americans owning one of the devices.
Yet there have already been incidents. In January 2017, an on-air newscaster said, “I love the little girl saying, ‘Alexa ordered me a dollhouse,’” leading to Alexa devices in viewers’ homes attempting to order dollhouses. And in May 2018, Amazon’s smart speaker picked up a couple’s conversation, recorded it, and sent it to a friend.
The incidents underscore that, in addition to bypassing many security controls, voice assistants are nothing less than sleepless sensors that are almost always listening for potential commands, which makes them a privacy issue.
“The cases that will be handled first are those that are triggered accidentally—like the dollhouse incident,” said Nicholas Carlini, a recent PhD graduate from the University of California, Berkeley, who researched adversarial attacks against artificial intelligence systems. “It is an active area of research of how to stop these issues.”
Here are five ways that voice assistants can be used to attack.
1. Hiding commands in the audio
Among adversarial attacks against machine-learning and artificial-intelligence systems is a class that attempts to change an input—an image for vision systems or an audio clip for voice systems—so that the machine recognizes it as something completely different.
UC Berkeley’s Carlini used just such a technique in his research, modifying an audio clip that transcribes as one phrase into a clip that is 99.9 percent similar but transcribes as a completely different phrase. The technique can even hide commands inside music.
Currently, the technique works only in the most controlled environments, but creating a generalized attack should be feasible, said Carlini.
“It’s still unknown whether this can be done over the air,” he said. “We tried some obvious things, but we didn’t try too hard…I believe it would be possible.”
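To give a sense of the mechanics, here is a minimal sketch—not Carlini’s actual code—of the general shape of a targeted audio attack: gradient descent on a small, bounded perturbation that pushes a transcription toward an attacker-chosen phrase. The speech recognizer below is a deliberately tiny stand-in model, and every name and value in it is illustrative; the real attack targets a full ASR network and uses a more careful distortion measure.

```python
# Minimal sketch of a targeted adversarial-audio attack against a toy model.
import torch
import torch.nn as nn

ALPHABET = "_abcdefghijklmnopqrstuvwxyz "   # index 0 acts as the CTC blank
FRAME = 160                                 # samples per output frame (toy value)

class ToyASR(nn.Module):
    """Stand-in transcriber: maps each raw-audio frame to character logits."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(FRAME, len(ALPHABET))

    def forward(self, audio):                       # audio: (num_samples,)
        frames = audio.view(-1, FRAME)              # (T, FRAME)
        return self.proj(frames).log_softmax(-1)    # (T, num_characters)

def encode(text):
    return torch.tensor([ALPHABET.index(c) for c in text])

model = ToyASR()
original = torch.randn(16000)            # one second of placeholder "speech"
target = encode("open the door")         # phrase the attacker wants transcribed

delta = torch.zeros_like(original, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=1e-2)
ctc = nn.CTCLoss(blank=0)
epsilon = 0.05                           # keeps the perturbation nearly inaudible

for step in range(500):
    adversarial = original + delta.clamp(-epsilon, epsilon)
    log_probs = model(adversarial).unsqueeze(1)     # (T, batch=1, C) for CTC
    loss = ctc(log_probs,
               target.unsqueeze(0),
               torch.tensor([log_probs.size(0)]),
               torch.tensor([len(target)]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# "original + delta" still sounds like the original clip to a person, but the
# toy model now decodes it toward "open the door".
```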
2. Machines can hear it, you can’t
Hiding commands inside other audio is not the only covert way to manipulate voice assistants. In an attack presented in 2017, six researchers from Zhejiang University showed that they could use sound inaudible to humans to command Siri to make a phone call or to take other actions.
Called DolphinAttack, the hack shows that inaudible commands can be used to direct a voice assistant to visit a malicious site, spy on the user, inject fake information or conduct a denial-of-service attack, the researchers stated in their paper.
This “serves as a wake-up call to reconsider what functionality and levels of human interaction shall be supported in voice controllable systems,” the researchers said.
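The core trick is ordinary amplitude modulation: the audible command is shifted onto an ultrasonic carrier, and the non-linearity of the target microphone’s hardware demodulates it back into the audible band. The sketch below shows what that modulation step looks like; the carrier frequency, sample rate and file names are illustrative assumptions rather than values from the paper, and actually injecting the result requires an ultrasonic-capable speaker and amplifier, as in the Zhejiang team’s setup.

```python
# Minimal sketch of the modulation step behind an ultrasonic command injection.
import numpy as np
from scipy.io import wavfile

CARRIER_HZ = 25_000        # above typical human hearing (~20 kHz)
OUTPUT_RATE = 96_000       # high sample rate needed to represent the carrier

rate, command = wavfile.read("assistant_command.wav")   # hypothetical recording
command = command.astype(np.float64)
if command.ndim > 1:                                    # fold stereo to mono
    command = command.mean(axis=1)
command /= np.max(np.abs(command))                      # normalize to [-1, 1]

# Resample the audible command to the output rate (simple linear interpolation).
t_in = np.arange(len(command)) / rate
t_out = np.arange(0.0, t_in[-1], 1.0 / OUTPUT_RATE)
baseband = np.interp(t_out, t_in, command)

# Standard amplitude modulation: carrier * (1 + m * signal). The microphone's
# non-linear response effectively squares the incoming signal, recovering the
# audible envelope that the assistant then transcribes.
modulation_index = 0.8
carrier = np.cos(2.0 * np.pi * CARRIER_HZ * t_out)
ultrasonic = carrier * (1.0 + modulation_index * baseband)
ultrasonic /= np.max(np.abs(ultrasonic))

wavfile.write("ultrasonic_command.wav", OUTPUT_RATE,
              (ultrasonic * 32767).astype(np.int16))
```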
3. Is this on? Yes, it is
Even when a voice assistant is not taking an action on your behalf, it continues to listen for commands. Like mobile phones, home voice assistants are sensors that know a lot about you. This gives the companies behind the devices a privileged place in your home, and your life, making them an ideal target for attackers.
“To operate, these devices need to listen all the time by design—once you say the keyword, they start collecting data and sending it to the cloud,” researcher Be’ery said. “So this is a bug that is placed in your house by design.”
In addition to malicious attacks, the devices have already been shown to expose private information inadvertently. The incident in which a couple was recorded by an Amazon Echo required the device to mishear three commands or prompts before it sent the recording to a friend.
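As an illustration of that design, here is an entirely hypothetical sketch of the control flow inside such a device: the microphone is sampled continuously, only a local wake-word check runs at first, and everything captured after the keyword is streamed off-device. The three helper functions are placeholder stubs, not any vendor’s real API.

```python
# Hypothetical sketch of the always-listening control flow in a smart speaker.
import time

def read_microphone_chunk():
    """Placeholder for real audio capture (100 ms of silence here)."""
    return b"\x00" * 3200          # 16 kHz, 16-bit mono

def wake_word_detected(chunk):
    """Placeholder for an on-device keyword-spotting model."""
    return False

def stream_to_cloud(chunk):
    """Placeholder for the vendor's cloud speech service."""
    pass

streaming = False
while True:
    chunk = read_microphone_chunk()             # the mic never stops being sampled
    if not streaming:
        streaming = wake_word_detected(chunk)   # only this check runs locally
    else:
        stream_to_cloud(chunk)                  # post-keyword audio leaves the house
    time.sleep(0.1)
```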
4. Trumping system security
In a general-purpose device such as a PC or a phone, multiple portions of the code base could be exploited by hackers. This “attack surface area” only becomes larger and more porous when voice-assistant technology is added and convenience is prioritized over security, said researcher Be’ery.
Along with two researchers from the Israel Institute of Technology and the former chief technology officer of security firm Imperva, Be’ery will demonstrate at the Black Hat conference the weaknesses that the Cortana digital assistant adds to Windows devices.
“Introducing such a complex logic and extending it to so many places, all happening when the computer is supposed to be locked—it is not going to end up well,” he said. “There is too much attack surface area.”
5. Jumping from device to device
Attackers often find ways into a home through the router or an unsecured wireless network. Voice assistants add another vector, allowing attackers to bridge into the home by using an audio device—such as a TV or even a loud car radio on the street—to issue commands to the devices.
The dollhouse incident is an inadvertent version of this attack.
For most of these issues, there is no easy solution. While filters can be put in place to reject audio outside the range of human hearing, most security fixes for the other problems would make the devices more difficult to use, so extra confirmation is requested only in certain cases, such as purchasing items or transferring money.
“From a usability aspect, the answer is no, we don’t want to add a second factor,” said Carlini. “I don’t see an obvious solution that is not to ask for a second factor.”
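For the ultrasonic case specifically, the filtering fix mentioned above is simple enough to sketch: discard frequency content above the speech band before it ever reaches the recognizer. The cutoff frequency and file names below are illustrative assumptions, and a real device would apply this in hardware or in its audio pipeline rather than in a script.

```python
# Minimal sketch of a low-pass filter that blunts ultrasonic command injection.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

CUTOFF_HZ = 8_000        # intelligible speech sits well below this

rate, audio = wavfile.read("microphone_capture.wav")    # hypothetical capture
audio = audio.astype(np.float64)
if audio.ndim > 1:                                      # fold stereo to mono
    audio = audio.mean(axis=1)

# 8th-order Butterworth low-pass, applied forward and backward (zero phase).
sos = butter(8, CUTOFF_HZ, btype="low", fs=rate, output="sos")
filtered = sosfiltfilt(sos, audio)

wavfile.write("filtered_capture.wav", rate, filtered.astype(np.int16))
```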