you’re looking for a single person, somewhere in the world. (We’ll
call him Waldo.) You know who he is, nearly everything about him,
but you don’t know where he’s hiding. How do you find him?
scale is just too great for anything but a computerized scan. The
first chance is facial recognition — scan his face against cameras
at airports or photos on social media — although you’ll be counting
on Waldo walking past a friendly camera and giving it a good view.
But his voice could be even better: How long could Waldo go without
making a phone call on public lines? And even if he’s careful about
phone calls, the world is full of microphones — how long before he
gets picked up in the background while his friend talks to her Echo?
it turns out, the NSA had roughly the same idea. InanInterceptpiece
on Friday, reporter Ava Kofman detailed the secret history of
the NSA’s speaker recognition systems, dating back as far as 2004.
One of the programs was a system known as Voice RT, which was able
to match speakers to a given voiceprint (essentially solving the
Waldo problem), along with generating basic transcriptions.
According to classified documents, the system wasdeployed
in 2009to track the Pakistani army’s chief
of staff, although officials expressed concern that there were too
few voice clips to build a viable model. The same systems scanned
voice traffic to more than 100 Iranian delegates’ phones when
President Mahmoud Ahmadinejadvisited
New York City in 2007.
seen voice recognition systems like this before — most recentlywith
the Coast Guard— but there’s never been one as far-reaching as
the Voice RT, and it raises difficult new questions about voice
recordings. The NSA has always had broad access to US phone
infrastructure, something driven home by the early Snowden
documents, but the last few years have seen an explosion of voice
assistants like the Amazon Echo and Google Home, each of which
floods more voice audio into the cloud where it could be vulnerable
to NSA interception. Is home assistant data a target for the NSA’s
voice scanning program? And if so, are Google and Amazon doing
enough to protect users?
previous cases, law enforcement has chiefly been interested in
obtaining specific incriminating data picked up by a home assistant.
Bentonville murder caselast year, police
sought recordings or transcripts from a specific Echo, hoping the
device might have triggered accidentally during a pivotal moment. If
that tactic worked consistently, it might be a privacy concern for
Echo and Google Home owners — but it almost never does. Devices like
the Echo and Google Home only retain data after hearing their wake
word (“Okay Google” or “Alexa”), which means all police would get is
a list of intentional commands. Security researchers have been
trying to break past that wake-word safeguard for years, but so far,
they can’t do it withoutan
in-person firmware hack, at which point you might as well just
install your own microphone.
the NSA’s tool would be after a person’s voice instead of any
particular words, which would make the wake-word safeguard much less
of an issue. If you can get all the voice commands sent back to
Google or Amazon servers, you’re guaranteed a full profile of the
device owner’s voice, and you might even get an errant houseguest in
the background. And because speech-to-text algorithms are still
relatively new, both Google and Amazon keep audio files in the cloud
as a way to catalog transcription errors. It’s a lot of data, andThe
Interceptis right to think that it would
make a tempting target for the NSA.
police try to collect recordings from a voice assistant, they have
to play by roughly the same warrant rules as your email or Dropbox
files — but the NSA might have a way to get around the warrant too.
Collecting the data would still require a court order (in the NSA’s
case, one approved by the FISA court), but the data wouldn’t
necessarily need to be collected. In theory, the NSA could appeal to
platforms to scan their own archives, arguing they would be helping
to locate a dangerous terrorist. It would be similar to the scans
companies already run for child abuse, terrorism or
copyright-protected material on their networks, all of which are
largely voluntary. If companies complied, the issue could be kept
out of conventional courts entirely.
Gidari, director of privacy at the Stanford Center for Internet and
Society, says that kind of standoff is an inherent problem when
platforms are storing biometric-friendly data. After years of sealed
litigation, it’s still unclear how much help the government has a
right to compel. “To the extent platforms store biometrics, they are
vulnerable to government demands for access and disclosure,” says
Gidari. “I think the government could obtain a technical assistance
order to facilitate the scan, and under [the
technical assistance provisionin] FISA,
perhaps to build the tool, too.”
still don’t have any real evidence that those orders are being
speaks to is how the program worked within the NSA, and no one at
Google or Amazon has ever suggested something like this might be
possible. But there’s still good reason to be suspicious: if such
order were delivered to a tech company, it would probably come witha
gag orderpreventing them from talking about
what they’d done.
far, there’s been little transparency about how much data agencies
are getting from personal voice assistants, if any. Amazon has been
noticeably shifty aboutlisting
requests for Echo datain its transparency
report. Google treats the voice recordings asgeneral
user data, and doesn’t break out requests that are specific to
Google Home. Reached for comment, an Amazon representative said the
company “will not release customer information without a valid and
binding legal demand properly served on us.”
most ominous sign is how much data personal assistants are still
retaining. There’s no technical reason to store audio of every
request by default, particularly if it poses a privacy risk. If
Google and Amazon wanted to decrease the threat, they could stop
logging requests under specific users, tying them instead to an
anonymous identifier as Siri does. Failing that, they could retain
text instead of audio, or even process the speech-to-text conversion
on the device itself.
the Echo and the Home weren’t made with the NSA in mind. Google and
Amazon were trying to build useful assistants, and they likely
didn’t consider that it could also be a tool of surveillance. Even
more, they didn’t consider that a person’s voice might be something
they would have to protect. Likead-targetingand
cloud hosting itself, what started as information technology is
turning into a system of surveillance and control. What happens next
is up to Google, Amazon, and their customers.