By Simon Brandon, Freelance journalist
A prankster who made repeated hoax distress calls to the US Coast Guard over the course of 2014 probably thought they were untouchable. They left no fingerprints or DNA evidence behind, and made sure their calls were too brief to allow investigators to triangulate their location.
Unfortunately for this hoaxer, however, voice analysis powered by AI is now so advanced that it can reveal far more about you than a mere fingerprint. By using powerful technology to analyse recorded speech, scientists today can make confident predictions about everything from the speaker’s physical characteristics – their height, weight, facial structure and age, for example – to their socioeconomic background, level of income and even the state of their physical and mental health.
One of the leading scientists in this field is Rita Singh of Carnegie Mellon University’s Language Technologies Institute. When the US Coast Guard sent her recordings of the 2014 hoax calls, Singh had already been working in voice recognition for 20 years. “They said, ‘Tell us what you can’,” she told the Women in Tech Show podcast earlier this year. “That’s when I started looking beyond the signal. How much could I tell the Coast Guard about this person?”
What your voice says about you
The techniques developed by Singh and her colleagues at Carnegie Mellon analyse and compare tiny differences, imperceptible to the human ear, in how individuals articulate speech. They then break recorded speech down into tiny snippets of audio, milliseconds in duration, and use AI techniques to comb through these snippets looking for unique identifiers.
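For readers curious what breaking speech into millisecond-long snippets might look like in practice, here is a minimal sketch. It is not Singh's method: the 25-millisecond window, the 10-millisecond step and the simple loudness measure are conventional choices assumed purely for illustration, and the function names are hypothetical.

```python
import math


def split_into_frames(samples, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Cut a recording into short, overlapping snippets ("frames").

    frame_ms and hop_ms are assumed, conventional values, not figures
    taken from the Carnegie Mellon research.
    """
    frame_len = int(sample_rate * frame_ms / 1000)  # samples per snippet
    hop_len = int(sample_rate * hop_ms / 1000)      # step between snippets
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def frame_energy(frame):
    """Average power of one snippet: a stand-in for the far richer
    features a real voice-profiling system would extract."""
    return sum(x * x for x in frame) / len(frame)


# One second of a synthetic 220 Hz tone stands in for recorded speech.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 220 * n / sample_rate)
          for n in range(sample_rate)]

frames = split_into_frames(signal, sample_rate)
energies = [frame_energy(f) for f in frames]
```

A real system would replace the loudness measure with many more detailed measurements per snippet and pass them to a trained model, but the slicing step would look broadly similar.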
Your voice can give away plenty of environmental information, too. For example, the technology can guess the size of the room in which someone is speaking, whether it has windows and even what its walls are made of. Even more impressively, perhaps, the AI can detect signatures left in the recording by fluctuations in the local electrical grid, and can then match these to specific databases to give a very good idea of the caller’s physical location and the exact time of day they picked up the phone.
This all applies to a lot more than hoax calls, of course. Federal criminal cases from harassment to child abuse have been helped by this relatively recent technology. “Perpetrators in voice-based cases have been found, have confessed, and their confessions have largely corroborated our analyses,” says Singh.
Portraits in 3D
And they’re just getting started: Singh and her fellow researchers are developing new technologies that can provide the police with a 3D visual portrait of a suspect, based only on a voice recording. “Audio can give us a facial sketch of a speaker, as well as their height, weight, race, age and level of intoxication,” she says.
But there’s some way to go before voice-based profiling technology of this kind becomes viable in a court. Singh explains: “In terms of admissibility, there will be questions. We’re kind of where we were with DNA in 1987, when the first DNA-based conviction took place in the United States.”
This has all proved to be bad news for the Coast Guard’s unsuspecting hoaxer. Making prank calls to emergency services in the US is regarded as a federal crime, punishable by hefty fines and several years of jail time, and usually the calls themselves are the only evidence available. Singh was able to produce a profile that helped the Coast Guard to eliminate false leads and identify a suspect, against whom they hope to bring a prosecution soon.
Given the current exponential rate of technological advancement, it’s safe to say this technology will become much more widely used by law enforcement in the future. And for any potential hoax callers reading this: it’s probably best to stick to the old cut-out newsprint and glue method for now. Just don’t leave any fingerprints.