{"id":50856,"date":"2014-12-08T13:49:16","date_gmt":"2014-12-08T13:49:16","guid":{"rendered":"https:\/\/www.transcend.org\/tms\/?p=50856"},"modified":"2015-05-05T21:27:12","modified_gmt":"2015-05-05T20:27:12","slug":"what-happens-when-spies-can-eavesdrop-on-any-conversation","status":"publish","type":"post","link":"https:\/\/www.transcend.org\/tms\/2014\/12\/what-happens-when-spies-can-eavesdrop-on-any-conversation\/","title":{"rendered":"What Happens When Spies Can Eavesdrop on Any Conversation?"},"content":{"rendered":"<p><strong><a href=\"https:\/\/www.transcend.org\/tms\/wp-content\/uploads\/2014\/12\/defense-large-spying-usa-nsa-cghq-Eavesdrop.jpg\" ><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-50857\" src=\"https:\/\/www.transcend.org\/tms\/wp-content\/uploads\/2014\/12\/defense-large-spying-usa-nsa-cghq-Eavesdrop.jpg\" alt=\"defense-large spying usa nsa cghq Eavesdrop\" width=\"710\" height=\"325\" srcset=\"https:\/\/www.transcend.org\/tms\/wp-content\/uploads\/2014\/12\/defense-large-spying-usa-nsa-cghq-Eavesdrop.jpg 710w, https:\/\/www.transcend.org\/tms\/wp-content\/uploads\/2014\/12\/defense-large-spying-usa-nsa-cghq-Eavesdrop-300x137.jpg 300w\" sizes=\"auto, (max-width: 710px) 100vw, 710px\" \/><\/a><\/strong><\/p>\n<p><em>Dec 1, 2014 &#8211; <\/em>Imagine having access to the all of the world\u2019s recorded conversations, videos that people have\u00a0posted to YouTube, in addition to chatter collected by random microphones in public places. Then picture the possibility of searching that dataset for clues related to terms that you are interested in the same way you search Google. You could look up, for example, who was having a conversation right now about plastic explosives, about a particular flight departing from Islamabad, about Islamic State leader Abu Bakr al-Baghdadi in reference to a particular area of Northern\u00a0Iraq.<\/p>\n<p>On Nov. 17, the U.S. 
announced a new challenge called Automatic Speech recognition in Reverberant Environments, giving it the acronym <a target=\"_blank\" href=\"https:\/\/www.innocentive.com\/ar\/challenge\/9933624?cc=IARPALP3624&amp;utm_source=IARPA&amp;utm_campaign=9933624&amp;utm_medium=landing+page\" >ASpIRE<\/a>. The challenge comes from the Office of the Director of National Intelligence, or ODNI, and the Intelligence Advanced Research Projects Agency, or IARPA. It speaks to a major opportunity for intelligence collection in the years ahead: teaching machines to scan the ever-expanding world of recorded speech. To do that, researchers will need to take a decades-old technology, computerized speech recognition, and re-invent it from\u00a0scratch.<\/p>\n<p>Importantly, the ASpIRE challenge is only the most recent government research program aimed at modernizing speech recognition for intelligence gathering. The so-called <a target=\"_blank\" href=\"http:\/\/www.iarpa.gov\/index.php\/research-programs\/babel\" >Babel<\/a> program from IARPA, as well as such DARPA\u00a0programs as RATS (Robust Automatic Transcription of Speech), <a target=\"_blank\" href=\"https:\/\/www.ldc.upenn.edu\/collaborations\/current-projects\/bolt\" >BOLT<\/a> (Broad Operational Language Translation), and others have all had similar or related\u00a0objectives.<\/p>\n<p>To understand what the future of speech recognition looks like, and why it doesn\u2019t yet work the way the intelligence community wants it to, it first becomes necessary to know what it is. 
In a 2013 paper titled \u201cWhat\u2019s Wrong With Speech Recognition,\u201d researcher Nelson Morgan defines it as \u201cthe science of recovering words from an acoustic signal meant to convey those words to a human listener.\u201d It\u2019s different from speaker recognition, or matching a voiceprint to a single individual, but the two are\u00a0related.<\/p>\n<p>Speech recognition is focused more precisely on getting a machine to understand speech well enough to instantly transcribe spoken words into text or usable data. Anyone who\u2019s ever used a program like Dragon NaturallySpeaking might think that this is a largely solved problem. But most automatic transcribing programs are actually useful in only a few situations, which limits their effectiveness in terms of intelligence\u00a0collection.<\/p>\n<p>It seems like an easy challenge for a military in the process of <a target=\"_blank\" href=\"http:\/\/www.defenseone.com\/technology\/2014\/10\/inside-navys-secret-swarm-robot-experiment\/95813\/\" >outfitting robotic boats with lasers<\/a>, but speech recognition, especially in diverse environments, is incredibly difficult despite decades of steady research and\u00a0funding.<\/p>\n<p><strong>A Brief History of Teaching Machines to\u00a0Listen<\/strong><\/p>\n<p>The United States military, working with Bell Labs, launched research into computerized speech recognition in World War II, when the military attempted to use spectrograms, or crude voice prints, to identify enemy voices on the radio. In the 1970s, IBM researcher Fred Jelinek and Carnegie Mellon University researcher Jim Baker, founder of Dragon Systems, spearheaded research to apply a statistical methodology called \u201chidden Markov modeling,\u201d or HMM, to the problem. Their work resulted in a 1982 seminar at the Institute for Defense Analyses in Princeton, New Jersey, which established HMM as the standard method for computerized speech recognition. 
Various DARPA programs\u00a0followed.<\/p>\n<p>HMM works like this: Imagine you have a friend who works in an office. When his boss comes in late, your friend is more likely to come in late. This is a so-called Markov chain of events. You can\u2019t observe whether or not your friend\u2019s boss is in the office because it\u2019s information that\u2019s hidden from you. But when you call your friend and he tells you he\u2019s not on time, you can make an inference about the tardiness of your friend\u2019s boss. Applied to speech recognition, the hidden states might be the words actually being said, while the observable clues are the sounds that commonly occur\u00a0together.<\/p>\n<p>Hidden Markov modeling has been the standard methodology for speech recognition for decades. Some noted scholars in the field, like Berkeley\u2019s Nelson Morgan, argue that reliance on it is now holding the field back. After all, while facial recognition has advanced tremendously, enabling programs to detect faces and match them to databases in an ever-wider number of circumstances, speech recognition has not progressed nearly so\u00a0well.<\/p>\n<p>\u201cIn short,\u201d Morgan wrote, \u201cthe speech recognition field has developed a collection of small-scale solutions to very constrained speech problems, and these solutions fail in the world at large. Their failure modes are acute but unpredictable and non-intuitive, thus leaving the technology defective in broad applications and difficult to manage even in well-behaved environments. 
In short, this technology is badly broken.\u201d<\/p>\n<p>One of the most important characteristics of this dysfunctionality is what\u2019s called a lack of\u00a0robustness.<\/p>\n<p>Mary Harper, program manager in charge of the ASpIRE challenge, explained the problem to <em>Defense One<\/em> this way: \u201cMost speech recognition systems are trained to work for specific recording conditions.\u00a0For example, a system trained on speech recorded in a conference room with an acoustic tile ceiling and heavy drapes using a high fidelity microphone won\u2019t work very well on speech recorded in an unfurnished room with no sound-absorbing wall or floor coverings using a different type of\u00a0microphone.\u201d<\/p>\n<p>What form might new approaches take? Morgan, in his paper, suggests that today\u2019s leaps in computational neuroscience, which have given rise to a number of interesting artificial intelligence\u00a0applications like Siri, could be applicable to the speech recognition\u00a0problem.<\/p>\n<p>\u201cThere is an existing significant example of speech recognition that actually works well in many adverse conditions, namely, the recognition performed by the human ear and brain. Methods for analyzing functional brain activity have become more sophisticated in recent years, so there are new opportunities for the development of models that better track the desirable properties of human speech perception,\u201d he\u00a0writes.<\/p>\n<p>Once speech data has been rendered as text, it\u2019s effectively been structured. That means it becomes far more workable as a dataset, allowing algorithms to crawl it in the same way the Google Search algorithm crawls the text of the world\u2019s web pages. That small breakthrough doesn\u2019t sound like much, but it could actually revolutionize information gathering for the intelligence community. 
In theory, once speech from more diverse environments can be collected and transcribed, any conversation happening within earshot of a networked microphone could become <em>searchable<\/em> in\u00a0real-time.<\/p>\n<p>For the intelligence community, achieving that sort of capability would require, in addition to better speech recognition software, the ability to collect speech data almost everywhere, particularly in contested areas where the U.S. has no boots on the\u00a0ground.<\/p>\n<p>But getting data collection devices into more places becomes easier with every iPhone purchase, thanks, in part, to the Internet of Things. The next wave of interconnected consumer gadgets, like Google\u2019s Moto X superphone and the Apple Watch <a target=\"_blank\" href=\"http:\/\/www.apple.com\/watch\/new-ways-to-connect\/\" >coming in 2015<\/a>, represents a broad trend in devices that rely on voice commands and speak to users, as Rachel Feltman points out in a piece for <em>Defense One<\/em> sister site <a target=\"_blank\" href=\"http:\/\/qz.com\/209132\/in-the-future-youll-talk-to-all-your-devices-and-youll-need-different-words-for-each-one\/\" >Quartz<\/a>. Are the voice commands that you give your future smart watch legally open to intelligence\u00a0gathering?<\/p>\n<p><a target=\"_blank\" href=\"http:\/\/www.defenseone.com\/politics\/2014\/11\/did-rand-pauls-nsa-vote-fight-government-spying-or-protect-it\/99554\/?oref=search_Freedom%20ACt\" >The defeat of the U.S.A. Freedom<\/a> Act means that the National Security Agency\u00a0can continue to collect metadata on cell phone users, which can be used to pinpoint location. Depending on <em>where<\/em> you\u2019re talking to your device, whether in public or in private, a judge may rule you don\u2019t have a reasonable expectation of privacy. 
But if you\u2019re worried about your device becoming a listening ear for the government, consider that the very air around you could become one,\u00a0too.<\/p>\n<p><strong>Shhh\u2026 The Smart Dust Will Hear\u00a0You<\/strong><\/p>\n<p>The intelligence community in the decades ahead will rely on ever smaller and more capable microphones to pick up intel, some of which border on the unbelievable. Scientists have actually created a <a target=\"_blank\" href=\"http:\/\/journals.aps.org\/prl\/abstract\/10.1103\/PhysRevLett.113.135505#authors\" >microphone that is just <em>one molecule<\/em><\/a> of dibenzoterrylene\u00a0(which changes color depending on pitch). Devices that pick up noise or vibrations can be as small as a grain of\u00a0rice.<\/p>\n<p>Continued advancement in the field of device miniaturization could one day allow for the dispersal of extremely small but capable listening machines, one of the envisioned uses of a future technology sometimes called \u201cSmart\u00a0Dust.\u201d<\/p>\n<p>What is the strategic military advantage presented by ubiquitous, tiny listening machines? In a <a target=\"_blank\" href=\"http:\/\/www.au.af.mil\/au\/awc\/awcgate\/cst\/bh_dickson.pdf\" >2007 paper<\/a> (PDF) titled <em>Enabling Battlespace Persistent Surveillance: the Form, Function, and Future of Smart Dust,<\/em> U.S. Air Force Major Scott A. Dickson speculates that future micro-electromechanical systems, or MEMS, will \u201csense a wide array of information with the processing and communication capabilities to act as independent or networked sensors. 
Fused together into a network of nanosized particles distributed over the battlefield capable of measuring, collecting, and sending information, Smart Dust will transform persistent surveillance for the warfighter\u00a0[sic].\u201d<\/p>\n<p>The nascent opportunity to turn the physical world into a landscape for surveillance is a theme that\u2019s showing up with growing frequency in scholarly defense literature, such as this <a target=\"_blank\" href=\"http:\/\/ctnsp.dodlive.mil\/files\/2014\/09\/DTP1061.pdf\" >September 2014 paper<\/a> out of National Defense University\u2019s Center for Technology and National Security Policy, which heralds the future opportunities that the Internet of Things provides for the \u201cmonitoring of individuals and populations using\u00a0sensors.\u201d<\/p>\n<p>Before researchers arrive at a searchable soundscape, better speech recognition will help efforts in speaker recognition, attaching a specific voice in a recording to a specific person. IARPA says that speaker recognition isn\u2019t the goal of the current challenge. But that sort of capability has clear and near-term applications for national security.<\/p>\n<p>In more and more conflict areas, big investments in facial recognition are revealing themselves to be of very limited use. Consider <a target=\"_blank\" href=\"http:\/\/www.defenseone.com\/technology\/2014\/04\/science-unmasking-russian-forces-ukraine\/82693\/?oref=search_Unmasking\" >Ukraine<\/a>, where fighters carefully kept their faces hidden from international observers while effectively annexing another country\u2019s territory. 
Or think of northern Iraq, where jihadists committing barbaric acts do so, often, behind\u00a0masks.<\/p>\n<p>Every time a new video from the Islamic State surfaces, intelligence workers are faced with the challenge of matching the voice of the person in the video to that of someone else, <a target=\"_blank\" href=\"http:\/\/www.theguardian.com\/world\/2014\/sep\/25\/fbi-identified-isis-jihadist-beheading-videos\" >someone who once walked the streets<\/a>. Doing so means having a wide sample of voices to compare to the one in the\u00a0video.<\/p>\n<p>Today, companies and law enforcement agencies routinely collect so-called voiceprints on customers and suspects. In 2012, the FBI announced a technology called VoiceGrid to store voice data. The Federal Police in Mexico have a <a target=\"_blank\" href=\"http:\/\/www.homelandsecuritynewswire.com\/dr20120928-law-enforcement-can-store-identify-millions-of-voice-samples-using-new-software\" >database of more than a million<\/a> voice records taken during criminal proceedings and arrests. But the number of voiceprints potentially available to law enforcement or the intelligence community surpasses 65 million <a target=\"_blank\" href=\"http:\/\/www.foxbusiness.com\/technology\/2014\/10\/13\/voice-harvesters-bureaucrats-businesses-gather-more-than-65-million-people\/\" >by some recent estimates<\/a>. As large as that number sounds, it will likely grow exponentially as speech recognition, speaker recognition and device miniaturization\u00a0advance.<\/p>\n<p>It\u2019s a trend with clear privacy implications. But the reliance of groups like the Islamic State on anonymity speaks to an intelligence challenge that will persist in the coming decades. War is changing. Whether it is waged by emergent groups like the Islamic State\u00a0or by nations like Russia, the potential revelation of identity is, more and more, becoming a liability in conflict zones. 
Knowing the name of the person on the other side of the battlefield is emerging as a strategic necessity. That\u2019s what makes continued bugging of the world\u00a0inevitable.<\/p>\n<p>__________________________<\/p>\n<p><em>Patrick Tucker is technology editor for <\/em>Defense One<em>. He\u2019s also the author of <a target=\"_blank\" href=\"http:\/\/www.amazon.com\/The-Naked-Future-Happens-Anticipates\/dp\/1591845866\" >The Naked Future: What Happens in a World That Anticipates Your Every Move? (Current, 2014)<\/a>. Previously, Tucker was deputy editor for <\/em>The Futurist<em>, where he served for nine years. Tucker&#8217;s writing on emerging technology also has appeared in <\/em>Slate, The Sun, MIT Technology Review, Wilson Quarterly, The American Legion Magazine, BBC News Magazine<em> and <\/em>Utne Reader<em>, among other publications.<\/em><\/p>\n<p><a target=\"_blank\" href=\"http:\/\/www.defenseone.com\/technology\/2014\/12\/what-happens-when-spies-can-eavesdrop-any-conversation\/100142\/\" >Go to Original \u2013 defenseone.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Imagine having access to all of the world\u2019s recorded conversations, videos that people have posted to YouTube, as well as chatter collected by random microphones in public places. 
Then picture the possibility of searching that dataset for clues related to terms that you are interested in, the same way you search Google.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[216],"tags":[],"class_list":["post-50856","post","type-post","status-publish","format-standard","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/posts\/50856","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/comments?post=50856"}],"version-history":[{"count":0,"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/posts\/50856\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/media?parent=50856"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/categories?post=50856"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.transcend.org\/tms\/wp-json\/wp\/v2\/tags?post=50856"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}