The Voice in the Machine: Building Computers That Understand Speech
An examination of more than sixty years of successes and failures in developing technologies that allow computers to understand human spoken language. Stanley Kubrick's 1968 film 2001: A Space Odyssey famously featured HAL, a computer with the ability to hold lengthy conversations with his fellow space travelers. More than forty years later, we have advanced computer technology that Kubrick never imagined, but we do not have computers that talk and understand speech as HAL did. Is it a failure of our technology that we have not gotten much further than an automated voice that tells us to "say or press 1"? Or is there something fundamental in human language and speech that we do not yet understand deeply enough to be able to replicate in a computer? In The Voice in the Machine, Roberto Pieraccini examines six decades of work in science and technology to develop computers that can interact with humans using speech and the industry that has arisen around the quest for these technologies. He shows that although the computers today that understand speech may not have HAL's capacity for conversation, they have capabilities that make them usable in many applications today and are on a fast track of improvement and innovation. Pieraccini describes the evolution of speech recognition and speech understanding processes from waveform methods to artificial intelligence approaches to statistical learning and modeling of human speech based on a rigorous mathematical model, specifically Hidden Markov Models (HMMs).
He details the development of dialog systems, the ability to produce speech, and the process of bringing talking machines to the market. Finally, he asks a question that only the future can answer: will we end up with HAL-like computers or something completely unexpected?
Maeve –
a
Alexis –
This was a very interesting book! I especially enjoyed the last few chapters. The very first chapter was a very quick introduction to linguistics, so I basically skipped it, though I'm sure non-linguists would find it useful. (This does worry me about the rest of the book, as I *know* the linguistics chapter was extremely dumbed down, so no doubt the parts I thought to be interesting were also dumbed down, hah.) I found chapters 2-5 fairly dry. The one on statistics was pretty good. The author really likes to reference every important person and company who ever worked on a speech recognition project, and this gave me a bit of fatigue ("who is this guy? is he the same as the person mentioned earlier? what are all the acronyms??"). Chapters 6 to 9 were really the high point for me, especially when the author cited specific interesting studies that have helped progress speech recognition technology. Chapter 10 and the epilogue were pretty ho-hum, probably because they were not describing history of ideas so much as current business practices. There was also a chapter just about speech production using computers which I thought was really interesting. Overall I gave this book 3 stars because unless you are already interested in it, it is fairly dry and hard to get into. The writing is very easy to understand, but not terribly good.
There are a lot of interesting tidbits, historical notes, academic studies and competitions referenced throughout, and they left me with a lot to consider. One last note, to anyone out there publishing ebooks, please make references easier to reference! This book was technically a PDF, so there was no easy way for me to look at references at the end while in the middle.
Dominic –
Pretty interesting history of speech recognition. Goes into details about how speech recognition works, without getting too technical.