CUDA Spotlight: GPU-Accelerated Speech Recognition

This week’s Spotlight is on Dr. Ian Lane of Carnegie Mellon University. Ian is an Assistant Research Professor and leads a speech and language processing research group based in Silicon Valley. He co-directs the CUDA Center of Excellence at CMU with Dr. Kayvon Fatahalian.

The following is an excerpt from our interview (read the complete Spotlight here).

NVIDIA: Ian, what is Speech Recognition?
Ian: Speech Recognition refers to the technology that converts an audio signal into the sequence of words that the user spoke. By analyzing the frequencies within a snippet of audio, we can determine which sounds of spoken language the snippet most closely matches, and by observing sequences of these snippets we can determine what words or phrases the user most likely uttered.

Speech Recognition spans many research fields, including signal processing, computational linguistics, machine learning and core problems in computer science, such as efficient algorithms for large-scale graph traversal. Speech Recognition is also one of the core technologies required to realize natural Human-Computer Interaction (HCI), and it is becoming prevalent in the interactive systems being developed today.
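
To make the frequency analysis Ian describes a little more concrete, here is a minimal Python sketch of its very first step: cutting audio into short snippets (frames) and finding the strongest frequency in each one. It is an illustration under stated assumptions, not the method used by Ian's group. The function names, the synthetic two-tone signal, the 8000 Hz sample rate and the 200-sample frame length are all hypothetical choices made for this example, and a real recognizer would extract richer features and match them against trained acoustic models rather than pick a single peak.

```python
import cmath
import math


def split_into_frames(signal, frame_len, hop):
    """Split a list of samples into fixed-length frames, `hop` samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of a naive DFT for bins 0..N/2 (O(N^2), fine for a demo)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def dominant_frequency(frame, sample_rate):
    """Return the frequency in Hz of the strongest DFT bin in one frame."""
    mags = magnitude_spectrum(frame)
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    return peak_bin * sample_rate / len(frame)


# Assumed demo parameters: 200-sample frames at 8000 Hz give 40 Hz per bin.
SAMPLE_RATE = 8000
FRAME_LEN = 200

# Synthetic stand-in for audio: a 440 Hz tone followed by an 880 Hz tone.
signal = [math.sin(2 * math.pi * (440 if t < 400 else 880) * t / SAMPLE_RATE)
          for t in range(800)]

# One dominant frequency per snippet; the sequence shows the tone change.
tones = [dominant_frequency(f, SAMPLE_RATE)
         for f in split_into_frames(signal, FRAME_LEN, FRAME_LEN)]
print(tones)
```

The per-frame sequence in `tones` plays the role of the "sequences of these snippets" in Ian's description: a recognizer reasons over how the spectral content changes from frame to frame, not over any single frame in isolation.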

NVIDIA: What are examples of real-world applications?
Ian: In recent years, speech-based interfaces have become much more prevalent. Examples include virtual personal assistants, such as Siri from Apple or Google Voice Search, as well as speech interfaces for smart TVs and in-vehicle systems.