Steven Greenberg

Biological Foundations of Speech and Visual Information Processing by Humans and Machines

Should machine learning emulate human brain function to “solve” challenging problems in speech and visual information processing? This presentation surveys recent developments in neuroscience relevant to machine-learning approaches to speech technology and image processing. Of particular interest are the role of memory and representational systems in recognition and the hierarchical structure of cortical oscillations. Signal variability and semantic complexity are crucial properties of the “real world” that require inherently robust, yet dynamically flexible, architectures to successfully decode and “understand.” The sensory and motor systems collaborate closely with higher cognitive centers in processing speech and visual information. Although the specific nature of this neural interchange is poorly understood, it is likely key to how the brain performs recognition so quickly and so well. Advances in this area of neuroscience could be exploited to enhance the machine learning used in a broad range of technologies, including automatic speech recognition and visual image processing. [Research supported by AFOSR]