DOI: 10.3724/SP.J.1042.2017.00757

Advances in Psychological Science (心理科学进展), 2017, 25(5), pp. 757–768

The characteristics and mechanisms of audiovisual integration: Evidence from mismatch negativity

Audiovisual integration refers to the cognitive process by which auditory and visual events presented at the same time and in the same location tend to be integrated with each other. Mismatch negativity (MMN), an index of early cortical processing, is commonly used to examine whether deviant information mismatches the traces held in sensory memory. Previous studies of audiovisual integration have mainly focused on the integration of letters and phonology, the integration of non-linguistic visual information and prosodic auditory information, the McGurk effect, and the mechanisms underlying integration. This paper reviews these recent studies and analyzes the factors that may influence the integration of auditory and visual processing. Future research should focus on the integration of information from multiple modalities.

Key words: auditory, visual, audiovisual integration, MMN, ERP

Release date: 2017-05-31 14:49:38
