When listening to someone in a noisy environment, such as a cocktail party, we can understand the speaker more easily if we can also see his or her face. Movements of the lips and tongue convey additional information that helps the listener’s brain separate out syllables, words and sentences.
But exactly where in the brain this effect occurs and how it works remain unclear.
To investigate, Bruno L Giordano of the Institute of Neuroscience and Psychology, University of Glasgow, and colleagues scanned the brains of healthy volunteers as they watched clips of people speaking. The clarity of the speech varied between clips.
Furthermore, in some of the clips the lip movements of the speaker corresponded to the speech in question, whereas in others the lip movements were nonsense babble. As expected, the volunteers performed better on a word recognition task when the speech was clear and when the lip movements agreed with the spoken dialogue.
Watching the video clips stimulated rhythmic activity in multiple regions of the volunteers’ brains, including areas that process sound and areas that plan movements.
Speech is itself rhythmic, and the volunteers’ brain activity synchronized with the rhythms of the speech they were listening to. Seeing the speaker’s face increased this degree of synchrony.
However, it also made it easier for sound-processing regions within the listeners’ brains to transfer information to one another.
Notably, only the latter effect predicted improved performance on the word recognition task. This suggests that seeing a person’s face makes it easier to understand his or her speech by boosting communication between brain regions, rather than through effects on individual areas.
Further work is required to determine where and how the brain encodes lip movements and speech sounds. The next challenge will be to identify where these two sets of information interact, and how the brain merges them together to generate the impression of specific words.
Bruno L Giordano, Robin A A Ince, Joachim Gross, Philippe G Schyns, Stefano Panzeri, Christoph Kayser
Contributions of local speech encoding and functional connectivity to audio-visual speech perception
eLife 2017;6:e24763 doi: 10.7554/eLife.24763
© 2017 eLife Sciences Publications Ltd. Republished via Creative Commons Attribution license. Top Image: israeltourism/Flickr