Speech comprehension in noisy environments is greatly improved by the availability of visual information, i.e., lip and facial movements. This suggests that the brain possesses an audiovisual integration mechanism that supports speech comprehension. To identify the brain areas involved, we used a stepwise functional neuroimaging approach in healthy German-speaking subjects. First, a functional localizer session with a block design comprising auditory speech, visual speech (lip movements, no audio), and audiovisual speech (audio and video signals in sync) was used to identify areas that were activated by auditory and visual speech and that additionally showed a further increase in activity in the audiovisual condition. This procedure revealed two clusters of brain activity, located bilaterally in the posterior part of the superior temporal sulcus. In a second session using slow event-related imaging combined with sparse sampling, these functionally defined volumes were examined further in a design crossing audiovisual congruity with intelligibility (with/without added noise). Within these volumes, we found regions showing an interaction of audiovisual congruity and intelligibility, with the greatest activity for incongruent speech stimuli with added noise. This underscores their role in audiovisual integration. We propose that the stepwise approach introduced here allows a finer analysis of audiovisual speech integration than previous methods.