Abstract
The machine cocktail party problem has been studied for several decades. Although many blind source separation schemes have been proposed to address it, few have been tested on real-room audio-video recordings. In this paper, we propose an audio-video based independent vector analysis (AVIVA) method and compare it with other independent vector analysis (IVA) methods on a real-room recording dataset, the AV16.3 corpus. We also use a new method based on pitch difference detection to objectively evaluate the separation performance of the algorithms on this real dataset; the evaluation confirms the advantages of using the visual modality with IVA. © 2012 Springer-Verlag.