Abstract
Automated separation of the constituent signals of complex mixtures of sound has made significant progress over the last two decades. Unfortunately, performing this task in real rooms, where echoes and reverberation are prevalent, remains a significant challenge. In contrast, humans demonstrate remarkable robustness to reverberation. An overview is given of a project that set out to model aspects of human auditory perception in order to improve the efficacy of machine sound source separation in real rooms. Using this approach, the models that were developed achieved a significant improvement in separation performance. The project also showed that existing models of human auditory perception are markedly incomplete, and work is currently being undertaken to model additional aspects that had previously been neglected. Work completed so far indicates that an even greater improvement in separation performance is possible. The work could have many applications, including intelligent hearing aids and intelligent security cameras, and could be incorporated into many other products that perform automated listening tasks, such as speech recognition, speech enhancement, noise reduction and medical transcription.