Comparing Visual Features for Lipreading
Conference presentation · Open access

Y Lan, R Harvey, B Theobald, EJ Ong and R Bowden
International Conference on Auditory-Visual Speech Processing (AVSP 2009), Norwich, UK, 10–13 September 2009, pp. 102–106

Abstract

For automatic lipreading, there are many competing methods for feature extraction. Often, because of the complexity of the task, these methods are tested only on quite restricted datasets, such as the letters of the alphabet or digits, and from only a few speakers. In this paper we compare some of the leading methods for lip feature extraction, evaluating them on the GRID dataset, which uses a constrained vocabulary over, in this case, 15 speakers. Previously the GRID data has received limited attention because of the requirement to track the face and lips accurately. We overcome this via a novel linear predictor (LP) tracker, which we use to control an Active Appearance Model (AAM). By ignoring shape and/or appearance parameters from the AAM, we can quantify the effect of appearance and/or shape when lipreading. We find that shape alone is a useful cue for lipreading (which is consistent with human experiments). However, the incremental effect of shape on appearance is not significant, which implies that the inner appearance of the mouth contains more information than the shape.
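To make the shape/appearance comparison concrete, the following is a minimal Python sketch of how per-frame AAM coefficients could be selected as lipreading features. The function name, the array layout, and the per-stream normalisation are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def aam_features(shape_params, app_params, mode="combined"):
        """Select AAM parameter subsets as per-frame lipreading features.

        shape_params: (T, Ns) array of shape coefficients, one row per video frame
        app_params:   (T, Na) array of appearance coefficients
        mode:         "shape", "appearance", or "combined"
        """
        if mode == "shape":          # shape-only cue
            return shape_params
        if mode == "appearance":     # appearance-only cue
            return app_params
        if mode == "combined":
            # z-score each stream so neither dominates the concatenation
            zs = (shape_params - shape_params.mean(0)) / (shape_params.std(0) + 1e-8)
            za = (app_params - app_params.mean(0)) / (app_params.std(0) + 1e-8)
            return np.hstack([zs, za])
        raise ValueError(f"unknown mode: {mode}")

Training the same recogniser on each of the three feature sets in turn would then isolate the contribution of shape, of appearance, and of their combination.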
PDF: LanAVSP09 (1.31 MB), open access
