Abstract
This paper proposes a learnt, data-driven approach for accurate, real-time tracking of facial features using only intensity information. This is a non-trivial task, since the face is a highly deformable object with large textural variations and motion in certain regions. The proposed framework largely avoids the need for a priori design of feature trackers by automatically identifying the optimal visual support required for tracking a single facial feature point, which is essentially equivalent to automatically determining the visual context required for tracking. Tracking is achieved via biased linear predictors (LPs), which provide a fast and effective method for mapping pixel intensities into displacements of the tracked feature position. Multiple linear predictors are grouped into a rigid flock to increase robustness. To further improve tracking accuracy, a novel probabilistic selection method is used to identify the visual areas relevant for tracking a feature point. The selected flocks are then combined into a hierarchical multi-resolution LP model. Finally, we exploit a simple shape constraint to correct the occasional tracking failure of a minority of feature points. Experimental results show that this method performs more robustly and accurately than Active Appearance Models (AAMs) on example sequences ranging from SD quality to YouTube quality.
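To make the idea of a biased linear predictor concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a synthetic intensity surface in place of a video frame, a hand-picked set of three support pixels, and an ordinary least-squares fit over synthetic offsets; the names `image`, `SUPPORT`, `sample`, `solve`, `train_lp` and `predict` are hypothetical and introduced only for illustration. The predictor maps the intensity differences observed at the support pixels, plus a constant bias input, to the displacement that returns the tracker to the feature point.

```python
# Toy sketch of a biased linear predictor (LP); all names are hypothetical.

def image(x, y):
    """Synthetic smooth intensity surface standing in for a video frame (assumption)."""
    return x * x + 2.0 * y * y

# Support pixel offsets around the feature point (assumed, hand-picked).
SUPPORT = [(1, 0), (-1, 0), (0, 1)]

def sample(cx, cy):
    """Intensities of the support pixels when the tracker sits at (cx, cy)."""
    return [image(cx + sx, cy + sy) for sx, sy in SUPPORT]

def solve(a, b):
    """Solve a @ w = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [[m[r][n + k] / m[r][r] for k in range(len(b[0]))] for r in range(n)]

def train_lp(fx, fy, max_shift=2):
    """Fit LP weights and bias by least squares from synthetic integer offsets."""
    ref = sample(fx, fy)  # reference intensities at the true feature position
    rows, targets = [], []
    for tx in range(-max_shift, max_shift + 1):
        for ty in range(-max_shift, max_shift + 1):
            obs = sample(fx + tx, fy + ty)
            # Intensity differences, with a trailing 1.0 acting as the bias input.
            rows.append([o - r for o, r in zip(obs, ref)] + [1.0])
            # Target: the displacement that moves the tracker back to the feature.
            targets.append([-tx, -ty])
    k = len(rows[0])
    # Normal equations (X^T X) W = X^T Y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [[sum(r[i] * t[j] for r, t in zip(rows, targets)) for j in range(2)]
           for i in range(k)]
    return ref, solve(xtx, xty)

def predict(model, cx, cy):
    """Map the intensity differences seen at (cx, cy) to a corrective displacement."""
    ref, w = model
    feats = [o - r for o, r in zip(sample(cx, cy), ref)] + [1.0]
    return tuple(sum(f * w[i][j] for i, f in enumerate(feats)) for j in range(2))

model = train_lp(5.0, 3.0)
# Tracker displaced by (+1.5, -0.5); the prediction is approximately (-1.5, 0.5).
print(predict(model, 6.5, 2.5))
```

On this particular synthetic surface the relation between intensity differences and displacement happens to be exactly linear, so the fit recovers the offset almost perfectly; on real face imagery the mapping is only approximately linear, which is one motivation for grouping several predictors into a flock.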