Abstract
Human sketches have already proved their worth in various visual understanding
tasks (e.g., retrieval, segmentation, and image captioning). In this paper, we
reveal a new trait of sketches: they are also salient. This is intuitive
as sketching is a natural attentive process at its core. More specifically, we
aim to study how sketches can be used as a weak label to detect salient objects
present in an image. To this end, we propose a novel method that emphasises
how "salient objects" can be explained by hand-drawn sketches. To accomplish
this, we introduce a photo-to-sketch generation model that aims to generate
sequential sketch coordinates corresponding to a given visual photo through a
2D attention mechanism. Attention maps accumulated across the time steps give
rise to salient regions in the process. Extensive quantitative and qualitative
experiments support our hypothesis and show that our sketch-based saliency
detection model performs competitively with the state of the art.
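
The accumulation step mentioned above can be illustrated with a small sketch. This is not the paper's implementation: the function name `accumulate_attention`, the use of a plain sum over time steps, and the min-max normalisation are all assumptions made for illustration, and the toy attention maps are hand-written rather than produced by a model.

```python
import numpy as np


def accumulate_attention(attn_maps):
    """Sum per-step 2D attention maps and rescale the result to [0, 1].

    attn_maps: array-like of shape (T, H, W), one attention map per
    decoding time step (hypothetical input layout).
    """
    attn_maps = np.asarray(attn_maps, dtype=np.float64)
    acc = attn_maps.sum(axis=0)          # accumulate across time steps
    lo, hi = acc.min(), acc.max()
    if hi - lo < 1e-12:                  # flat map: nothing stands out
        return np.zeros_like(acc)
    return (acc - lo) / (hi - lo)        # min-max normalisation (assumed)


# Toy example: three time steps over a 2x2 spatial grid.
toy_maps = [
    [[0.7, 0.1], [0.1, 0.1]],
    [[0.6, 0.2], [0.1, 0.1]],
    [[0.1, 0.1], [0.1, 0.7]],
]
saliency = accumulate_attention(toy_maps)
```

In this toy case the top-left cell receives the most attention overall and so ends up with the highest saliency value, while the bottom-left cell, attended to least, ends up with the lowest.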