Abstract
We describe a semi-automatic video logging system capable of annotating frames with semantic metadata describing the objects present. The system learns incrementally from visual examples provided interactively by the logging operator, offering increased automation over time. Transfer learning is initially used to bootstrap the system with relevant visual examples from ImageNet. We adapt the hard-assignment Bag of Words strategy for object recognition to our interactive use context, and show that transfer learning significantly reduces the degree of interaction required.
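
To illustrate the hard-assignment Bag of Words representation mentioned above, the following is a minimal Python sketch rather than the system's actual implementation. Each local descriptor votes for its single nearest codeword, and the votes are accumulated into a normalised histogram. The function name `bow_histogram`, the toy codebook and descriptors, the Euclidean metric, and the L1 normalisation are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def bow_histogram(descriptors, codebook):
    """Hard-assignment BoW: each descriptor votes for its nearest codeword only.

    Hypothetical helper for illustration; descriptors and codewords are
    equal-length sequences of floats.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword closest to this descriptor (assumed Euclidean).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    # L1-normalise so frames with different descriptor counts are comparable.
    return [c / total for c in counts]


# Toy 2-D data (assumed values, not from the paper).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.0, 5.0)]
hist = bow_histogram(descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

In contrast to soft assignment, where a descriptor spreads its vote over several nearby codewords, each descriptor here contributes to exactly one histogram bin.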