Finding the Right Words: Investigating Machine-Generated Video Description Quality Using a Corpus-Based Approach
Journal article   Open access  Peer reviewed


Sabine Braun and Kim Linda Starr
Journal of Audiovisual Translation
31/12/2019

Abstract

This paper examines first steps in identifying and compiling human-generated corpora for the purpose of determining the quality of computer-generated video descriptions. This is part of a study whose general ambition is to broaden the reach of accessible audiovisual content through semi-automation of its description for the benefit of both end-users (content consumers) and industry professionals (content creators). Working in parallel with machine-derived video and image description datasets created for the purposes of advancing computer vision research, such as Microsoft COCO (Lin et al., 2015) and TGIF (Li et al., 2016), we examine the usefulness of audio descriptive texts as a direct comparator. Cognisant of the limitations of this approach, we also explore alternative human-generated video description datasets, including bespoke content description. Our research forms part of the MeMAD (Methods for Managing Audiovisual Data) project, funded by the EU Horizon 2020 programme.

Keywords: computer vision, machine learning, accessibility, audiovisual content, audio description, content description, content retrieval, video description, audiovisual translation, MeMAD

URL: https://www.jatjournal.org/index.php/jat/article/view/103

Metrics

39 file views/downloads
64 record views
