Interpretable Long-term Action Quality Assessment

Xu Dong; Xinran Liu; Wanqing Li; Anthony Adeyemi-Ejeye; Andrew Gilbert

Conference proceeding

Open access

Peer reviewed

British Machine Vision Conference, 35 (Glasgow, 25/11/2024–28/11/2024)

2025

Abstract

Long-term Action Quality Assessment (AQA) evaluates how well activities are executed in videos. However, video length makes fine-grained interpretability difficult: current AQA methods typically produce a single score by averaging clip features, which discards the detailed semantic meaning of individual clips. The complexity and diversity of actions in long-term videos exacerbate this interpretability problem. While query-based transformer networks offer promising long-term modelling capabilities, their interpretability in AQA remains unsatisfactory due to a phenomenon we term Temporal Skipping, where the model skips self-attention layers to prevent output degradation. To address this, we propose an attention loss function and a query initialization method to enhance performance and interpretability. Additionally, we introduce a weight-score regression module that approximates the scoring patterns observed in human judgments and replaces conventional single-score regression, making the resulting interpretation more reasonable. Our approach achieves state-of-the-art results on three real-world, long-term AQA benchmarks.
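
The contrast the abstract draws between averaging and weight-score regression can be illustrated with a small sketch. This is not the authors' implementation: the function names, the softmax normalisation of the weights, and the assumption that the final score is a weighted sum of per-clip scores are all hypothetical choices made here for illustration only.

```python
import math


def mean_pool_score(clip_scores):
    """Baseline aggregation: every clip contributes equally to the final score."""
    return sum(clip_scores) / len(clip_scores)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_score(clip_scores, weight_logits):
    """Hypothetical weight-score aggregation (an assumption, not the paper's module).

    Each clip has its own score and an unnormalised weight. The weights are
    normalised with softmax, and the final score is the weighted sum, so each
    clip's contribution can be inspected on its own.
    """
    weights = softmax(weight_logits)
    contributions = [w * s for w, s in zip(weights, clip_scores)]
    return sum(contributions), contributions


if __name__ == "__main__":
    scores = [6.0, 9.0, 3.0]   # made-up per-clip quality scores
    logits = [0.0, 2.0, -1.0]  # made-up per-clip importance logits
    total, parts = weighted_score(scores, logits)
    print("mean-pooled score:", mean_pool_score(scores))
    print("weighted score:", total)
    print("per-clip contributions:", parts)
```

With equal weight logits the weighted score reduces to the plain average, which is one way to see that per-clip weights only add information when the clips differ in importance.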

Files and links (1)

pdf

IntrpretAQActionPaper10.98 MB

Author's Accepted Manuscript Open Access

Details

Title: Interpretable Long-term Action Quality Assessment
Creators: Xu Dong - University of Surrey, Music and Media
Xinran Liu - University of Surrey, School of Computer Science & Electronic Engineering
Wanqing Li - University of Wollongong
Anthony Adeyemi-Ejeye - University of Surrey, Music and Media
Andrew Gilbert - University of Surrey, Music and Media
Conference: British Machine Vision Conference, 35 (Glasgow, 25/11/2024–28/11/2024)
Publisher: British Machine Vision Association
Publication Date: 2025
Date accepted for publication: 07/10/2024
Identifiers: 99925862102346
Academic Unit: Music and Media
Resource Type: Conference proceeding
