SignRep: Enhancing Self-Supervised Sign Representations

Ryan Cameron Wong; Necati Cihan Camgöz; Richard Bowden

doi:10.48550/arXiv.2503.08529

Conference paper

Open access

2025 International Conference on Computer Vision (ICCV 2025)

Computer Vision Foundation / Institute of Electrical and Electronics Engineers (IEEE)

International Conference on Computer Vision (ICCV 2025) (Honolulu, Hawai'i, 19/10/2025–23/10/2025)

25/06/2025

DOI: https://doi.org/10.48550/arXiv.2503.08529

Abstract

Sign language representation learning presents unique challenges due to the complex spatio-temporal nature of signs and the scarcity of labelled datasets. Existing methods often rely either on models pre-trained on general visual tasks, which lack sign-specific features, or on complex multimodal and multi-branch architectures. To bridge this gap, we introduce a scalable, self-supervised framework for sign representation learning. We leverage important inductive (sign) priors during the training of our RGB model, using simple but important skeleton-based cues while pretraining a masked autoencoder. These sign-specific priors, alongside feature regularization and an adversarial style-agnostic loss, provide a powerful backbone. Notably, our model does not require skeletal keypoints during inference, avoiding the limitations of keypoint-based models during downstream tasks. When finetuned, our model achieves state-of-the-art performance for sign recognition on the WLASL, ASL-Citizen and NMFs-CSL datasets, using a simpler architecture and only a single modality. Beyond recognition, our frozen model excels in sign dictionary retrieval and sign translation, surpassing standard MAE pretraining and skeletal-based representations in retrieval. It also reduces computational costs for training existing sign translation models while maintaining strong performance on Phoenix2014T, CSL-Daily and How2Sign.
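
As a rough illustration of what dictionary retrieval with a frozen model can look like, the sketch below ranks dictionary entries by cosine similarity between fixed feature vectors. It is a generic nearest-neighbour example, not the authors' implementation: the function names `cosine` and `retrieve`, the three-dimensional toy vectors, and the gloss labels are all hypothetical, and the abstract does not state which similarity measure the paper uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, dictionary, k=1):
    """Return the k dictionary labels whose features are closest to the query.

    `dictionary` maps a sign label to a feature vector that a frozen
    encoder would have produced (hypothetical toy values here).
    """
    ranked = sorted(
        dictionary,
        key=lambda label: cosine(query, dictionary[label]),
        reverse=True,
    )
    return ranked[:k]

# Toy features standing in for frozen-encoder outputs (assumed values).
dictionary = {
    "BOOK": [1.0, 0.0, 0.0],
    "DRINK": [0.0, 1.0, 0.0],
    "HELLO": [0.0, 0.0, 1.0],
}
query = [0.1, 0.9, 0.2]

print(retrieve(query, dictionary, k=2))  # ['DRINK', 'HELLO']
```

Because the encoder stays frozen in this setting, the dictionary features can be computed once and reused for every query.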

Files and links (3)

pdf

RyanWong_signrep (13.03 MB)

Author's Accepted Manuscript Open Access CC BY V4.0

url

https://iccv.thecvf.com/

Event Website Conference website

url

https://openaccess.thecvf.com/content/ICCV2025/html/Wong_SignRep_Enhancing_Self-Supervised_Sign_Representations_ICCV_2025_paper.html

Published (Version of record)

Metrics

604 File views/downloads

36 Record Views

Details

Title: SignRep: Enhancing Self-Supervised Sign Representations
Creators: Ryan Cameron Wong (Author) - University of Surrey, School of Computer Science & Electronic Engineering
Necati Cihan Camgöz (Author) - Meta Reality Labs
Richard Bowden (Author) - University of Surrey, School of Computer Science & Electronic Engineering
Publication Details: 2025 International Conference on Computer Vision (ICCV 2025)
Conference: International Conference on Computer Vision (ICCV 2025) (Honolulu, Hawai'i, 19/10/2025–23/10/2025)
Publisher: Computer Vision Foundation / Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 17
Date accepted for publication: 25/06/2025
Grants: EASIER: Intelligent Automatic Sign Language Translation, 101016982, Horizon 2020
SMILE II, CRSII5 193686, Swiss National Science Foundation (Switzerland, Bern) - FNS
IICT Flagship, PFFS-21-47, Innosuisse – Swiss Innovation Agency (Switzerland, Bern)
Grant note: This work was supported by SNSF project ‘SMILE II’ (CRSII5 193686), European Union’s Horizon2020 programme (‘EASIER’ grant agreement 101016982) and the Innosuisse IICT Flagship (PFFS-21-47).
Identifiers: 991013164802346
Copyright: Copyright © 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved. For the purpose of open access, the authors have applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.
Academic Unit: School of Computer Science & Electronic Engineering
Resource Type: Conference paper
