Spotter+GPT: Turning Sign Spottings into Sentences with LLMs
Conference proceeding · Open access · Peer reviewed

Ozge Mercanoglu Sincan and Richard Bowden
IVA Adjunct '25: Adjunct Proceedings of the 25th ACM International Conference on Intelligent Virtual Agents, Vol.In Press(In Press)
IVA: Intelligent Virtual Agents
IVA Adjunct ’25
25th ACM International Conference on Intelligent Virtual Agents (Berlin, Germany, 16/09/2025–19/09/2025)
30/09/2025

Abstract

Keywords: Sign Spotting; Sign Language Translation; Real-time; ChatGPT
CCS Concepts: Human-centered computing → Accessibility technologies

Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos. In this paper, we introduce a lightweight, modular SLT framework, Spotter+GPT, that leverages the power of Large Language Models (LLMs) and avoids heavy end-to-end training. Spotter+GPT breaks the SLT task down into two distinct stages. First, a sign spotter identifies individual signs within the input video. The spotted signs are then passed to an LLM, which transforms them into meaningful spoken language sentences. Spotter+GPT eliminates the need for SLT-specific training, significantly reducing computational cost and training time. The source code and pretrained weights of the Spotter are available online.
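The two-stage pipeline described in the abstract can be sketched as follows. This is an illustrative outline only, not the authors' released code: the function names, the stubbed spotter output, and the prompt wording are all assumptions; in the real system, stage 1 is a trained sign-spotting model and stage 2 is a call to an LLM such as ChatGPT.

```python
# Illustrative sketch of the Spotter+GPT two-stage pipeline.
# All names and the example glosses are hypothetical.

def spot_signs(video_frames):
    """Stage 1: a sign spotter scans the video and emits a gloss sequence.

    A stand-in result is returned here for illustration; the actual
    system uses a trained spotting model on the input frames.
    """
    return ["TOMORROW", "WEATHER", "RAIN"]

def build_prompt(glosses):
    """Stage 2 input: format the spotted glosses as an instruction
    asking the LLM to produce a fluent spoken-language sentence."""
    return (
        "Convert this sequence of sign language glosses into a fluent "
        "English sentence: " + " ".join(glosses)
    )

glosses = spot_signs(video_frames=None)  # stage 1: video -> glosses
prompt = build_prompt(glosses)           # stage 2: glosses -> LLM prompt
print(prompt)
# The prompt would then be sent to an LLM, whose reply is the translation.
```

Because the two stages are decoupled, neither requires SLT-specific training: the spotter is trained only for sign recognition, and the LLM is used off the shelf.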

PDF: Spotter_GPT (829.71 kB) — Author's Accepted Manuscript, CC BY 4.0, Open Access
Event website: https://iva.acm.org/2025/


