Spotter+GPT: Turning Sign Spottings into Sentences with LLMs
Conference proceeding · Open access · Peer reviewed

Ozge Mercanoglu Sincan and Richard Bowden
IVA Adjunct '25: Adjunct Proceedings of the 25th ACM International Conference on Intelligent Virtual Agents, Vol.In Press(In Press)
IVA: Intelligent Virtual Agents
IVA Adjunct ’25
25th ACM International Conference on Intelligent Virtual Agents (Berlin, Germany, 16/09/2025–19/09/2025)
30/09/2025

Abstract

Keywords: Sign Spotting; Sign Language Translation; Real-time; ChatGPT
CCS Concepts: Human-centered computing → Accessibility technologies

Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos. In this paper, we introduce a lightweight, modular SLT framework, Spotter+GPT, that leverages the power of Large Language Models (LLMs) and avoids heavy end-to-end training. Spotter+GPT breaks the SLT task down into two distinct stages. First, a sign spotter identifies individual signs within the input video. The spotted signs are then passed to an LLM, which transforms them into meaningful spoken language sentences. Spotter+GPT eliminates the need for SLT-specific training, significantly reducing computational cost and training time. The source code and pretrained weights of the Spotter are available online.
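The two-stage pipeline described in the abstract can be sketched as follows. This is an illustrative outline only, not the authors' released code: the function names, the stubbed spotter output, and the prompt wording are all assumptions; in the real system, stage 1 is a trained sign-spotting model and stage 2 is a call to an LLM such as ChatGPT.

```python
# Illustrative sketch of the Spotter+GPT two-stage pipeline.
# All names and the example glosses are hypothetical.

def spot_signs(video_frames):
    """Stage 1: a sign spotter scans the video and emits a gloss sequence.

    A stand-in result is returned here for illustration; the actual
    system uses a trained spotting model on the input frames.
    """
    return ["TOMORROW", "WEATHER", "RAIN"]

def build_prompt(glosses):
    """Stage 2 input: format the spotted glosses as an instruction
    asking the LLM to produce a fluent spoken-language sentence."""
    return (
        "Convert this sequence of sign language glosses into a fluent "
        "English sentence: " + " ".join(glosses)
    )

glosses = spot_signs(video_frames=None)  # stage 1: video -> glosses
prompt = build_prompt(glosses)           # stage 2: glosses -> LLM prompt
print(prompt)
# The prompt would then be sent to an LLM, whose reply is the translation.
```

Because the two stages are decoupled, neither requires SLT-specific training: the spotter is trained only for sign recognition, and the LLM is used off the shelf.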

PDF: Spotter_GPT (829.71 kB) — Author's Accepted Manuscript, CC BY 4.0, Open Access
Event website: https://iva.acm.org/2025/


