What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing

Oline Ranum; Simon Hadfield; Richard Bowden

doi:10.48550/arxiv.2606.08056

Preprint

06/06/2026

Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language

Abstract

Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling non-lexical and productive constructions. One comparatively tractable instance is spatial indexing: pointing gestures that assign discourse entities to spatial loci for subsequent co-reference, which lexicon-centric objectives largely fail to capture. We present a targeted evaluation of indexing in Sign Language Recognition, showing that despite comprising 10-15% of signing content, indexing is poorly recovered. We introduce a framework for training and evaluating indexing experts, establishing a baseline for index-aware sign language modeling. Our approach decomposes spatial reference resolution into index detection and discourse entity linking. The resulting mention representations enable automatic annotation and non-lexical structure modeling, and serve as an auxiliary indexing expert that augments a frozen SLR model at inference time.

Details

Title: What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing
Creators: Oline Ranum; Simon Hadfield; Richard Bowden
Identifiers: 991138082902346
Academic Unit: School of Computer Science & Electronic Engineering
Language: English
Resource Type: Preprint
