What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing
Preprint

06/06/2026

Abstract

Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling non-lexical and productive constructions. One comparatively tractable instance is spatial indexing: pointing gestures that assign discourse entities to spatial loci for subsequent co-reference, which lexicon-centric objectives largely fail to capture. We present a targeted evaluation of indexing in Sign Language Recognition (SLR), showing that despite comprising 10-15% of signing content, indexing is poorly recovered. We introduce a framework for training and evaluating indexing experts, establishing a baseline for index-aware sign language modeling. Our approach decomposes spatial reference resolution into index detection and discourse entity linking. The resulting mention representations enable automatic annotation and non-lexical structure modeling, and serve as an auxiliary indexing expert that augments a frozen SLR model at inference time.
