Beyond Surface Similarity: A Riemannian Hierarchical Ranking Framework for Sociological Concept Equivalence
Conference proceeding   Open access   Peer reviewed

Zeqiang Wang, Wing Yan Li, Jon Johnson, Nishanth Ramakrishna Sastry and Suparna De
Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM ’25), pp. 3220-3229
ACM International Conference on Information and Knowledge Management, 34th (Seoul, Republic of Korea, 10/11/2025–14/11/2025)
10/11/2025

Abstract

CCS Concepts: Computing methodologies → Natural language processing; Applied computing → Sociology; Information systems → Similarity measures
Keywords: Information Retrieval, Conceptual Comparison, Computational Sociology
Vocabularies such as the European Language Social Science Thesaurus (ELSST) and the CLOSER ontology are foundational taxonomies that capture the core social science concepts underlying large-scale longitudinal social science surveys. However, standard text embeddings rely on surface similarity and often fail to capture the complex hierarchical and relational structures of sociological concepts. In this work, we propose a framework that models these nuances by adapting a large language model based text embedding model with a learnable diagonal Riemannian metric. This metric allows for a flexible geometry in which dimensions can be scaled to reflect semantic importance. Additionally, we introduce a Hierarchical Ranking Loss with dynamic margins as the sole training objective to enforce the multi-level hierarchical constraints from ELSST within the Riemannian space (e.g., distinguishing 'self' from narrower, broader, or related concepts, and all of these from 'unrelated' ones). For instance, a specific concept like 'social stratification' should be embedded closer to 'social inequality' (its broader, related concept) and substantially further from an 'unrelated' concept like 'particle physics'. Lastly, we show that our parameter-efficient approach significantly outperforms strong contrastive learning and hyperbolic embedding baselines on hierarchical concept retrieval and classification tasks using the ELSST and CLOSER datasets. Visualizations confirm that the learned embedding space exhibits a clear hierarchical structure. Our work offers a more accurate and geometrically informed method for representing complex sociological constructs.
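The two ingredients named in the abstract, a diagonal metric and a margin-based hierarchical ranking loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `log_scale` parameterization, the integer ordering of relation levels, and the rule that the margin grows linearly with the level gap are all assumptions made for illustration. The abstract does not specify how the dynamic margins are computed or how narrower, broader, and related concepts are ordered relative to one another.

```python
import numpy as np

# Hypothetical relation levels (an assumption): smaller means semantically closer.
SELF, NARROWER, BROADER, RELATED, UNRELATED = 0, 1, 2, 3, 4


def diagonal_metric_distance(x, y, log_scale):
    """Distance under a diagonal metric G = diag(exp(log_scale)).

    Exponentiating keeps every per-dimension weight positive, so G is a
    valid metric; log_scale stands in for the learnable parameter.
    """
    weights = np.exp(log_scale)
    diff = x - y
    return float(np.sqrt(np.sum(weights * diff * diff)))


def hierarchical_ranking_loss(anchor, candidates, levels, log_scale, base_margin=0.1):
    """Average hinge loss over all pairs of candidates at different levels.

    For each pair, the candidate with the closer level should be nearer to
    the anchor by at least a margin. The margin scales with the level gap,
    which is one possible reading of "dynamic margins" (an assumption).
    """
    dists = [diagonal_metric_distance(anchor, c, log_scale) for c in candidates]
    total, pairs = 0.0, 0
    for i, level_i in enumerate(levels):
        for j, level_j in enumerate(levels):
            if level_i < level_j:
                margin = base_margin * (level_j - level_i)
                total += max(0.0, dists[i] - dists[j] + margin)
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy usage with made-up 3-dimensional embeddings.
anchor = np.zeros(3)
candidates = [np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([5.0, 0.0, 0.0])]
levels = [SELF, NARROWER, UNRELATED]
log_scale = np.zeros(3)  # identity metric before any training
loss = hierarchical_ranking_loss(anchor, candidates, levels, log_scale)
```

In a real training setup the embeddings would come from the adapted text embedding model and `log_scale` would be updated by gradient descent in an autodiff framework; the sketch only shows how the distance and the ranking constraint fit together.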
Author's Accepted Manuscript Open Access
