Decomposing Multilingual Representations: How Scale, Architecture, and Data Shape Functional Specialization
Conference proceeding · Open access · Peer reviewed

Zeqiang Wang, Xinyue Wu, Chenxi Li, Yuqi Wang, Zixi Chen, Jon Johnson and Suparna De
ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 17487-17491
2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Barcelona, Spain, 03/05/2026–08/05/2026)
21/04/2026

Abstract

Keywords: Neural circuits · Multilingual Representation Learning · Interpretability

How Large Multilingual Models (LMMs) separate linguistic form from semantic content remains largely opaque. This paper introduces a framework for dissecting their internal representations, revealing a phenomenon we term Functional Specialization: the emergence of distinct neural circuits for language-specific form versus language-agnostic semantics. Through extensive experiments on the E5 and Qwen series, we show that this specialization is governed by three factors. Architecture dictates the core strategy: encoders adopt specialization breadth (many features in a staged workflow), while decoders pursue specialization depth (few, high-purity features). Scale primarily drives neural efficiency, enabling robust separation with fewer circuits. Finally, high-clarity data acts as a catalyst, inducing sophisticated mechanisms even in smaller models. Our findings chart a path toward more controllable and interpretable multilingual AI.
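
This record does not spell out the paper's probing procedure, so the sketch below is only a rough, hedged illustration of what separating "language-specific form" from "language-agnostic semantics" can look like in practice: it scores each hidden dimension of a multilingual encoder by how much of its variance lies between languages rather than within them. Everything here is an assumption chosen for illustration, not the authors' method: the checkpoint (intfloat/multilingual-e5-base), the mean pooling, the toy parallel sentences, and the between/within variance-ratio heuristic itself.

```python
# Illustrative sketch only -- NOT the paper's method. Scores each hidden
# dimension of a multilingual encoder by how language-selective it is,
# using parallel (translation-equivalent) sentences.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "intfloat/multilingual-e5-base"  # assumed checkpoint; any E5 variant works

# Tiny toy sample of parallel sentences: same meaning, different languages.
sentences = {
    "en": ["The weather is nice today.", "I am reading a book."],
    "de": ["Das Wetter ist heute schön.", "Ich lese ein Buch."],
    "fr": ["Il fait beau aujourd'hui.", "Je lis un livre."],
}

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL).eval()

@torch.no_grad()
def embed(texts):
    """Mean-pooled last-layer hidden states, one vector per sentence."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state      # (B, T, D)
    mask = batch["attention_mask"].unsqueeze(-1)   # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)    # (B, D)

reps = {lang: embed(texts) for lang, texts in sentences.items()}
X = torch.cat(list(reps.values()))                 # (n_langs * B, D)

# Per-dimension ratio of between-language variance to total variance:
# high-ratio dimensions vary mostly *across* languages (form-sensitive),
# low-ratio dimensions vary mostly *within* a language (closer to
# language-agnostic semantics).
lang_means = torch.stack([r.mean(0) for r in reps.values()])  # (n_langs, D)
between = lang_means.var(0)
total = X.var(0) + 1e-8
selectivity = between / total                                  # (D,)

top = selectivity.topk(10).indices.tolist()
print("Most language-selective dimensions:", top)
```

Dimensions with a ratio near 1 track language identity (form), while low-ratio dimensions respond similarly across translations and sit closer to shared semantics; the paper's circuit-level analysis is presumably far more fine-grained than this variance heuristic.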

Author's Accepted Manuscript · Open Access · CC BY 4.0
