Abstract
How Large Multilingual Models (LMMs) separate linguistic form from semantic content remains largely opaque. This paper introduces a framework for dissecting their internal representations, revealing a phenomenon we term Functional Specialization: the emergence of distinct neural circuits for language-specific form versus language-agnostic semantics. Through extensive experiments on the E5 and Qwen model series, we show that this specialization is governed by three factors. Architecture dictates the core strategy: encoders adopt specialization breadth (many features organized in a staged workflow), while decoders pursue specialization depth (few, high-purity features). Scale primarily drives neural efficiency, enabling robust separation with fewer circuits. Finally, high-clarity data acts as a catalyst, inducing sophisticated mechanisms even in smaller models. Our findings chart a path toward more controllable and interpretable multilingual AI.