Robust multimodal crowd counting with modality reconfigurability
Journal article   Peer reviewed


Penghui Shao, Chaoqun Ma, Yi Zheng, Ferrante Neri, Yang Meng, Anyong Qing and Yang Wang
Integrated Computer-Aided Engineering
23/03/2026

Abstract

Subjects: Computer Science; Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Engineering; Engineering, Multidisciplinary; Science & Technology; Technology
Crowd counting aims to estimate the number of individuals in images, and the use of multimodal data has been shown to significantly enhance counting accuracy. However, such approaches are highly sensitive to the loss or corruption of data from any single modality, leading to severe performance degradation. To address this limitation, a new problem setting, Modality-Reconfigurable Crowd Counting, is introduced, in which a model is required to maintain robust performance even when one of the input modalities (e.g., RGB or thermal) is perturbed or entirely unavailable. Modality reconfigurability is achieved through effective cross-modal information transfer, enabled by a Feature Patches Generator that leverages Margin Ranking Loss across multiple network layers to align and transfer discriminative features between modalities. Additionally, a Negative Knowledge Transfer Prevention module is incorporated to suppress misleading or detrimental cross-modal signals. State-of-the-art performance is demonstrated on RGB-T crowd counting benchmarks, with consistent accuracy maintained under both complete and degraded modality conditions.
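To make the Margin Ranking Loss idea in the abstract concrete, the sketch below shows how a ranking margin can be applied per network layer to pull matched RGB-thermal feature pairs closer than mismatched ones. This is not the authors' implementation: the function names, the cosine-similarity score, and the choice of mismatched pair are illustrative assumptions.

```python
def cosine(u, v):
    # Cosine similarity between two feature vectors (illustrative score).
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def margin_ranking_loss(pos, neg, margin=0.2):
    # Standard margin ranking loss: penalize unless the matched-pair score
    # exceeds the mismatched-pair score by at least `margin`.
    return max(0.0, margin - (pos - neg))

def multilayer_alignment_loss(rgb_feats, thermal_feats, margin=0.2):
    # rgb_feats / thermal_feats: one feature vector per network layer,
    # standing in for the multi-layer alignment described in the abstract.
    total = 0.0
    for layer, (r, t) in enumerate(zip(rgb_feats, thermal_feats)):
        pos = cosine(r, t)  # matched cross-modal pair at this layer
        # Mismatched pair drawn from another layer (assumed negative sampling).
        t_neg = thermal_feats[(layer + 1) % len(thermal_feats)]
        neg = cosine(r, t_neg)
        total += margin_ranking_loss(pos, neg, margin)
    return total / len(rgb_feats)
```

When the matched pairs already dominate the mismatched ones by the margin, the loss is zero, so only poorly aligned layers contribute gradient pressure in a real training setup.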
URL: https://doi.org/10.1177/10692509261433298
Published (Version of record), open access
