Abstract
This paper presents a method for sound field interpolation and extrapolation from a spatially sparse set of binaural room impulse responses (BRIRs). The method focuses on the direct component and early reflections, and is framed as an inverse problem that seeks the weight signals of an acoustic model based on the time-domain equivalent source (TES). Once the weight signals are estimated, the continuous sound field can be reconstructed and BRIRs can be synthesised at any position and orientation in a source-free volume bounded by the TESs. Three regularisation functions are tested: the L1-norm, the sum of L2-norms, and Tikhonov regularisation. The L1-norm, which imposes spatio-temporal sparsity, performs best. In simulations, the method yields a lower normalised mean squared error (NMSE) than a nearest-neighbour approach, which renders with the spatially closest measured BRIR. The results show good temporal alignment of the direct sound and reflections, even when a non-individualised head-related impulse response (HRIR) is used for system inversion and BRIR synthesis. Performance is also assessed with the predicted binaural coloration (PBC) model, an objective measure of perceived coloration, which reveals a good perceptual match between the interpolated/extrapolated and true BRIRs.