Interpolating Room Impulse Responses (RIRs) at unmeasured locations within a space is a fundamental challenge in room acoustics, critical for applications such as volumetric noise cancellation, auralization, and spatial audio rendering. Traditional methods, including linear interpolation in the time or frequency domain and basis decompositions such as plane-wave and spherical-harmonic decomposition, often fail to capture the non-linear variations in arrival times and energy decay caused by complex propagation paths and occlusions. In this work, we propose a neural network that takes receiver coordinates as inputs and manipulates the time shifts between RIRs. These time-axis manipulations capture time-domain misalignments across spatial locations, enabling the network to model spatially varying acoustic delays and reflections. By aligning and blending neighboring RIRs using the modeled time shifts, we interpolate RIRs at unseen positions with higher temporal and spectral fidelity. The approach focuses on preserving early-reflection structure and perceptual similarity, offering a data-driven alternative for accurate RIR field reconstruction in complex acoustic environments.
Fernando et al. studied this question.
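The align-and-blend idea described in the abstract can be illustrated with a minimal sketch. This is not the proposed network: here a single global time shift between two neighboring RIRs is estimated by cross-correlation, as a stand-in for the learned, spatially varying shifts, and the function names (`fractional_shift`, `estimate_lag`, `aligned_blend`) and the toy Gaussian "RIRs" are assumptions made only for illustration.

```python
import numpy as np

def fractional_shift(x, shift):
    """Delay x by `shift` samples (may be fractional) using an FFT phase ramp.
    The shift is circular, so x should have silence at both ends."""
    n = len(x)
    freqs = np.fft.rfftfreq(n)  # frequency in cycles per sample
    spectrum = np.fft.rfft(x) * np.exp(-2j * np.pi * freqs * shift)
    return np.fft.irfft(spectrum, n=n)

def estimate_lag(a, b):
    """Integer number of samples by which b lags a (cross-correlation peak).
    Assumption: stands in for the time shifts a model would predict."""
    xcorr = np.correlate(b, a, mode="full")
    return int(np.argmax(xcorr)) - (len(a) - 1)

def aligned_blend(h_a, h_b, w):
    """Interpolate between two RIRs at weight w in [0, 1] (0 -> h_a, 1 -> h_b).
    Both responses are first shifted to the interpolated arrival time, then mixed."""
    lag = estimate_lag(h_a, h_b)
    a_shifted = fractional_shift(h_a, w * lag)           # delay h_a toward the target
    b_shifted = fractional_shift(h_b, -(1.0 - w) * lag)  # advance h_b toward the target
    return (1.0 - w) * a_shifted + w * b_shifted

# Toy example: one smooth "direct sound" pulse arriving at sample 40 and at sample 60.
n = np.arange(256)
h_a = np.exp(-0.5 * ((n - 40) / 2.0) ** 2)
h_b = np.exp(-0.5 * ((n - 60) / 2.0) ** 2)

w = 0.5                              # receiver halfway between the two measurements
naive = (1.0 - w) * h_a + w * h_b    # plain linear interpolation in the time domain
aligned = aligned_blend(h_a, h_b, w)

print("naive peak index:", int(np.argmax(naive)), "peak value:", round(float(naive.max()), 3))
print("aligned peak index:", int(np.argmax(aligned)), "peak value:", round(float(aligned.max()), 3))
```

In this toy case the naive blend keeps two half-amplitude arrivals at the original delays, while the aligned blend produces a single full-amplitude arrival midway between them. A real RIR contains many reflections whose delays change differently with position, which is why a single global lag is only a rough approximation of the time-axis manipulation the abstract describes.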