What question did this study set out to answer?

The study asks whether implicit neural representations (INRs) can be analyzed and enhanced to perform effective self-supervised image denoising from a single noisy image, without paired clean-noisy data.

February 5, 2026 | Open Access

Enhancing implicit neural representations for zero-shot image denoising: Depth, structured sparsity, and ensemble modeling

Key Points

A systematic analysis of the structure and limitations of INRs for zero-shot denoising.
Architectural modifications, namely sine activation layers and Fourier feature-based inputs, to better capture high-frequency image structures.
An investigation of network topology showing that deeper, narrower architectures outperform standard INR configurations.
Structured sparsity to regularize the model, mitigate overfitting during optimization, and stabilize convergence.
A bagging-based ensemble of independently trained sparse INRs, aggregated to reduce variance and improve reconstruction quality.
Improved stability and performance of INRs in zero-shot image denoising settings.
Reduced overfitting, leading to improved denoising outcomes.
Higher peak signal-to-noise ratio (PSNR) with the ensemble approach than with individual INR models.
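
The last point compares an ensemble against its individual members using PSNR. The following is a minimal sketch of that comparison, not the study's code. It assumes pixel values in [0, 1], uses a pixel-wise mean as the aggregation rule, and stands in for each trained sparse INR with a toy "clean signal plus independent Gaussian noise" output. The function names `psnr` and `ensemble_mean` and all numeric settings are hypothetical.

```python
import math
import random


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def ensemble_mean(predictions):
    """Pixel-wise average of several reconstructions of the same image."""
    n = len(predictions)
    return [sum(pixels) / n for pixels in zip(*predictions)]


# Toy stand-in (assumption): each ensemble member returns the clean signal
# plus its own independent error, mimicking independently trained models.
rng = random.Random(0)
clean = [0.5 + 0.4 * math.sin(2 * math.pi * i / 64) for i in range(256)]
members = [[c + rng.gauss(0.0, 0.05) for c in clean] for _ in range(8)]

averaged = ensemble_mean(members)
member_psnrs = [psnr(clean, m) for m in members]
ensemble_psnr = psnr(clean, averaged)

print("best single member PSNR (dB):", max(member_psnrs))
print("ensemble PSNR (dB):", ensemble_psnr)
```

In this toy setting the gain comes purely from averaging out independent errors. Real ensemble members share the same noisy input, so their errors are correlated and the improvement would typically be smaller.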

Abstract

Implicit neural representations (INRs) have emerged as a highly active and influential research direction for modeling signals such as images, audio, and 3D scenes. Their ability to represent coordinate-based functions using multi-layer perceptrons (MLPs) makes them a compelling framework for various reconstruction tasks. A natural question is whether INRs can also be leveraged for self-supervised image denoising, where only a single noisy image is available and no paired clean-noisy dataset exists. This setting is particularly relevant to many practical applications in biomedicine and astronomy, where training data are scarce and clean samples are often infeasible to obtain. However, directly employing INRs for denoising is challenging and is known to suffer from severe overfitting: owing to their high capacity, INRs can easily fit both the underlying signal and the noise, which undermines their denoising performance. In this thesis, we systematically analyze the structure and limitations of INRs for self-supervised (zero-shot) denoising and introduce several enhancements to improve their performance and stability. We integrate architectural modifications such as sine activation layers and Fourier feature-based inputs to enhance their ability to capture high-frequency image structures. We also investigate network topology and show that deeper, narrower architectures outperform standard INR configurations. To mitigate overfitting during optimization, we introduce structured sparsity, which regularizes the model and stabilizes convergence. Finally, we propose a bagging-based ensemble of sparse INRs, in which independently trained models are aggregated to reduce variance and improve reconstruction quality. Together, these contributions form a stable and effective INR-based framework for self-supervised image denoising that achieves higher PSNR than individual INR models.
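
To make the architectural ingredients named in the abstract concrete, here is a minimal forward-pass sketch of a coordinate MLP with Fourier feature inputs, sine activations, a deep and narrow layout, and unit-level masking. It is an illustration under stated assumptions, not the thesis implementation: the abstract does not define its form of structured sparsity, so dropping whole hidden units with a random mask is an assumption, as are the octave-spaced frequencies, the initialization, and every name and default (`fourier_features`, `sine_layer`, `build_inr`, `evaluate`, `width=16`, `depth=6`, `keep_fraction=0.5`, `w0=1.0`). Training (fitting the network to the noisy image) is omitted.

```python
import math
import random


def fourier_features(x, y, num_freqs=4):
    """Map a 2D coordinate in [0, 1]^2 to sin/cos features at octave-spaced frequencies."""
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        for coord in (x, y):
            feats.append(math.sin(freq * coord))
            feats.append(math.cos(freq * coord))
    return feats  # length 4 * num_freqs


def init_layer(rng, fan_in, fan_out):
    """Uniform init in [-1/sqrt(fan_in), 1/sqrt(fan_in)] (illustrative choice)."""
    bound = 1.0 / math.sqrt(fan_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)] for _ in range(fan_out)]
    biases = [0.0] * fan_out
    return weights, biases


def sine_layer(inputs, weights, biases, keep_mask, w0=1.0):
    """Sine-activated dense layer; units whose mask entry is False output zero."""
    outputs = []
    for row, b, keep in zip(weights, biases, keep_mask):
        if not keep:
            outputs.append(0.0)  # assumed structured sparsity: the whole unit is dropped
            continue
        pre = sum(w * v for w, v in zip(row, inputs)) + b
        outputs.append(math.sin(w0 * pre))
    return outputs


def build_inr(rng, num_freqs=4, width=16, depth=6, keep_fraction=0.5):
    """Build a deep, narrow coordinate MLP with one random unit mask per hidden layer."""
    hidden = []
    fan_in = 4 * num_freqs
    n_keep = max(1, round(keep_fraction * width))
    for _ in range(depth):
        weights, biases = init_layer(rng, fan_in, width)
        mask = [True] * n_keep + [False] * (width - n_keep)
        rng.shuffle(mask)
        hidden.append((weights, biases, mask))
        fan_in = width
    output = init_layer(rng, fan_in, 1)  # linear read-out to one intensity value
    return hidden, output


def evaluate(inr, x, y, num_freqs=4):
    """Predict the intensity at coordinate (x, y); num_freqs must match build_inr."""
    hidden, (out_weights, out_biases) = inr
    h = fourier_features(x, y, num_freqs)
    for weights, biases, mask in hidden:
        h = sine_layer(h, weights, biases, mask)
    return sum(w * v for w, v in zip(out_weights[0], h)) + out_biases[0]


rng = random.Random(0)
inr = build_inr(rng)
value = evaluate(inr, 0.25, 0.75)
print("predicted intensity at (0.25, 0.75):", value)
```

In practice such a network would be written in an automatic-differentiation framework and fitted to the noisy pixels by minimizing a reconstruction loss. A bagging-style ensemble could then be formed by building several such networks with different seeds and masks and averaging their outputs.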
