Abstract Latent space-based facial attribute editing methods have gained popularity in applications such as digital entertainment, virtual avatar creation, and human-computer interaction systems due to their potential for efficient and flexible attribute manipulation, particularly for continuous edits. Among these, unsupervised latent space-based methods, which discover effective semantic vectors without relying on labeled data, have attracted considerable attention in the research community. However, existing methods still encounter difficulties in disentanglement, as manipulating a specific facial attribute may unintentionally affect other attributes, complicating fine-grained controllability. To address these challenges, we propose a novel framework designed to offer an effective and adaptable solution for unsupervised facial attribute editing, called Unsupervised Facial Attribute Controllable Editing (U-Face). The proposed method frames semantic vector learning as a subspace learning problem, where latent vectors are approximated within a lower-dimensional semantic subspace spanned by a semantic vector matrix. This formulation can also be equivalently interpreted from a projection-reconstruction perspective and further generalized into an autoencoder framework, providing a foundation that can support disentangled representation learning in a flexible manner. To improve disentanglement and controllability, we impose orthogonal non-negative constraints on the semantic vectors and incorporate attribute boundary vectors to reduce entanglement in the learned directions. Although these constraints make the optimization problem challenging, we design an alternating iterative algorithm, called Alternating Iterative Disentanglement and Controllability (AIDC), with closed-form updates and provable convergence under specific conditions. Extensive experiments on multiple pre-trained GAN generators demonstrate that U-Face shows improved disentanglement, controllability, and visual fidelity compared to existing state-of-the-art supervised and unsupervised baselines. Specifically, U-Face reduces inter-attribute correlation by an average of 5%-15%, and improves FID by 10%-20% and LPIPS by 5%-10%. Compared to recent diffusion-based methods (DDS, CDS, FPE), U-Face achieves comparable editing quality and reduces inference times, making it suitable for real-time applications across different GAN backbones and enabling large-scale interactive facial editing.
Liu et al. (Fri,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: