Key points are not available for this paper at this time.
Image-generating machine learning models are typically trained with loss based on distance in the image space. This often leads to-smoothed results. We propose a class of loss functions, which we call deep similarity metrics (DeePSiM), that mitigate this problem. Instead of distances in the image space, we compute distances between image extracted by deep neural networks. This metric better reflects similarity of images and thus leads to better results. We show applications: autoencoder training, a modification of a variational, and inversion of deep convolutional networks. In all cases, the images look sharp and resemble natural images.
Dosovitskiy et al. (Mon,) studied this question.