









Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer

Neural style transfer (NST) has seen tremendous growth within the deep learning community and spans a wide spectrum of applications, e.g., converting time of day, mapping between artwork and photos, transferring facial expressions, and transforming animal species.
The original formulation of Gatys et al. requires a new optimization process for each transfer performed, making it impractical for many real-world scenarios. In addition, the method relies heavily on pre-trained networks that were trained for classification tasks, and such networks have recently been shown to be biased toward texture rather than structure.
To overcome the first limitation, deep neural networks have been proposed to approximate the lengthy optimization procedure in a single feed-forward step, thereby making the models amenable to real-time processing. Notable works in this regard are those of Johnson et al. and Ulyanov et al.
When a neural network is used to overcome the computational burden of the method of Gatys et al., a separate model must be trained for every desired style image, owing to the limited capacity of conventional models to encode multiple styles into the weights of the network. This greatly narrows the applicability of the method in use cases where the concept of style cannot be defined a priori and needs to be inferred from examples. Recent works have attempted to separate style and content in feature space (latent space) to allow generalization to a style characterized by an additional input image or set of images. The most widespread work in this family is AdaIN. The current state of the art allows controlling the amount of stylization applied, interpolating between different styles, and using masks to convert different regions of an image into different styles (AdaIN and Avatar-Net).
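
To make the idea of separating style and content in feature space concrete, the following is a minimal NumPy sketch of adaptive instance normalization as commonly described for AdaIN: the per-channel mean and standard deviation of the content features are replaced with those of the style features. The function names (adain, stylize), the (C, H, W) feature layout, the eps constant, and the alpha blending parameter are illustrative assumptions rather than details taken from this paper, and in practice the operation is applied to encoder activations rather than raw arrays.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Align per-channel statistics of content features to style features.

    Both inputs are assumed to be feature maps of shape (C, H, W); the
    spatial sizes of content and style may differ.
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps  # eps avoids division by zero
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    # Normalize the content features, then rescale and shift with style statistics.
    return s_std * (content - c_mean) / c_std + s_mean


def stylize(content, style, alpha=1.0):
    """Blend original and re-normalized features (hypothetical helper).

    alpha = 0 keeps the content features unchanged, alpha = 1 applies the
    full statistic transfer; intermediate values control stylization strength.
    """
    return (1.0 - alpha) * content + alpha * adain(content, style)
```

Under these assumptions, calling stylize(content_features, style_features, alpha=0.5) would yield features halfway between the original content features and the fully re-normalized ones, which is one simple way to control the amount of stylization applied.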


