Generating Abstract Artwork with Python
Introduction
In this project we researched and created a portfolio of generative abstract artwork using GANs (generative adversarial networks), a deep learning technique, with Python and PyTorch. We also used a range of supporting technologies, such as MASSIVE and Weights & Biases, to produce the highest-quality artwork we could. All development was tracked with version control, using Git and GitLab.
With no pre-existing knowledge of GANs, our project began with intensive research to build a core understanding of how GANs work, specifically how they are used to generate images.
There are many great articles online covering the fundamentals of traditional GANs, their generators, discriminators, and corresponding loss functions. In this article, we hope to go further and offer a perspective on some recently published GAN models, and on how we incorporated a few of them into our own models.
Specifically, we will provide a high-level overview of ESRGANs (Enhanced Super-Resolution Generative Adversarial Networks) and DCGANs (Deep Convolutional Generative Adversarial Networks), and show how we integrated CANs (Creative Adversarial Networks) into our own models.
GAN Models
DCGAN
We came across DCGANs in the initial stages of our research. A DCGAN is an extension of a GAN that uses convolutional layers in the discriminator and convolutional-transpose layers in the generator (Inkawhich, 2022). The idea is to use convolutional neural networks (CNNs) to ensure a stable architecture through downsampling and upsampling; it uses neither max pooling nor fully connected layers.
In summary, a DCGAN works by (Hui, 2016):
Replacing all max pooling with strided convolutions
Using transposed convolutions for upsampling
Eliminating fully connected layers
Applying batch normalisation everywhere except the generator's output layer and the discriminator's input layer
Using ReLU activations in the generator
Using LeakyReLU activations in the discriminator.
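The checklist above can be sketched as a minimal DCGAN generator and discriminator in PyTorch. This is an illustrative sketch, not our exact architecture; it assumes 64x64 RGB images, a 100-dimensional latent vector, and the feature-map widths from the PyTorch DCGAN tutorial:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Upsamples a latent vector to a 64x64 RGB image with transposed convolutions."""
    def __init__(self, nz=100, ngf=64):
        super().__init__()
        self.main = nn.Sequential(
            # latent z -> (ngf*8) x 4 x 4, then doubled at each step
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            # no batch norm on the output layer
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)

class Discriminator(nn.Module):
    """Downsamples an image to a real/fake probability with strided convolutions
    (no max pooling, no fully connected layers)."""
    def __init__(self, ndf=64):
        super().__init__()
        self.main = nn.Sequential(
            # no batch norm on the input layer
            nn.Conv2d(3, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)
```

Note how every item of the checklist appears: strided `Conv2d`/`ConvTranspose2d` in place of pooling, no `Linear` layers, batch norm everywhere except the generator's output and the discriminator's input, and ReLU versus LeakyReLU in the two networks.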
ESRGAN
After producing our abstract artwork, we used a pre-existing ESRGAN model, created by Xintao Wang and his team in 2018, to increase its quality.
In essence, the ESRGAN model strives to achieve “consistently better visual quality” while also maintaining a “realistic and natural texture” (Wang et al., 2018, p. 1).
Most importantly, the model focuses solely on realism: it effectively asks “whether one image is more realistic than the other”, rather than asking, as a typical GAN does, “whether one image is real or fake” (Wang et al., 2018, p. 2).
To create the ESRGAN, the authors improve the adversarial and perceptual losses of the existing SRGAN (Super-Resolution Generative Adversarial Network) using techniques such as the relativistic GAN (“predict relative realness instead of absolute”), residual scaling, and smaller initialization (Wang et al., 2018, p. 4).
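The relativistic average discriminator behind the "more realistic than" question can be sketched as a pair of loss functions. This is a simplified reading of the loss terms in Wang et al. (2018); `real_logits` and `fake_logits` stand for the discriminator's raw (pre-sigmoid) outputs on real and generated batches, and the function names are ours:

```python
import torch
import torch.nn.functional as F

def relativistic_d_loss(real_logits, fake_logits):
    """Relativistic average discriminator loss: asks whether real images
    are *more realistic* than the average fake, not whether each image
    is real or fake in absolute terms."""
    d_real = real_logits - fake_logits.mean()
    d_fake = fake_logits - real_logits.mean()
    loss_real = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
    loss_fake = F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    return loss_real + loss_fake

def relativistic_g_loss(real_logits, fake_logits):
    """Symmetric generator loss: the generator tries to make its fakes
    look more realistic than the real batch on average."""
    d_real = real_logits - fake_logits.mean()
    d_fake = fake_logits - real_logits.mean()
    loss_real = F.binary_cross_entropy_with_logits(d_real, torch.zeros_like(d_real))
    loss_fake = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return loss_real + loss_fake
```

The key difference from a standard GAN loss is the subtraction of the opposing batch's mean logit before the cross-entropy, which is what turns an absolute real/fake judgment into a relative one.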
CAN
Whereas a traditional GAN discriminates between real and fake images, a CAN creates images by maximising deviation from established styles while minimising deviation from the art distribution.
It generates variability by adjusting its parameters to deviate from the dataset's style and design; the creativity and variability of the generated images are bounded by this mechanism (Elgammal et al., 2017).
Although the artwork produced through this model is not strictly novel, we adjusted its parameters to produce simple abstract artwork.
The generator receives two signals:
Signal 1: the discriminator's classification of whether the image is art or not (minimised). This pushes generated images to emulate art.
Signal 2: how confidently the discriminator can classify the generated art into established styles (maximised). This pushes generated images towards stylistic ambiguity.
The discriminator is the most important component during training, as it accounts for both the “art distribution and art styles/classes (K)” (Elgammal et al., 2017, p. 9). This is achieved through a K-way loss.
Deep Learning Engineer Experiences
ALINA ERMOLAEVA
My model is built upon DCGANs, taking abstract images and resizing them to 64x64 pixels. Other sizes, such as 128x128 pixels, were also tested, but they increased the model's training time immensely; 64x64 pixels was therefore the desired size, allowing more results to be generated while other hyperparameters were varied. Like any other GAN, this model implements a generator and a discriminator: the discriminator uses convolutional neural networks (CNNs) as its basic building block, while the generator upsamples a latent tensor into an image through transposed convolution (using ConvTranspose2d from PyTorch).
The discriminator's training relies on the binary cross-entropy loss function, which measures how well it differentiates between real and generated images by comparing each predicted probability against the actual label, which takes a value of either 0 or 1.
Both the generator and discriminator models were trained at the same time using the Adam optimiser, an optimisation algorithm that adapts the learning rate for each weight of the neural network as it updates the weights to reduce the loss.
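The simultaneous training described above can be sketched as a single training step. This is a minimal self-contained illustration, not our exact code: `netG` and `netD` here are tiny stand-ins for the real generator and discriminator, and the learning rate and betas are the values commonly used for DCGANs, not necessarily ours:

```python
import torch
import torch.nn as nn

# Stand-in networks so the sketch runs on its own; a real setup would use
# the full DCGAN generator and discriminator.
netG = nn.Sequential(nn.ConvTranspose2d(100, 3, 64))  # (b,100,1,1) -> (b,3,64,64)
netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())

criterion = nn.BCELoss()
# Adam adapts a per-parameter learning rate while updating the weights.
optG = torch.optim.Adam(netG.parameters(), lr=2e-4, betas=(0.5, 0.999))
optD = torch.optim.Adam(netD.parameters(), lr=2e-4, betas=(0.5, 0.999))

def train_step(real_images):
    b = real_images.size(0)
    noise = torch.randn(b, 100, 1, 1)
    fake_images = netG(noise)

    # Discriminator step: real images should score 1, fakes should score 0.
    optD.zero_grad()
    loss_real = criterion(netD(real_images).view(-1), torch.ones(b))
    loss_fake = criterion(netD(fake_images.detach()).view(-1), torch.zeros(b))
    (loss_real + loss_fake).backward()
    optD.step()

    # Generator step: it wants its fakes to be scored as real (label 1).
    optG.zero_grad()
    loss_g = criterion(netD(fake_images).view(-1), torch.ones(b))
    loss_g.backward()
    optG.step()
    return (loss_real + loss_fake).item(), loss_g.item()
```

The `detach()` in the discriminator step is what keeps the two updates separate: the discriminator's loss does not backpropagate into the generator, even though both networks are updated within the same step.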
As images are generated, the model learns from its previously created images, and the quality of subsequent images improves.
Conclusion
We appreciate all the guidance and support from the entire MDN team. A huge thanks to Will Maclean, the Deep Learning Lead, for overseeing the entire project throughout the semester, and an honourable mention to Kamron for assisting, researching and supporting the team during the initial stages of the project.
REFERENCES
Elgammal, A., Liu, B., et al. (2017). CAN: Creative Adversarial Networks, Generating “Art” by Learning About Styles and Deviating from Style Norms. https://arxiv.org/pdf/1706.07068.pdf
Hui, J. (2016). GAN — DCGAN (Deep convolutional generative adversarial networks). Retrieved from Medium: https://jonathan-hui.medium.com/gan-dcgan-deep-convolutional-generative-adversarial-networks-df855c438f
Inkawhich, N. (2022). DCGAN Tutorial. Retrieved from PyTorch: https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html
Wang, X., Yu, K., et al. (2018). ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. https://openaccess.thecvf.com/content_ECCVW_2018/papers/11133/Wang_ESRGAN_Enhanced_Super-Resolution_Generative_Adversarial_Networks_ECCVW_2018_paper.pdf