
DCGAN (Deep Convolutional Generative Adversarial Network) custom architecture builder and image synthesizer: specify the architecture of the generator and discriminator, visualize the models, train the GAN, synthesize images, and losslessly visualize the synthetic imagery for analysis.


DCGAN Custom Architecture Builder and Synthetic Image Generator

This project is a DCGAN (Deep Convolutional Generative Adversarial Network) custom architecture builder and image synthesizer. It allows the user to specify the architecture of the generator and discriminator, visualize the models, train the GAN, and synthesize images. This enables dynamic experimentation with the generator and discriminator architectures, which, as detailed in the literature [1] and [2], strongly affect the performance of the GAN.

The user interface is built in Python using Tkinter, the models are built with TensorFlow and Keras, and the architecture diagrams are rendered with visualkeras and the TensorFlow Keras plotting utilities.

ui screenshot

Lossless Visualization of the Imagery Data

With the motivation of visualizing imagery data as Parallel Coordinates plots, and taking inspiration from the snake view, we visualize images as their pixel-frequency data matrices in multi-row Parallel Coordinates, forming a tableau view.

Multi-Row Parallel Coordinates Grayscale Image Frequency Tableau Plot

We load images (in formats such as 'png', 'jpg', 'jpeg', or 'bmp') into a data matrix D of per-pixel grayscale frequencies, then min-max normalize D to yield D'. We visualize D' with a snake-view-inspired, imagery-domain-specific layout: since image pixels have a natural order of precedence, we mimic that order with our parallel axes. The plot has as many rows of parallel axes as D' has rows and as many axes per row as D' has columns, where each column is a parallel axis and each row is a Parallel Coordinates plot, all sharing the same minimum and maximum axis values, together forming a Parallel Coordinates Image Frequency Tableau Plot.
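The D → D' normalization step can be sketched as follows; the 4×4 matrix D here is a hypothetical stand-in for a real image's pixel matrix, and each row of D' would become one Parallel Coordinates plot in the tableau, with each column as one parallel axis:

```python
import numpy as np

def min_max_normalize(D):
    """Min-max normalize a pixel matrix D into D' on [0, 1]."""
    d_min, d_max = D.min(), D.max()
    if d_max == d_min:
        # A constant image carries no range information; map it to zeros.
        return np.zeros_like(D, dtype=float)
    return (D - d_min) / (d_max - d_min)

# Hypothetical 4x4 grayscale image standing in for a loaded file.
D = np.array([[0, 64, 128, 255],
              [32, 96, 160, 224],
              [16, 48, 80, 112],
              [8, 24, 40, 56]])
D_prime = min_max_normalize(D)
```

Because every row of D' is normalized against the same global minimum and maximum, all rows of the tableau share the same axis scale, as described above.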

Lossless visualization of the MNIST train data subsets. This copy of the plots loses some fidelity because the images were shrunk to be viewed simultaneously; all original plots are in the lossless_visualization_train_subsets folder.
mnist visualization

Project Setup

Currently, the project is a single Python file, and the dependencies are:

pip install numpy matplotlib tensorflow keras visualkeras pillow pydot

Project Execution

python main.py

Ground Truth MNIST Sevens

We will be using the MNIST dataset of handwritten digits, commonly used for training optical character recognition models. We specifically use the sevens for initial experimentation.

Ten of the 4,401 MNIST seven training images are shown below.
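Selecting the sevens subset can be sketched as below; the random arrays stand in for the real `mnist.load_data()` output so the snippet is self-contained, and the rescaling to [-1, 1] assumes a tanh-output generator:

```python
import numpy as np

# Stand-in for (x_train, y_train), _ = mnist.load_data(); a tiny
# fabricated labeled set keeps the filtering step self-contained.
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
y_train = rng.integers(0, 10, size=100)

# Keep only the sevens and rescale pixels from [0, 255] to [-1, 1].
sevens = x_train[y_train == 7]
sevens = sevens.astype("float32") / 127.5 - 1.0
```

With the real loader, the same boolean-mask filtering extracts the digit-seven subset used for training.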

mnist seven 0 mnist seven 1 mnist seven 2 mnist seven 3 mnist seven 4 mnist seven 5 mnist seven 6 mnist seven 7 mnist seven 8 mnist seven 9

First Experiment

We trained the custom DCGAN model on the MNIST sevens train data for 10 epochs, with final loss values at batch 1300 of d_loss=1.2257 and g_loss=0.9160, and generated 5 images.

Generated Image 1 Generated Image 2 Generated Image 3 Generated Image 4 Generated Image 5

Second Experiment

We trained the custom DCGAN model on the MNIST sevens train data for 30 epochs, with final loss values at batch 4100 of d_loss=0.9101 and g_loss=1.2164, and generated 6 images.

This is the first experiment with custom architecture parameters.

Experiment architecture parameters:
Training data: MNIST train set digit sevens.
Epochs: 30
Latent Dim: 100
Generator: 1024,4,1; 512,5,2; 256,5,2; 128,5,2; 3,5,2
Discriminator: 64,4,2; 128,4,2; 256,4,2; 512,4,2
Resultant loss values: batch=4100, d_loss=0.9101, g_loss=1.2164
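The semicolon-separated triplets above read as filters, kernel size, and stride per layer. A minimal parser for that format (a hypothetical helper, not necessarily the project's own) could look like:

```python
def parse_layers(spec):
    """Parse '1024,4,1; 512,5,2; ...' into (filters, kernel, stride) tuples."""
    return [tuple(int(v) for v in layer.split(",")) for layer in spec.split(";")]

# The generator spec from the second experiment.
gen_spec = "1024,4,1; 512,5,2; 256,5,2; 128,5,2; 3,5,2"
layers = parse_layers(gen_spec)
```

Each tuple would then configure one Conv2DTranspose (generator) or Conv2D (discriminator) layer when the model is assembled.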

Generated Image 0 Generated Image 1 Generated Image 2 Generated Image 3 Generated Image 4 Generated Image 5

Architecture Block Diagrams for the Second Experiment

Second Experiment Generator Block Diagram

Generator Architecture

Second Experiment Discriminator Block Diagram

Discriminator Architecture

Third Experiment

We trained the custom DCGAN model on the MNIST sevens train data for 10 epochs, with final loss values at batch 1300 of d_loss=1.3237 and g_loss=0.7212, and generated 10 images.

Using the following architecture parameters:
Generator: 1024,4,1; 512,4,2; 256,4,2; 128,4,2; 3,4,2
Discriminator: 64,4,2; 128,4,2; 256,4,2; 512,4,2

This is the first experiment with dynamic starting spatial dimensions, allowing this architecture to be used with any image size.
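One way to derive the starting spatial size dynamically is to walk the target image size backwards through the layer strides. This is a sketch of the idea, assuming the output size divides evenly by the stride product (28×28 MNIST images would need resizing or padding to, e.g., 32×32 for the stride pattern below):

```python
def initial_spatial_size(image_size, strides):
    """Work the target image size backwards through each layer's stride."""
    size = image_size
    for s in reversed(strides):
        assert size % s == 0, "image size must divide evenly by the strides"
        size //= s
    return size

# Strides from the generator spec above: one stride-1 head, four stride-2 layers.
start = initial_spatial_size(32, [1, 2, 2, 2, 2])
```

The generator's dense layer would then project the latent vector to a `start × start × filters` tensor before upsampling.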

Generated Image 0 Generated Image 1 Generated Image 2 Generated Image 3 Generated Image 4 Generated Image 5 Generated Image 6 Generated Image 7 Generated Image 8 Generated Image 9

Architecture Block Diagrams for the Third Experiment

Third Experiment Generator Block Diagram

Generator Architecture

Third Experiment Discriminator Block Diagram

Discriminator Architecture

Fourth Experiment

This is the first experiment with grayscale image input; previously, the images were processed as RGB.

Two epochs of training yielded the following loss values at batch 200:
RGB: d_loss=1.3812, g_loss=0.7681
Grayscale: d_loss=1.3975, g_loss=0.7304
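The grayscale path can be sketched with a standard luminance conversion (ITU-R BT.601 weights); this is an illustrative helper, not necessarily the project's exact preprocessing:

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an HxWx3 RGB array to an HxW grayscale array
    using ITU-R BT.601 luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

# Hypothetical pure-red 28x28 image to exercise the conversion.
rgb = np.zeros((28, 28, 3))
rgb[..., 0] = 255
gray = to_grayscale(rgb)
```

Processing as grayscale collapses the channel dimension from 3 to 1, shrinking both models' input and output layers accordingly.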

Architecture parameters: Generator: 1024,4,1; 512,4,2; 256,4,2; 128,4,2; 3,4,2 Discriminator: 64,4,2; 128,4,2; 256,4,2; 512,4,2

Latent dimension: 100
Generated images: 12

Processed as RGB:
Generated Image 0 Generated Image 1 Generated Image 2 Generated Image 3 Generated Image 4 Generated Image 5 Generated Image 6 Generated Image 7 Generated Image 8 Generated Image 9 Generated Image 10 Generated Image 11

Processed as grayscale:
Generated Image 0 Generated Image 1 Generated Image 2 Generated Image 3 Generated Image 4 Generated Image 5 Generated Image 6 Generated Image 7 Generated Image 8 Generated Image 9 Generated Image 10 Generated Image 11

Todo

  • Add graph visualization of the generator and discriminator loss values over training epochs.
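One possible shape for that planned feature, assuming matplotlib and hypothetical per-epoch loss lists recorded during training:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt

def plot_losses(d_losses, g_losses, path="loss_curves.png"):
    """Plot discriminator and generator loss per epoch and save to disk."""
    epochs = range(1, len(d_losses) + 1)
    plt.figure()
    plt.plot(epochs, d_losses, label="d_loss")
    plt.plot(epochs, g_losses, label="g_loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.savefig(path)
    plt.close()
    return path

# Hypothetical per-epoch losses; real values would be recorded in the train loop.
out = plot_losses([1.38, 1.30, 1.22], [0.77, 0.85, 0.92])
```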

Referenced Citations

[1] Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.

[2] S. Vijaya Lakshmi and Vallik Sai Ganesh Raju Ganaraju, "Deep Convolutional Generative Adversarial Network on MNIST Dataset", Journal of Science and Technology, Vol. 06, Issue 03, May-June 2021, pp. 169-177.

License

This project is licensed under the MIT License - see the LICENSE file for details.
