PyTorch Implementation of CIFAR-10 Image Classification Pipeline Using VGG Like Network

We present our solution to the well-known machine learning problem of image classification on the CIFAR-10 dataset, which contains 60000 labeled images. The aim is to learn to assign one of ten categories to each of these 32x32 pixel images.

Dataset

The CIFAR-10 dataset, as provided, consists of 5 batches of training images, 50000 images in total, and one batch of 10000 test images.

The test batch contains exactly 1000 randomly selected images from each class. The training batches contain images in random order, so some training batches may have more images from one class than another. Together, the training batches contain exactly 5000 images from each class.

For training and validation we use only the 50000 images originally meant for training. Stratified K-Folds cross-validation is used to split the data so that the percentage of samples for each class is preserved in every fold. Several other reported implementations use the data as given and validate directly on the provided 10000-sample test set. Instead, we reserve the 10000-sample test set for evaluating the trained model.
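The effect of a stratified split can be sketched with the standard library alone. This is an illustration only, not the pipeline's code: the number of folds, the seed, and the toy label list are assumptions, and in practice a library routine such as scikit-learn's `StratifiedKFold` would typically do this job.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds, seed=0):
    """Assign sample indices to folds so that class proportions are preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of one class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Toy stand-in for the training labels: 10 classes, 50 samples each (assumed sizes).
labels = [cls for cls in range(10) for _ in range(50)]
folds = stratified_folds(labels, n_folds=5)

# Use fold 0 for validation and the remaining folds for training.
val_idx = folds[0]
train_idx = [i for fold in folds[1:] for i in fold]
val_class_counts = Counter(labels[i] for i in val_idx)
```

With balanced classes, every fold receives the same number of samples from each class, so the validation fold mirrors the class distribution of the full training set.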

Model

We have made a PyTorch implementation of Sergey Zagoruyko's VGG-like network with batch normalization and dropout for the task. The printed model is shown below.

DataParallel(
  (module): VGGBNDrop(
    (features): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace)
      (3): Dropout(p=0.3)
      (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): ReLU(inplace)
      (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
      (8): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (9): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (10): ReLU(inplace)
      (11): Dropout(p=0.4)
      (12): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (13): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (14): ReLU(inplace)
      (15): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
      (16): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (17): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (18): ReLU(inplace)
      (19): Dropout(p=0.4)
      (20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (22): ReLU(inplace)
      (23): Dropout(p=0.4)
      (24): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (25): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (26): ReLU(inplace)
      (27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
      (28): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (29): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (30): ReLU(inplace)
      (31): Dropout(p=0.4)
      (32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (33): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (34): ReLU(inplace)
      (35): Dropout(p=0.4)
      (36): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (37): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (38): ReLU(inplace)
      (39): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
      (40): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (42): ReLU(inplace)
      (43): Dropout(p=0.4)
      (44): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (45): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (46): ReLU(inplace)
      (47): Dropout(p=0.4)
      (48): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (49): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (50): ReLU(inplace)
      (51): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
    )
    (classifier): Sequential(
      (0): Dropout(p=0.5)
      (1): Linear(in_features=512, out_features=512, bias=True)
      (2): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): ReLU(inplace)
      (4): Dropout(p=0.5)
      (5): Linear(in_features=512, out_features=10, bias=True)
    )
  )
)
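As a sanity check on the printout above, the spatial size of a 32x32 input can be traced through the five convolution blocks to see why the classifier starts with `in_features=512`. The sketch below uses the standard output-size formulas for convolution and pooling; the helper names are hypothetical, and the pooling formula omits the extra boundary rule PyTorch applies with `ceil_mode=True`, which does not matter for the even sizes that occur here.

```python
import math


def conv_out(size, kernel=3, stride=1, padding=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2, ceil_mode=True):
    """Output size of max pooling along one spatial dimension (no padding)."""
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((size - kernel) / stride) + 1


# (number of conv layers, output channels) for each block, read from the printout.
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size = 32
trace = []
for n_convs, channels in blocks:
    for _ in range(n_convs):
        size = conv_out(size)  # 3x3 convs with padding 1 keep the size
    size = pool_out(size)      # each 2x2 pooling halves the size
    trace.append((channels, size))

# Flattened feature count entering the classifier: channels * height * width.
flattened_features = trace[-1][0] * trace[-1][1] ** 2
```

The spatial size halves in every block (32, 16, 8, 4, 2, 1), so the final feature map is 512 channels of size 1x1, which flattens to 512 values.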

Data Augmentations

In this implementation the augmentations are limited to horizontal flips and random crops. Images are padded to 34x34 using reflective padding and then cropped back to 32x32: random crops are used during training and center crops during validation. The augmentations are implemented with solt.
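The pad, crop and flip steps can be illustrated on a single-channel image stored as a list of rows. This is a plain-Python sketch and does not use solt's API; the helper names, the flip probability of 0.5, and the choice of reflection that does not repeat the edge pixel are assumptions.

```python
import random


def reflect_pad(image, pad=1):
    """Pad a 2D list-of-rows image by mirroring pixels, not repeating the edge."""
    def pad_row(row):
        left = row[1:pad + 1][::-1]
        right = row[-pad - 1:-1][::-1]
        return left + row + right

    rows = [pad_row(row) for row in image]
    top = rows[1:pad + 1][::-1]
    bottom = rows[-pad - 1:-1][::-1]
    return top + rows + bottom


def crop(image, top, left, size):
    """Cut a size x size window whose upper-left corner is at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, rng, train=True, pad=1):
    size = len(image)
    padded = reflect_pad(image, pad)
    if train:
        # Random crop position, then a horizontal flip with probability 0.5.
        top = rng.randint(0, 2 * pad)
        left = rng.randint(0, 2 * pad)
        out = crop(padded, top, left, size)
        if rng.random() < 0.5:
            out = [row[::-1] for row in out]
        return out
    # Validation: deterministic center crop, which recovers the original image.
    return crop(padded, pad, pad, size)


# Toy 32x32 image whose pixel values encode their own position.
image = [[r * 32 + c for c in range(32)] for r in range(32)]
rng = random.Random(0)
padded = reflect_pad(image)
train_sample = augment(image, rng, train=True)
val_sample = augment(image, rng, train=False)
```

Because the center crop of the padded image is the original image, validation samples pass through unchanged, while training samples are shifted by at most one pixel in each direction and flipped at random.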

In their experiments, Sergey Zagoruyko and Nikos Komodakis seem to have used whitened data. Here we use the original, unwhitened data.

Sergey Zagoruyko proposed using the YUV color space. We have run our experiments without the RGB to YUV conversion.

The data is normalized in the usual way, using the mean and standard deviation calculated across the 50000 training images. Among other benefits, this can speed up training.
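A minimal sketch of this normalization, assuming per-channel statistics and pixel values scaled to [0, 1] (the source states neither explicitly). The data layout and function names are hypothetical and chosen only to keep the example self-contained.

```python
import math


def channel_stats(images):
    """Per-channel (mean, std) over all pixels of all images.

    images: list of images, each a list of (R, G, B) tuples with values in [0, 1].
    """
    stats = []
    for channel in range(3):
        values = [pixel[channel] for image in images for pixel in image]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        stats.append((mean, math.sqrt(variance)))
    return stats


def normalize(image, stats):
    """Subtract the channel mean and divide by the channel std for every pixel."""
    return [
        tuple((value - mean) / std for value, (mean, std) in zip(pixel, stats))
        for pixel in image
    ]


# Toy dataset of two 2-pixel images; not real CIFAR-10 data.
images = [
    [(0.0, 0.2, 0.4), (1.0, 0.6, 0.8)],
    [(0.0, 0.2, 0.4), (1.0, 0.6, 0.8)],
]
stats = channel_stats(images)
normalized = [normalize(image, stats) for image in images]
```

The statistics are computed once on the training images and then reused unchanged for validation and test images, so that all splits are transformed identically.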

Setting up the data for training

From PyCharm Terminal

$ python build_dataset.py --dataset CIFAR10

Training

From PyCharm Terminal

$ python run_training.py --dataset_name CIFAR10 --num_classes 10 --experiment vggbndrop --bs 128 --optimizer sgd --lr 0.1 --lr_drop "[160, 260]" --n_epochs 300 --wd 5e-4 --learning_rate_decay 0.2 --n_threads 12 --color_space rgb --set_nesterov True
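The learning rate flags in the command above suggest a step schedule. The sketch below assumes that `--learning_rate_decay 0.2` multiplies the rate by 0.2 at each epoch listed in `--lr_drop`, and that epochs are counted from zero; the training script may apply the drops differently. The function name is hypothetical.

```python
def learning_rate(epoch, base_lr=0.1, lr_drop=(160, 260), decay=0.2):
    """Step schedule: multiply the rate by `decay` once per milestone already reached."""
    n_drops = sum(epoch >= milestone for milestone in lr_drop)
    return base_lr * decay ** n_drops


# Learning rate just before and just after each assumed drop.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 159, 160, 259, 260, 299)}
```

Under these assumptions the rate stays at 0.1 until epoch 160, drops to about 0.02 until epoch 260, and finishes the 300 epochs at about 0.004.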

Results for CIFAR-10

Here we provide the results for the VGGBNDrop model proposed by Sergey Zagoruyko, trained with SGD as the optimizer.

Training and validation

As can be seen from the curves representing loss over time, the model starts to overfit around epoch 164.

The confusion matrices below, taken at several points along the validation accuracy curve, show how the learning progresses.

Epoch 40:

Epoch 80:

Epoch 120:

Epoch 160:
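For reference, a confusion matrix of the kind listed above can be computed from true and predicted labels as in the following sketch. The labels are a hypothetical 3-class toy example, not results from these experiments, and the function names are illustrative.

```python
def confusion_matrix(targets, predictions, num_classes):
    """matrix[t][p] counts samples of true class t that were predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for target, prediction in zip(targets, predictions):
        matrix[target][prediction] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for a 3-class toy problem.
targets = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(targets, predictions, num_classes=3)
```

As training progresses, mass moves from the off-diagonal cells to the diagonal, which is what a sequence of matrices taken at increasing epochs makes visible.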

Evaluation

Evaluation has been run using the model with the lowest validation loss (see session for details).
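Selecting the snapshot by validation loss amounts to taking the minimum over the per-epoch losses. The values below are hypothetical and the function name is illustrative; how the pipeline stores its snapshots and losses is not shown here.

```python
def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss; ties go to the earliest epoch."""
    epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return epoch, val_losses[epoch]


# Hypothetical validation losses, one per epoch; not results from the experiments.
val_losses = [1.92, 1.10, 0.74, 0.61, 0.58, 0.63, 0.70]
epoch, loss = best_epoch(val_losses)
```

Picking the snapshot this way avoids evaluating a model from the later, overfitted part of training.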

Acknowledgements

Aleksei Tiulpin is acknowledged for kindly providing access to his pipeline scripts and giving his permission to reproduce and modify his pipeline for this task.

Research Unit of Medical Imaging, Physics and Technology is acknowledged for making it possible to run the experiments.

Authors

Antti Isosalo, University of Oulu, 2018-

References

Model Architecture

Zagoruyko, Sergey, and Nikos Komodakis. "Wide Residual Networks." Proceedings of the British Machine Vision Conference (BMVC), 2016.
Zagoruyko, Sergey. "92.45% on CIFAR-10." 2015.

Data Augmentation

Tiulpin, Aleksei. "Streaming Over Lightweight Data Transformations." Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Finland, 2018.

Dataset

Krizhevsky, Alex, and Geoffrey Hinton. "Learning multiple layers of features from tiny images." Vol. 1. No. 4. Technical Report, University of Toronto, 2009.
Benenson, Rodrigo. "Are we there yet." 2016.
Recht, Benjamin, Roelofs, Rebecca, Schmidt, Ludwig, and Shankar, Vaishaal. "Do CIFAR-10 Classifiers Generalize to CIFAR-10?." arXiv preprint arXiv:1806.00451, 2018.
