Mixed-precision training for ResNet-50 v1.5, modified from NVIDIA's DeepLearningExamples. In this example, we use ActNN by manually constructing the model with its memory-saving layers. Our training logs are available at Weights & Biases.
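ActNN's memory-saving layers are drop-in replacements for the standard PyTorch modules (e.g. `actnn.QConv2d` for `nn.Conv2d`). As a stdlib-only sketch of the manual substitution this example performs (the `convert` helper and the name-to-name mapping are illustrative, not part of the ActNN API):

```python
# Illustrative sketch: map each standard PyTorch layer name to the
# memory-saving ActNN counterpart used when building the model by hand.
# `convert` is a hypothetical helper, not an ActNN function.
LAYER_MAP = {
    "Conv2d": "QConv2d",
    "BatchNorm2d": "QBatchNorm2d",
    "ReLU": "QReLU",
    "MaxPool2d": "QMaxPool2d",
    "Linear": "QLinear",
}

def convert(layer_names):
    """Return the ActNN layer name for each standard layer,
    leaving unrecognized layers unchanged."""
    return [LAYER_MAP.get(name, name) for name in layer_names]

# A ResNet stem, converted:
print(convert(["Conv2d", "BatchNorm2d", "ReLU", "MaxPool2d"]))
```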
- Put the ImageNet dataset at `~/imagenet`.
- Install the required packages:

```bash
pip install matplotlib tqdm
```
- Install apex:

```bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
Train PreAct-ResNet-56 on CIFAR-10 with ActNN at optimization level L3:

```bash
mkdir -p results/tmp
python3 main.py --dataset cifar10 --arch preact_resnet56 --epochs 200 --num-classes 10 \
    -j 0 --weight-decay 1e-4 --batch-size 128 --label-smoothing 0 \
    --lr 0.1 --momentum 0.9 --warmup 4 \
    -c quantize --ca=True --actnn-level L3 \
    --workspace results/tmp --gather-checkpoints ~/data/cifar10
```
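The `--actnn-level L3` flag selects ActNN's aggressive setting, which compresses the activations saved for the backward pass with low-bit per-group quantization. A stdlib-only toy of a min-max quantize/dequantize round-trip (the function names and the 2-bit choice here are illustrative assumptions, not the ActNN implementation):

```python
# Toy sketch (assumption): low-bit min-max quantization of one group
# of activations, conceptually what a compressed layer stores.
def quantize(xs, bits=2):
    lo, hi = min(xs), max(xs)
    # Map [lo, hi] onto the integer range [0, 2^bits - 1].
    scale = (2 ** bits - 1) / (hi - lo) if hi > lo else 1.0
    q = [round((x - lo) * scale) for x in xs]
    return q, lo, scale

def dequantize(q, lo, scale):
    # Invert the affine map; error is at most half a quantization step.
    return [v / scale + lo for v in q]

q, lo, scale = quantize([0.0, 1.0, 2.0, 3.0], bits=2)
print(q, dequantize(q, lo, scale))
```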
Train ResNet-50 on ImageNet with ActNN:

```bash
./dist-train 1 0 127.0.0.1 1 resnet50 \
    "-c quantize --ca=True --actnn-level L3" \
    tmp ~/imagenet 256
```
Train with mixed precision (AMP) and dynamic loss scaling:

```bash
./dist-train 1 0 127.0.0.1 1 resnet50 \
    "--amp --dynamic-loss-scale -c quantize --ca=True --actnn-level L3" \
    tmp ~/imagenet 256
```
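`--dynamic-loss-scale` asks apex's AMP to adjust the loss scale at run time instead of using a fixed value. A minimal stdlib sketch of the usual policy (the class and its parameters are illustrative, not apex's API): halve the scale whenever a gradient overflow is detected, and double it again after a window of overflow-free steps.

```python
class LossScaler:
    """Toy dynamic loss scaler: halve on overflow, double after a
    window of clean steps (illustrative, not the apex implementation)."""

    def __init__(self, init_scale=2 ** 15, growth_interval=2000):
        self.scale = float(init_scale)
        self.growth_interval = growth_interval
        self.good_steps = 0

    def update(self, overflow):
        if overflow:
            self.scale /= 2       # back off after inf/nan gradients
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps == self.growth_interval:
                self.scale *= 2   # try a larger scale again
                self.good_steps = 0
```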
Download the model checkpoints:

```bash
wget https://ml.cs.tsinghua.edu.cn/~jianfei/static/results.tar.gz
tar xzvf results.tar.gz
```
Evaluate the gradient bias and variance on CIFAR-100 from a saved checkpoint (the original command passed `--dataset cifar10`, which is inconsistent with `--num-classes 100` and the CIFAR-100 paths, so it is corrected here):

```bash
mkdir -p results/tmp
python3 main.py --dataset cifar100 --arch preact_resnet56 --epochs 200 --num-classes 100 \
    -j 0 --weight-decay 1e-4 --batch-size 128 --label-smoothing 0 \
    -c quantize --ca=True --actnn-level L3 \
    --workspace results/tmp --evaluate --training-only \
    --resume results/cifar100/checkpoint-10.pth.tar --resume2 results/cifar100/checkpoint-10.pth.tar ~/data/cifar100
```
| quantize config | Overall Bias | Overall Var |
|---|---|---|
| `-c quantize --ca=True --actnn-level L3` | 0.03929 | 0.07694 |
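One plausible reading of the table's metrics (an assumption, not necessarily the exact definition used by the evaluation script): draw several stochastic (quantized) gradients at the same checkpoint, then report the squared distance between their mean and an exact reference gradient as the bias, and their average squared deviation from the mean as the variance. A stdlib sketch:

```python
from statistics import mean

def bias_and_var(samples, exact):
    """Hypothetical bias/variance metric over repeated stochastic
    gradient samples (lists of equal length) vs. an exact gradient."""
    # Per-coordinate mean over the stochastic samples.
    m = [mean(col) for col in zip(*samples)]
    # Squared bias of the mean w.r.t. the exact gradient.
    bias = sum((mi - ei) ** 2 for mi, ei in zip(m, exact))
    # Average squared deviation of each sample from the mean.
    var = mean(sum((si - mi) ** 2 for si, mi in zip(s, m)) for s in samples)
    return bias, var
```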