A PyTorch implementation of UNetV3Plus.
UNetV3Plus was originally designed for medical image segmentation. I modified it to use custom encoders such as ResNet and to support multi-label segmentation. Here is the result on Pascal VOC2012:
Training: 512x512 random crop
Validation: 512x512 center crop
Model | Batch Size | mIoU |
---|---|---|
UNetV3Plus-ResNet34 | 16*4 | 0.739 |
Download from Google Drive
16*4 means batch size 16 and 4 gradient accumulation steps.
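
To illustrate why 16*4 behaves like a batch of 64, here is a minimal, framework-free sketch. It is not taken from train.py: the toy model, the data, and the function names are all hypothetical, and it assumes each micro-batch loss is a mean that gets scaled by 1/4 before accumulation. Whether train.py scales the loss this way is an assumption.

```python
# Toy model: y_hat = w * x with a mean squared error loss.
# All names here are illustrative and do not come from this repository.

def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size, accum_steps):
    """Accumulate per-micro-batch gradients, each scaled by 1 / accum_steps."""
    grad = 0.0
    for step in range(accum_steps):
        start = step * micro_batch_size
        end = start + micro_batch_size
        # Scaling by 1 / accum_steps mirrors dividing the loss before backward().
        grad += mse_grad(w, xs[start:end], ys[start:end]) / accum_steps
    return grad


xs = [0.1 * i for i in range(64)]   # 64 samples = 16 * 4
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)          # one batch of 64
accum = accumulated_grad(w, xs, ys, micro_batch_size=16, accum_steps=4)
print("gradients match:", abs(full - accum) < 1e-9)
```

The equivalence holds for the gradient of a mean-reduced loss. Layers whose output depends on batch statistics, such as batch normalization, still see only 16 samples per forward pass.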
Run `python train.py --cfg config/resnet34_voc.yaml` to reproduce the result.
A multi-label MS-SSIM loss is implemented but was not used in the training. To use it, set 'loss_type' to 'u3p' in the config file.
The config file of the original model in the official paper is config/original_voc.yaml.
Please refer to config/config.py for more information about the model architecture and training settings. Custom datasets are not supported yet.
Use TensorBoard or wandb to log training metrics.
Download VOC2012 and trainaug, then extract the trainaug labels (SegmentationClassAug) into the VOC2012 directory.
More info about trainaug can be found in DeepLabV3Plus.

    /data
        /VOCdevkit
            /VOC2012
                /SegmentationClass
                /SegmentationClassAug  # <= the trainaug labels
                    2007_000032.png
                    ...
                /JPEGImages
                ...
            ...
        /VOCtrainval_11-May-2012.tar
        ...
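
Before training, it can help to confirm that the directories listed above are in place. The following is a small sketch, not part of this repository: the helper name `missing_voc_dirs` is hypothetical, and the relative root `data` is an assumption about where the dataset was extracted.

```python
from pathlib import Path

# Subdirectories taken from the layout above; the root location is an assumption.
REQUIRED_DIRS = ("JPEGImages", "SegmentationClass", "SegmentationClassAug")


def missing_voc_dirs(data_root):
    """Return the required VOC2012 subdirectories that are absent under data_root."""
    voc_dir = Path(data_root) / "VOCdevkit" / "VOC2012"
    return [name for name in REQUIRED_DIRS if not (voc_dir / name).is_dir()]


if __name__ == "__main__":
    missing = missing_voc_dirs("data")  # hypothetical root; adjust to your setup
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("VOC2012 layout looks complete.")
```

An empty result means all three directories exist. The check does not verify that the label files inside SegmentationClassAug are complete.
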
- UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation
- VOC2012 data-pipeline and eval-metrics are modified from DeepLabV3Plus