This repository performs lung segmentation on chest X-ray images using image preprocessing and deep learning approaches such as U-Net. The segmented lung regions can later be used for abnormality detection on chest X-rays.
This repository makes use of the preprocessing pipeline suggested in the paper "Lung boundary detection for chest X-ray images classification based on GLCM and probabilistic neural networks" (Link).
- The testing script can be run via the test.py file:
python3 test.py --input <path_to_chest_xray_img_file>
The dataset used to train U-Net in this repository is taken from this GitHub repository.
To start training U-Net for segmentation, go into the unet directory and run train.py as follows:
python3 train.py --data <data_dir> \
    --epochs <number_training_iter> \
    --batch_size <batch_size> \
    --val-ratio <ratio_of_validation_set> \
    --lr <learning_rate> \
    --save-path <checkpoint_dir> \
    --log-dir <log_dir>
Where:
- "data_dir" is the data directory containing the two sub-directories "images" and "masks".
- "checkpoint_dir" is the checkpoint directory in which two files will be produced:
  - model.weights.hdf5: the model weights file.
  - model.h5: the model structure.
- "log_dir" is the directory in which training info is stored.
After a successful run of the training script, the terminal should look as follows:
To view training info (losses, accuracies) in real time, run:
python3 dashboard --log-dir <log_dir>/lung-segmentation
The dashboard will be updated every epoch:
To test the trained model, import the UnetSegmenter class from the unet module. The constructor takes two parameters: the model file and the weights file of the U-Net model:
import cv2
from unet import UnetSegmenter
segmenter = UnetSegmenter('checkpoints/model.h5', 'checkpoints/model.weights.hdf5')
img = cv2.imread('images/test1.png')
img = cv2.resize(img, (256, 256))
segmenter.visualize_prediction(img)
The semi-supervised training method for U-Net is inspired by the Pi model described in the paper "Temporal Ensembling for Semi-Supervised Learning" (Link). For both labelled and unlabelled data batches, each image is both weakly and strongly augmented. The weakly and strongly augmented versions are fed into the network for segmentation mask prediction, and the predictions of the two streams are then used to compute a consistency loss (mean squared error). The motivation is to make the network generalize and stay invariant to minor changes in the image data.
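The two-stream consistency idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's actual training code: the augmentation functions here are illustrative placeholders, and in the real pipeline the U-Net would map each augmented batch to a predicted mask.

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_augment(batch):
    # Illustrative weak augmentation: a small random brightness shift.
    return np.clip(batch + rng.uniform(-0.05, 0.05), 0.0, 1.0)

def strong_augment(batch):
    # Illustrative strong augmentation: additive noise plus a larger shift.
    noisy = batch + rng.normal(0.0, 0.1, size=batch.shape)
    return np.clip(noisy + rng.uniform(-0.2, 0.2), 0.0, 1.0)

def consistency_loss(pred_weak, pred_strong):
    # Mean squared error between the two prediction streams.
    return float(np.mean((pred_weak - pred_strong) ** 2))

# Stand-in for a batch of unlabelled chest X-rays (values in [0, 1]).
batch = rng.uniform(0.0, 1.0, size=(2, 256, 256, 1))
# The augmented batches stand in for the network's mask predictions here.
loss = consistency_loss(weak_augment(batch), strong_augment(batch))
```

Because the loss only compares the network's own outputs, it needs no ground-truth masks, which is what lets the unlabelled images contribute to training.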
To start training U-Net for lung segmentation in a semi-supervised manner:
cd semisupervised_segmentation
python3 train.py --data <path_to_labelled_data_dir> --u-data <path_to_unlabelled_data_dir>
Where:
- "path_to_labelled_data_dir" is the data directory containing the two sub-directories "images" and "masks".
- "path_to_unlabelled_data_dir" is the directory containing lung X-ray images without labels (png, jpeg, ...).
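Putting the pieces together, the Pi-model objective combines a supervised segmentation loss on the labelled batch with a weighted consistency term on both batches. The following NumPy sketch is an assumption-laden illustration: the binary cross-entropy choice for the supervised term and the Gaussian ramp-up schedule come from the Temporal Ensembling paper, not from this repository's code.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    # Supervised term: pixel-wise binary cross-entropy against labelled masks.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))

def consistency(pred_weak, pred_strong):
    # Unsupervised term: MSE between the weak- and strong-stream predictions.
    return float(np.mean((pred_weak - pred_strong) ** 2))

def rampup_weight(epoch, ramp_epochs, w_max):
    # Gaussian ramp-up from the Temporal Ensembling paper:
    # w(t) = w_max * exp(-5 * (1 - t)^2), with t rising from 0 to 1.
    t = min(epoch / ramp_epochs, 1.0)
    return w_max * float(np.exp(-5.0 * (1.0 - t) ** 2))

def pi_model_loss(pred_labelled, masks, pred_weak, pred_strong, epoch,
                  ramp_epochs=80, w_max=1.0):
    # Total objective: supervised loss plus the ramped-up consistency loss.
    w = rampup_weight(epoch, ramp_epochs, w_max)
    return bce(pred_labelled, masks) + w * consistency(pred_weak, pred_strong)
```

Ramping the consistency weight up from near zero keeps the unsupervised term from dominating early training, while the supervised predictions are still unreliable.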