This is the official repository for MedSAM: Segment Anything in Medical Images.
- Create a virtual environment
conda create -n medsam python=3.10 -y
and activate it
conda activate medsam
- Install PyTorch 2.0
- Clone the repository
git clone https://github.com/bowang-lab/MedSAM
- Enter the MedSAM folder
cd MedSAM
and run
pip install -e .
We provide a step-by-step tutorial with a small dataset to help you quickly start the training process.
Download the SAM checkpoint and place it at work_dir/SAM/sam_vit_b_01ec64.pth.
Download the demo dataset and unzip it.
This dataset contains 50 abdominal CT scans, and each scan comes with an annotation mask covering 13 organs. The organ label names are available at MICCAI FLARE2022. In this tutorial, we will fine-tune SAM for gallbladder segmentation.
Run pre-processing
python pre_CT.py -i path_to_image_folder -gt path_to_gt_folder -o path_to_output
- split dataset: 80% for training and 20% for testing
- image normalization
- pre-compute image embedding
- save the normalized images, ground truth masks, and image embeddings as an npz file
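The split, normalization, and mask-selection steps above can be sketched as follows. This is a minimal illustration rather than the code in pre_CT.py: the window level and width, the organ label id, and all function names are assumptions chosen for the example, so check them against your own data and the FLARE2022 label list before reusing them.

```python
import random

import numpy as np

# Assumed values for illustration only; verify them for your dataset.
WINDOW_LEVEL = 40    # assumed soft-tissue CT window centre (HU)
WINDOW_WIDTH = 400   # assumed soft-tissue CT window width (HU)
ORGAN_LABEL = 9      # assumed id of the target organ in the multi-organ mask


def normalize_ct(volume_hu, level=WINDOW_LEVEL, width=WINDOW_WIDTH):
    """Clip a CT volume to an intensity window and rescale it to uint8 [0, 255]."""
    lower, upper = level - width / 2, level + width / 2
    clipped = np.clip(volume_hu.astype(np.float32), lower, upper)
    scaled = (clipped - lower) / (upper - lower) * 255.0
    return scaled.astype(np.uint8)


def binarize_mask(multi_organ_mask, label=ORGAN_LABEL):
    """Keep a single organ from a multi-label mask as a 0/1 mask."""
    return (multi_organ_mask == label).astype(np.uint8)


def split_cases(case_names, train_fraction=0.8, seed=0):
    """Shuffle case names reproducibly and split them into train and test lists."""
    names = sorted(case_names)
    random.Random(seed).shuffle(names)
    n_train = int(round(len(names) * train_fraction))
    return names[:n_train], names[n_train:]
```

With 50 case names and the default fraction, `split_cases` returns 40 training cases and 10 test cases. Fixing the seed keeps the split stable across runs.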
Note: Medical images come in many data formats, so no single script can handle all of them. Here, we provide two typical examples for CT and non-CT (e.g., various MR sequences, PET images) image preprocessing. You can adapt the preprocessing code to your own datasets.
Model Training (Video Tutorial)
Please check the step-by-step tutorial: finetune_and_inference_tutorial_3D_dataset.ipynb
We also provide a tutorial on a 2D dataset (png format): finetune_and_inference_tutorial_2D_dataset.ipynb
You can also train the model on the whole dataset.
- Download the training set (GoogleDrive)
Note: For the convenience of file sharing, we compress each image and mask pair in an npz file. The pre-computed image embeddings are too large to share (they require ~1 TB of space). You can generate them with the following command.
- Pre-compute the image embeddings and save the image embeddings and ground truth as .npy files.
python utils/precompute_img_embed.py -i path_to_train_folder -o ./data/Tr_npy
- Train the model
python train.py -i ./data/Tr_npy --task_name SAM-ViT-B --num_epochs 1000 --batch_size 8 --lr 1e-5
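The ~1 TB figure quoted for the pre-computed embeddings can be sanity-checked with a little arithmetic. The sketch below assumes each embedding is stored uncompressed as float32 with SAM's 256 x 64 x 64 embedding shape; the function names are hypothetical, and the storage format actually used by the precompute script may differ.

```python
# Shape of one SAM image embedding: 256 channels on a 64 x 64 grid.
EMBED_SHAPE = (256, 64, 64)
BYTES_PER_FLOAT32 = 4  # assumption: embeddings are stored as uncompressed float32


def embedding_bytes(shape=EMBED_SHAPE, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in bytes of one uncompressed embedding."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_value


def images_per_terabyte():
    """Return how many whole embeddings fit in 10**12 bytes."""
    return 10**12 // embedding_bytes()


print(embedding_bytes())      # bytes per embedding
print(images_per_terabyte())  # embeddings per terabyte
```

Under these assumptions each embedding takes 4,194,304 bytes (4 MiB), so one terabyte holds roughly 238,000 embeddings.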
If you find this dataset valuable in your research, kindly acknowledge and credit the original data sources: AMOS, BraTS2021, ACDC, M&Ms, PROMISE12, ABCs, AbdomenCT-1K, MSD, KiTS19, LiTS, COVID-19 CT-Seg, HECKTOR, DRIVE, Colon gland, polyp, instruments, Abdomen Ultrasound, Breast Ultrasound, JSRT.
- To train the model without pre-computed embeddings, run the following command:
python train_no_npz.py --csv <path-to-csv-file> --image_col <csv-image-column-name> --mask_col <csv-mask-column-name> --model_type vit_b --checkpoint ../SAM_weights/sam_vit_b_01ec64.pth [--image <image-file-dir-path>] [--mask <mask-file-dir-path>] --num_epochs 100 --batch_size 4 --lr 1e-4
The --image and --mask arguments can be used to specify the paths to the input and mask images, respectively. If these arguments are not specified, the paths to the images will be taken from the CSV file.
The --image_col and --mask_col arguments can be used to specify the names of the columns in the CSV file that contain the paths to the input and mask images.
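One plausible reading of how the CSV columns and the optional directory arguments interact is sketched below. This is not the code in train_no_npz.py: the function name `resolve_pairs` and the rule of joining a given directory with the file name from the CSV are assumptions made for illustration.

```python
import csv
import io
import os


def resolve_pairs(csv_file, image_col, mask_col, image_dir=None, mask_dir=None):
    """Return a list of (image_path, mask_path) tuples, one per CSV row.

    When a directory is given, the CSV entry is treated as a file name inside
    that directory (assumed behaviour); otherwise the entry is used as the path.
    """
    pairs = []
    for row in csv.DictReader(csv_file):
        image, mask = row[image_col], row[mask_col]
        if image_dir is not None:
            image = os.path.join(image_dir, os.path.basename(image))
        if mask_dir is not None:
            mask = os.path.join(mask_dir, os.path.basename(mask))
        pairs.append((image, mask))
    return pairs


# Hypothetical CSV content with two path columns.
sample = io.StringIO("img,seg\nscans/a.png,labels/a.png\n")
print(resolve_pairs(sample, "img", "seg"))
```

Without directory arguments the paths come straight from the CSV, which matches the fallback described above.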
The -k argument can be used to specify the number of folds for cross-validation. If this argument is not specified, the model will be trained on the entire dataset.
Note: This method is slower and requires more memory than training the model using pre-computed embeddings.
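For readers unfamiliar with the cross-validation that -k refers to, the following sketch shows how sample indices can be partitioned into k folds. It is a generic illustration with a hypothetical function name, not the splitting logic used by train_no_npz.py, which may shuffle or stratify differently.

```python
def k_fold_indices(num_samples, k):
    """Return k (train_indices, val_indices) pairs.

    Every sample index appears in exactly one validation fold.
    """
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and num_samples")
    indices = list(range(num_samples))
    # Interleaved assignment keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, val in enumerate(folds):
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, val))
    return splits


for train, val in k_fold_indices(5, 2):
    print(train, val)
```

Each fold serves once as the validation set while the remaining folds are used for training.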
Download the model checkpoint (GoogleDrive) and testing data (GoogleDrive) and put them in work_dir/MedSAM and data/Test, respectively.
Run
python MedSAM_Inference.py -i ./data/Test -o ./ -chk work_dir/MedSAM/medsam_20230423_vit_b_0.0.1.pth
The segmentation results are available here.
The implementation code of DSC and NSD can be obtained here.
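As a quick reference, the Dice similarity coefficient (DSC) between a predicted and a ground truth binary mask is 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal NumPy version with a hypothetical function name; it is not the linked implementation, and it does not cover NSD. Returning 1.0 when both masks are empty is a convention chosen here for illustration.

```python
import numpy as np


def dice_coefficient(pred, gt):
    """Compute the Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    denominator = pred.sum() + gt.sum()
    if denominator == 0:
        # Both masks are empty; treated as perfect agreement (assumed convention).
        return 1.0
    intersection = np.logical_and(pred, gt).sum()
    return float(2.0 * intersection / denominator)


pred = [[1, 1], [0, 0]]
gt = [[1, 0], [0, 0]]
print(dice_coefficient(pred, gt))  # 2 * 1 / (2 + 1)
```

Identical non-empty masks score 1.0 and disjoint masks score 0.0.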
- Train the ViT-H model
- Explore other fine-tuning methods, e.g., fine-tuning the image encoder as well, or LoRA fine-tuning
- Support scribble prompts
- Support IoU/DSC regression
- Enlarge the dataset
- 3D Slicer and napari support
We are excited about the potential of segmentation foundation models in the medical image domain. However, training such models requires extensive computing resources. Therefore, we have made all the pre-processed training images publicly available for research purposes. To prevent duplication of effort (e.g., conducting the same experiments), we encourage sharing of results and trained models on the discussion page. We look forward to working with the community to advance this exciting research area.
- We highly appreciate all the challenge organizers and dataset owners for providing the public dataset to the community.
- We thank Meta AI for making the source code of segment anything publicly available.
- We also thank Alexandre Bonnet for sharing this great blog.
@article{MedSAM,
title={Segment Anything in Medical Images},
author={Ma, Jun and Wang, Bo},
journal={arXiv preprint arXiv:2304.12306},
year={2023}
}