cd checkpoints && \
./download_ckpts.sh && \
cd ..
onnx
torch 2.2.1
onnx 1.16.2
tflite
torch 2.4.0
ai-edge-torch 0.2.0
tf-nightly 2.18.0.dev20240811 for image mode
tf-nightly 2.18.0.dev20240905 for video mode
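
Because the onnx and tflite paths pin different torch versions, it can help to check the active environment before exporting. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` and the two pin dictionaries are illustrative and not part of this repository.

```python
from importlib import metadata

# Pins copied from the requirements listed above.
ONNX_PINS = {"torch": "2.2.1", "onnx": "1.16.2"}
TFLITE_IMAGE_PINS = {
    "torch": "2.4.0",
    "ai-edge-torch": "0.2.0",
    "tf-nightly": "2.18.0.dev20240811",  # use 2.18.0.dev20240905 for video mode
}


def find_mismatches(pins):
    """Return {package: (expected, installed_or_None)} for each unsatisfied pin."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some wheels report a local suffix such as "+cpu"; compare the public part only.
        public = installed.split("+")[0] if installed else None
        if public != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(ONNX_PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is installed at the expected version.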
onnx
python3 export_image_predictor.py --framework onnx
python3 export_video_predictor.py --framework onnx
tflite
export PJRT_DEVICE=CPU
python3 export_image_predictor.py --framework tflite
python3 export_video_predictor.py --framework tflite
onnx
download_onnx_models.sh
python3 export_image_predictor.py --framework onnx --mode import
python3 export_video_predictor.py --framework onnx --mode import
tflite
download_tflite_models.sh
python3 export_image_predictor.py --framework tflite --mode import
python3 export_video_predictor.py --framework tflite --mode import
python3 export_image_predictor.py --framework tflite --mode import --image_size 512
python3 export_video_predictor.py --framework tflite --mode import --image_size 512
The complex tensor used in RotaryEnc is replaced with matmul. To test this behavior, you can also run the predictor with torch.
python3 export_video_predictor.py --framework torch
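
To illustrate why a complex-tensor rotary encoding can be rewritten with real-valued matrix products, here is a small standard-library sketch. It is not the code used in this repository; the function names are hypothetical, and it only shows that multiplying a pair `(a, b)` by `e^{iθ}` as a complex number gives the same result as applying a 2x2 rotation matrix.

```python
import cmath
import math


def rotate_complex(pairs, angles):
    """Rotate each (a, b) pair by treating it as the complex number a + bi."""
    out = []
    for (a, b), theta in zip(pairs, angles):
        z = complex(a, b) * cmath.exp(1j * theta)
        out.append((z.real, z.imag))
    return out


def rotate_matmul(pairs, angles):
    """Apply the same rotation as a real 2x2 matrix product, with no complex numbers."""
    out = []
    for (a, b), theta in zip(pairs, angles):
        c, s = math.cos(theta), math.sin(theta)
        # [[c, -s], [s, c]] @ [a, b]
        out.append((c * a - s * b, s * a + c * b))
    return out


pairs = [(1.0, 0.0), (0.5, -2.0), (3.0, 4.0)]
angles = [0.0, math.pi / 2, 0.3]

# Both formulations agree up to floating-point rounding.
for p, q in zip(rotate_complex(pairs, angles), rotate_matmul(pairs, angles)):
    assert math.isclose(p[0], q[0], abs_tol=1e-12)
    assert math.isclose(p[1], q[1], abs_tol=1e-12)
```

The real-valued form avoids complex tensor types altogether, which are a common obstacle when exporting models to other runtimes.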
The exported artifacts are stored in the following directories.
output/*
model/*
You can also download the exported models from the following URLs.
- https://storage.googleapis.com/ailia-models/segment-anything-2/image_encoder_hiera_t.onnx
- https://storage.googleapis.com/ailia-models/segment-anything-2/prompt_encoder_hiera_t.onnx
- https://storage.googleapis.com/ailia-models/segment-anything-2/mask_decoder_hiera_t.onnx
- https://storage.googleapis.com/ailia-models/segment-anything-2/memory_encoder_hiera_t.onnx
- https://storage.googleapis.com/ailia-models/segment-anything-2/mlp_hiera_t.onnx
- https://storage.googleapis.com/ailia-models/segment-anything-2/memory_attention_hiera_t.onnx (6dim matmul, batch = N)
- https://storage.googleapis.com/ailia-models/segment-anything-2/memory_attention_hiera_t.opt.onnx (4dim matmul, batch = 1)
(The Prompt Encoder model was replaced on 2024/12/19 because a problem was found in it.)
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/image_encoder_hiera_t.tflite
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/prompt_encoder_hiera_t.tflite
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/mask_decoder_hiera_t.tflite
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/mlp_hiera_t.tflite
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/memory_encoder_hiera_t.tflite
- https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/memory_attention_hiera_t.tflite (4dim matmul, batch = 1, num_maskmem = 1)
The tflite memory attention model does not support dynamic shapes, so num_maskmem and max_obj_ptrs_in_encoder must be fixed to 1.
(The Prompt Encoder model was replaced on 2024/12/19 because a problem was found in it.)
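
As an alternative to fetching each file by hand, the URLs listed above can be generated and downloaded with a short script. This is a sketch under assumptions: the helper names `model_urls` and `download_all` are hypothetical, the destination directory is left to the caller because the location expected by `--mode import` is not stated here, and the optimized `memory_attention_hiera_t.opt.onnx` variant is not included.

```python
import os
import urllib.request

# Base URLs and component names taken from the download lists above.
BASE_URLS = {
    "onnx": "https://storage.googleapis.com/ailia-models/segment-anything-2/",
    "tflite": "https://storage.googleapis.com/ailia-models-tflite/segment-anything-2/",
}
COMPONENTS = [
    "image_encoder",
    "prompt_encoder",
    "mask_decoder",
    "memory_encoder",
    "mlp",
    "memory_attention",
]


def model_urls(framework):
    """Build the hiera_t download URLs for 'onnx' or 'tflite'."""
    base = BASE_URLS[framework]
    return [f"{base}{name}_hiera_t.{framework}" for name in COMPONENTS]


def download_all(framework, dest_dir):
    """Download every model file for the framework, skipping files that already exist."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in model_urls(framework):
        target = os.path.join(dest_dir, os.path.basename(url))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)


# Example (performs network access; the directory name is an assumption):
# download_all("onnx", "model")
```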
main
https://github.com/axinc-ai/segment-anything-2/tree/f36169e87ec302c75279fadc60cda1c3763165eb