GPU Info: NVIDIA GeForce RTX 3090
Model used: rtmdet_l_8xb32-300e_coco.py
Deployment config file used: detection_tensorrt-fp16_static-640x640.py (you may find examples here)
Average inference times (just the `__call__` method of the wrappers used):
Average inference times (`get_sliced_prediction` function):
The difference was around 400 ms, but it might yield a significant speed improvement on edge devices such as the Jetson.
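For context, sliced prediction tiles the input image into overlapping crops and runs the detector on each crop before merging the results. Below is a minimal, hypothetical sketch of just the tiling arithmetic (not SAHI's actual implementation), using the 640x640 slice size that matches the static input shape of the deployment config above:

```python
def slice_bboxes(image_h, image_w, slice_h=640, slice_w=640, overlap=0.2):
    """Compute overlapping tile coordinates (x1, y1, x2, y2) covering an image.

    Hypothetical helper for illustration only; SAHI's real slicing logic
    handles edge padding and ratios differently.
    """
    step_h = int(slice_h * (1 - overlap))  # vertical stride between tiles
    step_w = int(slice_w * (1 - overlap))  # horizontal stride between tiles
    boxes = []
    y = 0
    while y < image_h:
        y2 = min(y + slice_h, image_h)
        x = 0
        while x < image_w:
            x2 = min(x + slice_w, image_w)
            boxes.append((x, y, x2, y2))
            if x2 >= image_w:
                break
            x += step_w
        if y2 >= image_h:
            break
        y += step_h
    return boxes
```

Each tile is sent through the detector independently, which is why the per-call latency of the wrapper compounds in `get_sliced_prediction` and why shaving even a few hundred milliseconds per call matters.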
Disclaimer: This implementation is not perfect, as it could be generalized to work with other frameworks (YOLO, Detectron2, etc.). I needed this for my current project and wanted to share it as a base for anyone interested.
Example usage:
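The example code itself is not shown above. As a hypothetical sketch only, calling a model through SAHI's public API with the slice size matched to the static 640x640 deployment config could look like this (the `model_type` string, engine path, and image path are assumptions, not the PR's actual interface):

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Assumption: "mmdet" is SAHI's standard MMDetection model type; the
# TensorRT wrapper added in this PR may register its own type instead.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="mmdet",
    model_path="end2end.engine",              # assumed TensorRT engine path
    config_path="rtmdet_l_8xb32-300e_coco.py",
    device="cuda:0",
)

# Slice size matches the static 640x640 input of the deployment config.
result = get_sliced_prediction(
    "demo.jpg",                               # assumed input image
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
result.export_visuals(export_dir="output/")
```

Running this requires a GPU, the exported TensorRT engine, and the model config, so treat it as a starting point rather than a drop-in script.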