Fix Batch Size Mismatch When Using crops_n_layers in mask-generation Pipeline #35530
#35627
86
−27
Overview
This pull request addresses a batch size mismatch that arises when users set crops_n_layers > 0 in the Segment Anything Model (SAM) powered mask-generation pipeline. Previously, running the pipeline with this setting raised a runtime error because the model received mismatched batch dimensions for the image embeddings and the input points. Fixes #35530. @Rocketknight1, kindly review this at your earliest convenience.
The Problem
When segmenting images in "segment everything" mode with cropping enabled (crops_n_layers=1 or more), the pipeline splits the image into multiple crops and generates a grid of points for each crop so that masks can be computed in parallel. In some scenarios, particularly when the input is not batched before the cropping procedure, the pipeline produced the additional crops as separate batch entries. This led to:
• A mismatch in the batch dimension between the generated image embeddings and the corresponding point embeddings.
• A ValueError reporting different batch sizes, preventing the pipeline from completing the segmentation process.
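The failure mode can be illustrated with a small, self-contained sketch. It is not the pipeline's actual code: count_crops assumes the crop layout of the original Segment Anything crop scheme (the full image plus a 2**(i+1) by 2**(i+1) grid for each crop layer), and check_batch_sizes is a hypothetical stand-in for the shape check that raises the error. The specific batch sizes are illustrative only.

```python
def count_crops(crops_n_layers: int) -> int:
    """Count crop boxes, ASSUMING the full image plus a
    2**(i+1) x 2**(i+1) grid of crops for each layer i."""
    total = 1  # the uncropped image itself
    for layer in range(crops_n_layers):
        per_side = 2 ** (layer + 1)
        total += per_side * per_side
    return total


def check_batch_sizes(embedding_batch: int, points_batch: int) -> None:
    """Hypothetical stand-in for the model's consistency check."""
    if embedding_batch != points_batch:
        raise ValueError(
            f"Batch size mismatch: image embeddings have batch size "
            f"{embedding_batch}, input points have batch size {points_batch}."
        )


# With crops_n_layers=1 there are 5 crops. If each crop becomes its own
# batch entry while the points still form a single batch entry, the
# check fails.
n_crops = count_crops(1)
try:
    check_batch_sizes(embedding_batch=n_crops, points_batch=1)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

With crops_n_layers=0 only the full image is processed, so both sides have a batch size of 1 and the check passes, which matches the report that the error appears only when cropping is enabled.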
Proposed Solution
We introduced internal improvements that unify the handling of image crops and points so they remain synchronized:
• Ensured that all crops from a single original image preserve the same batch index.
• Correctly merged the batched embeddings back together, with consistent shapes, so each batch item has the correct number of crops and associated point embeddings.
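The two points above can be sketched in plain Python. This is a minimal illustration of the bookkeeping idea, not the code changed in this pull request: flatten_crops, regroup_by_image, and the record fields are hypothetical names introduced only for this example.

```python
from collections import defaultdict


def flatten_crops(crops_per_image, points_per_crop):
    """Return one record per crop, tagged with the index of the
    original image it was cut from (hypothetical record layout)."""
    records = []
    for image_idx, n_crops in enumerate(crops_per_image):
        for crop_idx in range(n_crops):
            records.append(
                {
                    "image_idx": image_idx,  # batch index of the source image
                    "crop_idx": crop_idx,
                    "n_points": points_per_crop,
                }
            )
    return records


def regroup_by_image(records):
    """Merge per-crop records back so each batch item holds all of
    its crops together with their point grids."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["image_idx"]].append(record)
    return dict(grouped)


# Two input images, five crops each, 1024 grid points per crop.
records = flatten_crops(crops_per_image=[5, 5], points_per_crop=1024)
grouped = regroup_by_image(records)
```

Because every crop carries the batch index of its source image, the regrouped result has one entry per original image, and the number of crops always equals the number of point grids for that entry.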
Approach
Screenshots
Before
After
Prior to this fix, the Before screenshot shows an error stack trace indicating the batch mismatch. The After screenshot demonstrates a smooth run, producing valid masks across cropped regions.