assert num_buckets == self.num_buckets error #220
Comments
Hi @gudrb, thanks for your attention to our work! Does the class token exist in the fine-tuned model? If the class token exists If not, namely
Thank you for answering. I am not using a class token, but I still tried the skip=1 option, and it gives a key error when I load the pretrained model. I tried a random input and observed the blk function, such as
and I found that when I change the second dimension of the variable x to another value, such as 196 -> N (not 196), I get the error. Is it possible to use a pretrained model that utilizes iRPE for a different sequence length, such as a varying number of patches?
Yes. You need to pass the two arguments iRPE is a 2D relative position encoding. If https://github.com/microsoft/Cream/blob/main/MiniViT/Mini-DeiT/irpe.py#L553
Now, it is working. I modified the code for the MiniAttention class from (
I hope this is the correct way to utilize the MiniAttention class when fine-tuning the task with a different sequence length. Thank you.
Do I need to crop or interpolate the pretrained relative positional encoding parameters when the sequence length changes? When I use the pretrained Mini-DeiT with both absolute and relative positional encodings, I handle the absolute positional encoding by cropping when the new sequence length is shorter than 14 and by interpolating when it is longer.
@gudrb No, you don't. Relative position encoding can be adapted to a longer sequence.
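To illustrate why no cropping or interpolation is needed, here is a minimal sketch. It is not the iRPE implementation: it assumes a simple clipped 2D offset, and the `bucket_id` and `used_buckets` helpers as well as `beta = 3` are hypothetical choices made for illustration (iRPE itself uses its own mapping from offsets to buckets). The point is only that the learnable table is indexed by relative offset, so its size depends on the clipping range and not on the number of patches.

```python
def bucket_id(dy, dx, beta=3):
    """Map a 2D relative offset to a bucket index by clipping each axis to [-beta, beta]."""
    cy = max(-beta, min(beta, dy)) + beta
    cx = max(-beta, min(beta, dx)) + beta
    return cy * (2 * beta + 1) + cx


def used_buckets(height, width, beta=3):
    """Collect every bucket index touched by a height x width grid of patches."""
    ids = set()
    for y1 in range(height):
        for x1 in range(width):
            for y2 in range(height):
                for x2 in range(width):
                    ids.add(bucket_id(y2 - y1, x2 - x1, beta))
    return ids


table_size = (2 * 3 + 1) ** 2  # number of table entries, independent of the grid size
for h, w in [(14, 14), (3, 6), (20, 20)]:
    ids = used_buckets(h, w)
    # Every index fits inside the same table, whatever the grid shape.
    print(h, w, len(ids), max(ids) < table_size)
```

A smaller grid simply uses a subset of the table, and a larger grid reuses the outermost buckets for distant pairs, so the pretrained parameters can be loaded unchanged.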
I am trying to fine-tune mini_deit_tiny_patch16_224 on another subtask that has a different sequence length of 18 (number of patches) with dimension 192.
When I run the following code:
    for blk in self.blocks:
        x = blk(x)
I get an error from line 574 of irpe.py: "assert num_buckets == self.num_buckets".
num_buckets is 50, but self.num_buckets is 49.
Do you know why this happens and how I can fix it?
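One plausible reading of the 50 vs. 49 mismatch, sketched under stated assumptions: if the grid side is inferred as the integer square root of the sequence length, and any leftover tokens are treated as extra (class-like) tokens that share one additional bucket, then 18 tokens become a 4 x 4 grid plus 2 leftover tokens. The helpers `infer_grid` and `expected_buckets`, the base value of 49 buckets, and the 3 x 6 grid shape are hypothetical and only mirror the numbers reported above; irpe.py remains the reference for the actual logic.

```python
import math


def infer_grid(seq_len, height=None, width=None):
    """Return (height, width, leftover) for a token sequence.

    Assumption: without an explicit shape, the grid is taken as the
    largest square that fits into seq_len.
    """
    if height is None or width is None:
        height = width = math.isqrt(seq_len)
    leftover = seq_len - height * width
    return height, width, leftover


def expected_buckets(seq_len, base_buckets=49, height=None, width=None):
    """Assumption: leftover tokens share one extra bucket on top of the base table."""
    _, _, leftover = infer_grid(seq_len, height, width)
    return base_buckets + (1 if leftover > 0 else 0)


print(expected_buckets(196))                    # 14 x 14 grid, no leftover tokens
print(expected_buckets(18))                     # 4 x 4 grid, 2 leftover tokens
print(expected_buckets(18, height=3, width=6))  # explicit 3 x 6 grid, no leftover tokens
```

Under these assumptions, a sequence length that is not a perfect square produces leftover tokens and therefore one more bucket than the pretrained table holds, while supplying the real grid shape explicitly avoids the mismatch.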