If I understand correctly, a bias on q_proj should not affect computational invariance. The orthogonal matrix $R$ is only inserted into the $XW$ term, and since $RR^{T} = I$ the bias is left untouched: $Y = XW + b = (XR)(R^{T}W) + b$.
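For anyone who wants to check this numerically, here is a minimal NumPy sketch of the identity. The shapes, the variable names, and the QR-based random orthogonal matrix are my own assumptions for illustration and are not taken from the repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 4, 8, 6  # hypothetical sizes, chosen only for the check

X = rng.standard_normal((n, d_in))      # input activations
W = rng.standard_normal((d_in, d_out))  # projection weight, used as Y = X W + b
b = rng.standard_normal(d_out)          # projection bias

# Random orthogonal matrix R (so R @ R.T = I), taken from a QR decomposition.
R, _ = np.linalg.qr(rng.standard_normal((d_in, d_in)))

Y_ref = X @ W + b                # original layer output
Y_rot = (X @ R) @ (R.T @ W) + b  # rotated input and weight, bias unchanged

max_abs_diff = float(np.max(np.abs(Y_ref - Y_rot)))
outputs_match = bool(np.allclose(Y_ref, Y_rot))
print(outputs_match, max_abs_diff)
```

The two outputs should agree up to floating-point rounding, because $R$ cancels inside the product before the bias is added.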
Hi, great work!
If q/k/v_proj have biases in my model, does that affect computational invariance?