Thank you for your work! Is the positional encoding injected at every layer of the DiT? If so, doesn't the z-axis coordinate need to change during the diffusion process for the perspective transformation? Wouldn't keeping it fixed mislead the model?