z-axis coordinate question

Thank you for your work! Is the positional encoding injected at every layer of the DiT? If so, shouldn't the z-axis coordinate change during the diffusion process for a perspective transformation? Wouldn't keeping it fixed mislead the model?