Hey Lukas,
Hope you're well :) I believe the weights you learned for the OpenCLIP ViT-L/14 laion400m_e32 model are for the module visual.ln_post, not visual as the folder name in this repo suggests.
Using visual.ln_post with this model in thingsvision returns 768-dimensional features, whereas visual returns 512-dimensional features. The weight matrices in transforms/OpenCLIP_ViT-L-14_laion400m_e32/visual/transform.npz are 768 by 768, matching visual.ln_post.
I'm pointing this out because the name mismatch causes issues when calling the .align method in thingsvision.
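
In case it helps to reproduce, here is a minimal sketch of how the shapes can be checked. It uses only the standard library and reads the .npy headers inside the .npz archive rather than loading the arrays. The helper names npz_shapes and matching_modules are my own (hypothetical, not part of thingsvision), and the 512 and 768 feature dimensions are just the ones I observed above. Calling numpy.load on the same file should show the same shapes.

```python
import ast
import struct
import zipfile


def npz_shapes(path):
    """Return {array_name: shape} for every array stored in an .npz archive.

    An .npz file is a zip archive of .npy files; each .npy file starts with
    a small header that records the array's shape.
    """
    shapes = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if not member.endswith(".npy"):
                continue
            with archive.open(member) as fh:
                if fh.read(6) != b"\x93NUMPY":
                    raise ValueError(f"{member} is not a .npy file")
                major, _minor = fh.read(2)
                # Format 1.0 stores the header length as uint16, 2.0+ as uint32.
                if major == 1:
                    (header_len,) = struct.unpack("<H", fh.read(2))
                else:
                    (header_len,) = struct.unpack("<I", fh.read(4))
                header_text = fh.read(header_len).decode("latin1").strip()
                header = ast.literal_eval(header_text)
            shapes[member[: -len(".npy")]] = tuple(header["shape"])
    return shapes


def matching_modules(shapes, feature_dims):
    """Return the module names whose feature dimension matches a square matrix.

    feature_dims maps a module name to the feature dimension it produces
    (assumed values, supplied by the caller).
    """
    square_dims = {s[0] for s in shapes.values() if len(s) == 2 and s[0] == s[1]}
    return [name for name, dim in feature_dims.items() if dim in square_dims]


# Usage (path taken from this repo; dimensions are the ones reported above):
# shapes = npz_shapes("transforms/OpenCLIP_ViT-L-14_laion400m_e32/visual/transform.npz")
# print(shapes)
# print(matching_modules(shapes, {"visual": 512, "visual.ln_post": 768}))
```

I kept the array names out of the check on purpose, since it only looks for a square matrix of the right size and doesn't assume how the arrays in transform.npz are named.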
cheers :)
Can