Open
Description
Bug Description
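When loading the pretrained model InternVLA-N1-DualVLN, the log reports a large number of parameters with status UNEXPECTED or MISSING. Some UNEXPECTED parameters can be ignored when the architectures differ, but MISSING parameters are re-initialized, so those weights are not the pretrained ones.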
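For triage, the two statuses can be reproduced by comparing the parameter names the model expects with the names stored in the checkpoint. The snippet below is a minimal sketch of that comparison, not the loader's actual implementation. The helper name `diff_state_dict_keys` and the sample key lists are hypothetical; the key names are taken from the console output in this issue, with the layer index `0` assumed as one expansion of `{0...11}`.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Classify parameter names with the same two statuses the load report uses."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint: newly initialized.
        "MISSING": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not consumed by the model: dropped.
        "UNEXPECTED": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical key sets for illustration only.
model_keys = [
    "model.visual.patch_embed.proj.weight",
    "model.action_decoder.weight",
    "model.action_encoder.weight",
]
checkpoint_keys = [
    "model.visual.patch_embed.proj.weight",
    # Layer index 0 is an assumed expansion of "{0...11}" in the report.
    "model.language_model.traj_dit.model.language_model.layers.0.attn1.norm_q.bias",
]

report = diff_state_dict_keys(model_keys, checkpoint_keys)
for status, keys in report.items():
    for key in keys:
        print(f"{key} | {status}")
```

If the same suffix shows up under both statuses with different prefixes, that usually points to a key-naming mismatch between the checkpoint and the model class rather than to weights that are truly absent.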
Steps to Reproduce
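After rebuilding the environment with a torch version that supports the sm_120 architecture, I made only API-compatibility changes to the code and then ran http_internvla_server.py. The console prints the following (this happens with both https://huggingface.co/InternRobotics/InternVLA-N1-wo-dagger and https://huggingface.co/InternRobotics/InternVLA-N1-DualVLN):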
Loading weights: 100%|█████████████████████████████████████████████████████████████| 729/729 [00:06<00:00, 120.42it/s, Materializing param=model.visual.patch_embed.proj.weight]
InternVLAN1ForCausalLM LOAD REPORT from: checkpoints/InternVLA-N1-DualVLN
Key | Status |
--------------------------------------------------------------------------------------------------------+------------+-
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn1.norm_q.bias | UNEXPECTED |
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn2.to_k.weight | UNEXPECTED |
.....
model.action_decoder.weight | MISSING |
model.action_encoder.weight | MISSING |
Notes:
- UNEXPECTED :can be ignored when loading from different task/architecture; not ok if you expect identical arch.
- MISSING :those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
The image processor of type `Qwen2VLImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. Note that this behavior will be extended to all models in a future release.
The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
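To make the report above easier to compare across the two checkpoints, the rows can be grouped by status. This is a rough sketch that assumes every row has the `key | STATUS |` shape shown above; the `parse_load_report` helper and the `ROW` pattern are hypothetical, and the elided rows (`.....`) are left out rather than guessed.

```python
import re
from collections import defaultdict

# Matches rows such as "model.action_decoder.weight | MISSING |".
# The header and separator lines do not match and are skipped.
ROW = re.compile(r"^\s*(?P<key>\S+)\s*\|\s*(?P<status>UNEXPECTED|MISSING)\s*\|")

# Rows copied from the console output in this issue (elided rows omitted).
REPORT = """\
Key | Status |
--------------------+------------+-
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn1.norm_q.bias | UNEXPECTED |
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn2.to_k.weight | UNEXPECTED |
model.action_decoder.weight | MISSING |
model.action_encoder.weight | MISSING |
"""


def parse_load_report(text):
    """Group parameter names from a load report by their status column."""
    by_status = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            by_status[match.group("status")].append(match.group("key"))
    return dict(by_status)


parsed = parse_load_report(REPORT)
for status, keys in parsed.items():
    print(status, len(keys))
```

Saving the full report for each checkpoint and diffing the parsed results would show whether both checkpoints fail on the same keys.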
Expected Behavior
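All model parameters should load correctly: no MISSING parameters, and as few UNEXPECTED parameters as possible (ideally zero), so that model performance matches the pretrained checkpoint.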
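The expectation can be written as a check that fails fast instead of silently serving re-initialized weights. The sketch below is only an illustration: `check_fully_loaded` and `max_unexpected` are hypothetical names, and how the two key lists are obtained from the server's loading code is an assumption (for example, transformers' `from_pretrained` accepts `output_loading_info=True`, which returns the missing and unexpected keys alongside the model).

```python
def check_fully_loaded(missing_keys, unexpected_keys, max_unexpected=0):
    """Raise if any parameter is missing or too many checkpoint keys are unused."""
    problems = []
    if missing_keys:
        problems.append(f"{len(missing_keys)} MISSING (newly initialized): {missing_keys[:5]}")
    if len(unexpected_keys) > max_unexpected:
        problems.append(f"{len(unexpected_keys)} UNEXPECTED (ignored): {unexpected_keys[:5]}")
    if problems:
        raise RuntimeError("checkpoint did not load cleanly: " + "; ".join(problems))


# Key names taken from the report in this issue; the lists themselves are illustrative.
missing = ["model.action_decoder.weight", "model.action_encoder.weight"]
unexpected = []

try:
    check_fully_loaded(missing, unexpected)
    outcome = "ok"
except RuntimeError as err:
    outcome = str(err)

print(outcome)
```

Raising `max_unexpected` would tolerate a known number of unused checkpoint keys while still rejecting any MISSING parameter.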
Screenshots/Videos
No response
Environment
- OS: Windows 10
- GPU: RTX 5060 Ti
- GPU-driver version: 591.74
Release version or Commit ID
Additional Context
ZhangzrJerry