[Bug]: Many UNEXPECTED and MISSING parameters reported when loading model weights #250

### Bug Description

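When loading the pretrained model InternVLA-N1-DualVLN, the log reports many parameters with status UNEXPECTED or MISSING. Some UNEXPECTED parameters can be ignored when the architectures differ, but MISSING parameters are newly initialized instead of being loaded from the checkpoint, so they do not carry pretrained values.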

### Steps to Reproduce

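After rebuilding the environment with a torch version that supports the `sm_120` architecture, I changed the code only where API differences required it and then ran `http_internvla_server.py`. The console prints the output below. Both https://huggingface.co/InternRobotics/InternVLA-N1-wo-dagger and https://huggingface.co/InternRobotics/InternVLA-N1-DualVLN produce it: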

```
Loading weights: 100%|█████████████████████████████████████████████████████████████| 729/729 [00:06<00:00, 120.42it/s, Materializing param=model.visual.patch_embed.proj.weight]
InternVLAN1ForCausalLM LOAD REPORT from: checkpoints/InternVLA-N1-DualVLN
Key                                                                                                     | Status     |
--------------------------------------------------------------------------------------------------------+------------+-
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn1.norm_q.bias                    | UNEXPECTED |
model.language_model.traj_dit.model.language_model.layers.{0...11}.attn2.to_k.weight                    | UNEXPECTED |
.....
model.action_decoder.weight                                                                             | MISSING    |
model.action_encoder.weight                                                                             | MISSING    |

Notes:
- UNEXPECTED    :can be ignored when loading from different task/architecture; not ok if you expect identical arch.
- MISSING       :those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
The image processor of type `Qwen2VLImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. Note that this behavior will be extended to all models in a future release.
The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
```
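
To narrow down which submodules are affected, the parameter names the model expects can be compared with the names stored in the checkpoint. The snippet below is only a minimal sketch: `diff_keys` is a hypothetical helper (not part of InternNav or transformers), and the key lists are toy values copied from the report above rather than the full state dict.

```python
from collections import Counter


def diff_keys(model_keys, checkpoint_keys, depth=3):
    """Compare parameter names and summarise mismatches by key prefix.

    Hypothetical helper for illustration only.
    - missing:    names the model expects but the checkpoint lacks
    - unexpected: names the checkpoint holds but the model does not use
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)

    def prefix(key):
        # Keep the first `depth` dotted components of the parameter name.
        return ".".join(key.split(".")[:depth])

    summary = {
        "MISSING": Counter(prefix(k) for k in missing),
        "UNEXPECTED": Counter(prefix(k) for k in unexpected),
    }
    return missing, unexpected, summary


# Toy key names taken from the load report (assumption: not the full set).
model_keys = [
    "model.action_decoder.weight",
    "model.action_encoder.weight",
    "model.visual.patch_embed.proj.weight",
]
checkpoint_keys = [
    "model.visual.patch_embed.proj.weight",
    "model.language_model.traj_dit.model.language_model.layers.0.attn1.norm_q.bias",
]

missing, unexpected, summary = diff_keys(model_keys, checkpoint_keys)
for status, counts in summary.items():
    for name, count in counts.items():
        print(f"{status:<10} {name}: {count}")
```

With a real model, `model_keys` could come from `model.state_dict().keys()` and `checkpoint_keys` from the checkpoint files (for example via `safetensors.safe_open(...).keys()`); neither call is exercised here. If the MISSING and UNEXPECTED names share the same suffixes under different prefixes, that would point to renamed keys rather than absent weights.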

### Expected Behavior

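All model parameters should load correctly: no MISSING parameters, and as few UNEXPECTED parameters as possible (ideally zero), so that model performance matches the pretrained checkpoint.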
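
This expectation could be enforced with a small check that fails loudly instead of silently re-initializing weights. The sketch below assumes a result object exposing `missing_keys` and `unexpected_keys` attributes, which is the shape returned by PyTorch's `load_state_dict(strict=False)`; `assert_clean_load` and `LoadResult` are hypothetical names used only for illustration.

```python
from collections import namedtuple


def assert_clean_load(result, allow_unexpected=False):
    """Raise if a weight-loading result reports missing (or unexpected) keys.

    Hypothetical helper: `result` only needs `missing_keys` and
    `unexpected_keys` attributes.
    """
    missing = list(result.missing_keys)
    unexpected = list(result.unexpected_keys)

    problems = []
    if missing:
        problems.append(f"{len(missing)} MISSING, e.g. {missing[:3]}")
    if unexpected and not allow_unexpected:
        problems.append(f"{len(unexpected)} UNEXPECTED, e.g. {unexpected[:3]}")
    if problems:
        raise RuntimeError("Checkpoint does not match model: " + "; ".join(problems))


# Stand-in for a real loading result (assumption: toy data from the report).
LoadResult = namedtuple("LoadResult", ["missing_keys", "unexpected_keys"])

clean = LoadResult(missing_keys=[], unexpected_keys=[])
assert_clean_load(clean)  # passes silently

broken = LoadResult(
    missing_keys=["model.action_decoder.weight", "model.action_encoder.weight"],
    unexpected_keys=[],
)
try:
    assert_clean_load(broken)
    raised = False
except RuntimeError as err:
    raised = True
    message = str(err)
    print(message)
```

Passing `allow_unexpected=True` tolerates extra checkpoint entries while still rejecting any MISSING parameter, which matches the priority stated above.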

### Screenshots/Videos

_No response_

### Environment

- OS: Windows 10
- GPU: RTX 5060 Ti
- GPU-driver version: 591.74


### Release version or Commit ID

#108 

### Additional Context

#97 
