Description
After training, I use the following script to test the model on 8 GPUs.
```shell
#!/bin/bash
echo "Start running..."
export HF_ENDPOINT=https://hf-mirror.com
accelerate launch test_ccot.py \
    --model_name_or_path /media/cfs/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/outputs/pcot-llama1binst-lora-3-24/checkpoint-22000 \
    --dataset_name ./data/whynlp-gsm8k-aug \
    --label_names labels \
    --lora_target_modules q_proj-k_proj-v_proj-o_proj-down_proj-up_proj-gate_pro \
    --lora_modules_to_save "" \
    --remove_unused_columns false \
    --per_device_train_batch_size 64 \
    --per_device_eval_batch_size 64 \
    --auto_find_batch_size \
    --block_size 1024 \
    --bf16 \
    --torch_dtype bfloat16 \
    --do_eval \
    --do_predict \
    --report_to none \
    --run_name pcot-llama1binst-lora-3-24-test \
    --overwrite_output_dir \
    --output_dir outputs/test/pcot-llama1binst-lora-3-24
```
It fails with the following error:
```
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 854, in <module>
[rank0]:     main()
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 814, in main
[rank0]:     metrics = eval_ccot(trainer.model, split="validation")
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 794, in eval_ccot
[rank0]:     decoded_tokens = model.generate(
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/peft/peft_model.py", line 886, in generate
[rank0]:     return self.get_base_model().generate(*args, **kwargs)
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/models/generate.py", line 114, in generate
[rank0]:     outputs = self.forward(**forward_inputs, return_dict=True)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/models/modeling_llama.py", line 127, in forward
[rank0]:     shift_logits = shift_logits.view(-1, self.config.vocab_size)
[rank0]: RuntimeError: shape '[-1, 32000]' is invalid for input of size 1214878720
```
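For context on the failure: `Tensor.view(-1, n)` only succeeds when the tensor's total element count is divisible by `n`, and the numbers in the traceback show this is not the case here, so the reshape of `shift_logits` cannot work. This usually suggests a mismatch between `self.config.vocab_size` (32000 here) and the actual last dimension of the logits the checkpoint produces. A quick arithmetic check of the numbers from the error (plain Python, no torch required):

```python
# Values taken directly from the RuntimeError message above.
numel = 1214878720   # total element count of shift_logits
vocab_size = 32000   # self.config.vocab_size used in view(-1, vocab_size)

# view(-1, vocab_size) requires numel % vocab_size == 0.
remainder = numel % vocab_size
print(remainder)     # non-zero, so the reshape must fail
```

Since the remainder is non-zero, no value of `-1` can satisfy the shape, which is exactly what the `RuntimeError` reports. It may be worth verifying that the config loaded at test time matches the one used in training (e.g. that the checkpoint's embedding size and `vocab_size` agree).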