Bug when running test_ccot.py #6

@scarydemon2

Description


After training, I use the following script to test the resulting model on 8 GPUs.

#!/bin/bash
echo "Start running..."
export HF_ENDPOINT=https://hf-mirror.com
accelerate launch test_ccot.py \
    --model_name_or_path /media/cfs/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/outputs/pcot-llama1binst-lora-3-24/checkpoint-22000 \
    --dataset_name ./data/whynlp-gsm8k-aug \
    --label_names labels \
    --lora_target_modules q_proj-k_proj-v_proj-o_proj-down_proj-up_proj-gate_pro \
    --lora_modules_to_save "" \
    --remove_unused_columns false \
    --per_device_train_batch_size 64 \
    --per_device_eval_batch_size 64 \
    --auto_find_batch_size \
    --block_size 1024 \
    --bf16 \
    --torch_dtype bfloat16 \
    --do_eval \
    --do_predict \
    --report_to none \
    --run_name pcot-llama1binst-lora-3-24-test \
    --overwrite_output_dir \
    --output_dir outputs/test/pcot-llama1binst-lora-3-24

And I got the following error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 854, in <module>
[rank0]:     main()
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 814, in main
[rank0]:     metrics = eval_ccot(trainer.model, split="validation")
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/test_ccot.py", line 794, in eval_ccot
[rank0]:     decoded_tokens = model.generate(
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/peft/peft_model.py", line 886, in generate
[rank0]:     return self.get_base_model().generate(*args, **kwargs)
[rank0]:   File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/models/generate.py", line 114, in generate
[rank0]:     outputs = self.forward(**forward_inputs, return_dict=True)
[rank0]:   File "/home/products-understanding-nlp/gaotianhao/reasoning_embedding/latent_cot/PCCoT/models/modeling_llama.py", line 127, in forward
[rank0]:     shift_logits = shift_logits.view(-1, self.config.vocab_size)
[rank0]: RuntimeError: shape '[-1, 32000]' is invalid for input of size 1214878720
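For context, `.view(-1, V)` can only succeed when the tensor's total element count is a multiple of `V`, and the reported size 1214878720 is not divisible by 32000. One hypothesis (I cannot see the checkpoint's config, so this is only a guess) is that the logits were produced with a different vocabulary size than `self.config.vocab_size`. A quick plain-Python check of the numbers from the traceback:

```python
# Check the figures reported in the traceback: .view(-1, vocab_size)
# must fail because the element count is not a multiple of vocab_size.
total_elements = 1214878720  # "input of size 1214878720"
vocab_size = 32000           # "shape '[-1, 32000]'"

quotient, remainder = divmod(total_elements, vocab_size)
print(quotient, remainder)   # -> 37964 30720 (nonzero remainder, so view fails)
```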
