Strange prediction

Hello, thank you for sharing your code. I encountered some issues while running the inference code. The video I tested with is 'v_0A6fEUxdDMk.mp4' from the test set of the VATEX dataset, which is a video of a chef making sushi.

First, I downloaded the pre-trained parameters from [luoruipu1/Valley2-7b](https://huggingface.co/luoruipu1/Valley2-7b) and ran the `run_valley_llamma_v2.py` file. The user prompt I used was `<video> Describe the video concisely`, and the answer I got was `['10. Can you describe the scene in the video']`.

Then, I used the same parameters to run the `run_valley.py` file, and the answer I received was `['10.']`.

I'm not sure why this is happening. I haven't modified any code. Could it be that I used the wrong parameters or the wrong prompt format?

Subsequently, I downloaded the parameters from [Zhaoziwang/chinese_valley7b_v1](https://huggingface.co/Zhaoziwang/chinese_valley7b_v1) instead and ran the `run_valley.py` code with them. When I used the user prompt `"请描述这个视频\n<video>"` ("Please describe this video", followed by the video placeholder), the returned result was an empty string. When I changed the prompt to `"<video>请描述这个视频\n"`, the result was repeated garbage characters.
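To make the two prompt layouts explicit, here is a minimal sketch of how the strings differ. The helper `build_prompt` and the constant `VIDEO_TOKEN` are hypothetical names used only for illustration; they are not part of the Valley repository, and I am only assuming that the placeholder position is what matters here.

```python
# Hypothetical helper for illustration only; not part of the Valley codebase.
VIDEO_TOKEN = "<video>"


def build_prompt(text: str, video_first: bool) -> str:
    """Return the user prompt with the video placeholder before or after the text."""
    if video_first:
        # Layout that produced repeated garbage characters in my test.
        return f"{VIDEO_TOKEN}{text}\n"
    # Layout that produced an empty string in my test.
    return f"{text}\n{VIDEO_TOKEN}"


text = "请描述这个视频"  # "Please describe this video"
prompt_video_last = build_prompt(text, video_first=False)
prompt_video_first = build_prompt(text, video_first=True)
```

Both strings contain exactly one `<video>` placeholder, so the only difference between the two runs is where it sits relative to the text.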

How can I run the Valley model correctly?

Strange prediction #37
