-
Notifications
You must be signed in to change notification settings - Fork 174
Open
Description
Basic Information - Models Used
MiniMax2.1
Information about environment and deployment
CUDA_VISIBLE_DEVICES=2,3,4,5
vllm serve /media/1/models/MiniMax-M2.1
--served-model-name "self-pp-minimax-2-1"
--port 5011
--host 0.0.0.0
--max-num-seqs 32
--tensor-parallel-size 4
--gpu-memory-utilization 0.95
--max-model-len 196608
--tool-call-parser minimax_m2
--reasoning-parser minimax_m2
--enable-auto-tool-choice
--trust-remote-code >logs/minimax_0107.log 2>&1 &
Description
Steps to reproduce
The bug can be reproduced with the following steps:
Expected behavior
Error logs
Paste the related screenshots here
Metadata
Metadata
Assignees
Labels
No labels