Description
Bug report
After fine-tuning the model, I used the following command to convert my MaxText Gemma3 checkpoint to Hugging Face format:
python3 -m MaxText.utils.ckpt_conversion.to_huggingface src/MaxText/configs/base.yml \
  model_name='gemma3-4b' \
  hf_access_token={hf token} \
  load_parameters_path={my checkpoint path} \
  base_output_directory=/tmp/gemma3-4B-cpt-hf \
  use_multimodal=false \
  scan_layers=false
(I tried use_multimodal with both false and true. scan_layers has to be false because I cannot create a scanned checkpoint, which is another bug.)
For testing purposes, I set up a vLLM server on the TPU:
pip3.11 install vllm-tpu
vllm serve /tmp/gemma3-4B-cpt-hf \
  --disable-log-requests \
  --tensor_parallel_size=4 \
  --api-key {my api key}
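For context, the server would be exercised through vLLM's OpenAI-compatible endpoint. A minimal sketch of such a test request, assuming vLLM's default port 8000 and the checkpoint path above as the model name (hypothetical illustration, not output from the failed run):

from openai import OpenAI

# Point the OpenAI client at the local vLLM server (default port 8000).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="my-api-key")

response = client.chat.completions.create(
    model="/tmp/gemma3-4B-cpt-hf",  # vLLM serves the model under the path it was given
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

In my case the server never reached this point, because loading the converted checkpoint fails with the error below.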
Error message from vLLM:
ValueError: Gemma3 uses `gelu_pytorch_tanh` as the hidden activation function. Please set `hidden_act` and `hidden_activation` to `gelu_pytorch_tanh`.
The MaxText conversion config sets "hidden_activation": "gelu" for the text model, whereas the official Hugging Face Gemma3 text-generation config uses "gelu_pytorch_tanh". vLLM explicitly checks for gelu_pytorch_tanh, which results in the error above.
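As a local workaround (not a fix for the converter itself), the exported config.json can be patched before serving. A minimal sketch, assuming the checkpoint was written to /tmp/gemma3-4B-cpt-hf with the standard Hugging Face layout; the nested "text_config" key only applies to multimodal exports:

import json
from pathlib import Path

# Hypothetical workaround: rewrite the activation fields the converter emitted
# so vLLM's Gemma3 check passes. Path assumes the export directory used above.
config_path = Path("/tmp/gemma3-4B-cpt-hf/config.json")
config = json.loads(config_path.read_text())

# Multimodal exports nest the language model under "text_config";
# text-only exports keep these fields at the top level.
text_config = config.get("text_config", config)
text_config["hidden_act"] = "gelu_pytorch_tanh"
text_config["hidden_activation"] = "gelu_pytorch_tanh"

config_path.write_text(json.dumps(config, indent=2))

This only papers over the symptom; the converter itself should emit gelu_pytorch_tanh so the exported checkpoint matches the upstream Gemma3 config.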
Logs/Output
Environment Information
TPU creation command
export TPU_COUNT=4
gcloud alpha compute tpus tpu-vm create $TPU_ID \
--zone="${ZONE}" \
--accelerator-type="v6e-${TPU_COUNT}" \
--version=v2-alpha-tpuv6e \
--spot \
  --service-account=${service_account}
I tried installing MaxText both from source and with uv, as suggested in the MaxText documentation.
Additional Context
No response