[fix] resume partial tests for inference and serving #1054
Conversation
@zihugithub Looks like GitHub cannot determine your identity from your Git PR. Can you double-check the email used for the PR?

@zihugithub There are conflicts to be resolved.
Force-pushed 5d7b7d5 to 06f14b0
{ echo 'runs_on<<EOFRUNSON'; echo "$RUNNER_LABELS"; echo 'EOFRUNSON'; } >> $GITHUB_OUTPUT
{ echo 'container_volumes<<EOFVOLUMES'; echo "$VOLUMES"; echo 'EOFVOLUMES'; } >> $GITHUB_OUTPUT
{ echo 'unit_subsets<<EOFUNITSUBSETS'; echo "$UNIT_SUBSETS"; echo 'EOFUNITSUBSETS'; } >> $GITHUB_OUTPUT
{ echo 'unit_train_subsets<<EOFUNITSUBSETS'; echo "$UNIT_TRAIN_SUBSETS"; echo 'EOFUNITSUBSETS'; } >> $GITHUB_OUTPUT
Why not use the simpler syntax, like we did on line 74?
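For reference, the simpler form likely meant here is the plain key=value output syntax; a minimal sketch, where the runner-labels value is an assumption and the real $GITHUB_OUTPUT file is simulated with a temp file:

```shell
# The heredoc form is only required for values that may span multiple lines;
# a single-line value can use the plain key=value form.
RUNNER_LABELS='["self-hosted","gpu"]'   # assumed single-line value
GITHUB_OUTPUT="$(mktemp)"               # stand-in for the real CI output file
echo "runs_on=${RUNNER_LABELS}" >> "$GITHUB_OUTPUT"
cat "$GITHUB_OUTPUT"
```

If any of these values can contain newlines, the heredoc form in the diff above is the safe choice.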
"cd ${GITHUB_WORKSPACE}/Megatron-LM-FL"
"pip install --no-build-isolation . -vvv"
would this change work?
# install FlagGems
INSTALL_FLAGGEMS=(
  "pip install scikit-build scikit-build-core"
  "pip install git+https://github.com/FlagOpen/FlagGems.git@v3.0"
Suggested change:
- "pip install git+https://github.com/FlagOpen/FlagGems.git@v3.0"
+ "pip install git+https://github.com/flagos-ai/FlagGems.git@v3.0"
  "pip install gymnasium"
  "pip install dm-tree"
Can/shall we pin the package versions?
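A pinned variant of those two install lines could look like this; the exact version numbers below are illustrative assumptions, not versions validated against this repo:

```shell
# Same install list, but with versions pinned so CI runs are reproducible.
# Versions are illustrative; use the ones actually validated in CI.
INSTALL_DEPS=(
  "pip install gymnasium==1.0.0"
  "pip install dm-tree==0.1.8"
)
printf '%s\n' "${INSTALL_DEPS[@]}"
```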
Shall/can we expose these dependencies in a more formal way?
I mean ... these packages are not just required by the workflow; they are required for the software to run. These requirements should therefore be declared by the software, not the workflow.
Force-pushed 6cf4f3b to b1db41c
experiment:
  exp_name: deepseek_r1_distill_qwen-flaggems
- exp_dir: tests/functional_tests/test_cases/inference/deepseek_r1_distill_qwen-flaggems/results_test/7b-tp2
+ exp_dir: tests/functional_tests/inference/deepseek_r1_distill_qwen-flaggems/results_test/7b-tp2
Rename tests/functional_tests/ to tests/functional/.
Similarly, rename tests/test_utils/ to tests/utils/.
You get simplicity without losing anything.
@tengqm It must align with FlagScale's test directory structure, which uses the functional_tests naming. We will stick to this convention unless a change is deemed necessary.
# Compare actual output and golden reference output line by line (ignoring newline character differences)
for result_line, gold_value_line in zip(result_lines, gold_value_lines):
    print(result_line, gold_value_line)
maybe remove line 323?
# Concatenate the VLLM OpenAI-compatible interface URL
url = f"http://localhost:{deploy_config.port}/v1/completions"

# Set request headers (JSON format)
You don't have to write comments for lines that speak for themselves.
if key_value.startswith("val-core/openai/gsm8k/reward/mean"):
    # Split by colon and extract the numeric part after the colon
    value = key_value.split(":")[-1]
    # Convert string value to float and add to the result list
There are simply too many unnecessary comments like this in the code.
jobs:
  functional_test:
    name: functional-${{ inputs.device }}-${{ inputs.task }}-${{ inputs.model }}--${{ inputs.case }}
This name is too redundant.
Force-pushed c76e1e0 to 32a1a06
  fi
else
  echo "ℹ️ Running tests with system Python"
fi
I'm not sure it makes much sense to use conda in the test workflow.
pip uninstall -y flash_attn
pip install flash-attn==2.6.3 --no-build-isolation
python -c "import flash_attn; import flash_attn.layers; print('FlashAttention loaded successfully, version:', flash_attn.__version__)"
Please don't install individual packages in the workflow.
All requirements should be declared in the pyproject.toml file.
Even if some packages are only needed for functional tests, we can declare them as optional dependencies, e.g.

[project.optional-dependencies]
functional_test = [
    'foo==1.3.4',
]

We can then install them with a single command: uv pip install .[functional_test]
The command is reusable anywhere, while the dependencies are centrally managed.
This is a special use case that is incompatible with the environments of the other use cases; it will be reworked later.
That is why pyproject.toml has an optional-dependencies section.
tests/functional_tests/inference/deepseek_r1_distill_qwen-flaggems/results_gold/7b-tp2
log_info "Installing vllm-FL"

# Clone repository
retry_git_clone "$vllm_url" "$vllm_dir" "$RETRY_COUNT"
This mechanism is somewhat fragile.
We are tying the stability of FlagScale to that of vllm-FL.
We have to anticipate a scenario where FlagScale works fine, but there is a bug at the head of vllm-FL.
To isolate errors like this, I'd suggest we pin a release tag of the plugin.
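A minimal sketch of what pinning could look like; the tag name and repository URL below are hypothetical placeholders, and the command is built as a string rather than executed:

```shell
# Clone the plugin at a fixed tag instead of the branch head, so a broken
# head commit in vllm-FL cannot break FlagScale's CI.
VLLM_FL_TAG="20250129"                              # hypothetical tag name
vllm_url="https://github.com/example/vllm-FL.git"   # placeholder URL
# git clone --branch accepts tags as well as branches; --depth 1 keeps it small.
clone_cmd="git clone --branch ${VLLM_FL_TAG} --depth 1 ${vllm_url} vllm-FL"
echo "${clone_cmd}"
```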
@tengqm We will do it later when the release version of vllm-fl is ready.
It doesn't have to be a formal release. It can be as simple as a tag, such as 20250129 or something.
Later on, when things do not work as expected, we will know where to start.
log_info "Installing vllm-FL"

# Clone repository
retry_git_clone "$vllm_url" "$vllm_dir" "$RETRY_COUNT"
Well ... again, why do we replicate the same script?
For now, serve and inference run in the same environment. The script is duplicated to keep serve and inference distinct; the common source code will be installed and stored in a unified way in the future.
My guess is that serve is a superset of inference.
There are no conflicts between these two use cases besides this.
Am I understanding this correctly?
# support 0.5b_multiple_instance ci test
ray==2.49.1
gymnasium
where is this referenced?
# support 0.5b_multiple_instance ci test
ray==2.49.1
gymnasium
dm-tree
where is this referenced?
run: |
  git config --global --add safe.directory $PROJECT_ROOT
- name: Check sccache installation and get path
why are we installing sccache?
It is mainly used for compiling vLLM.
I think I need some more context on this to understand it.
@tengqm vLLM will be compiled multiple times in different jobs repeatedly; introducing sccache is to reduce this compilation time.
@Darryl233 Thanks for the clarification. Is the 'compilation' a build-time thing or a run-time thing?
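For context, wiring sccache into a vLLM source build is a build-time concern and might look roughly like this; the cache path is an assumption, and the actual build command is shown as a comment rather than executed:

```shell
# Sketch: sccache caches compiled C++/CUDA objects, so repeated vLLM source
# builds across CI jobs can reuse earlier compilation results. This only
# affects build time; it has no effect at run time.
export CMAKE_C_COMPILER_LAUNCHER=sccache
export CMAKE_CXX_COMPILER_LAUNCHER=sccache
export SCCACHE_DIR="${SCCACHE_DIR:-/cache/sccache}"   # assumed shared volume
# The build itself would then run as usual, e.g.:
#   pip install --no-build-isolation -v .
# and `sccache --show-stats` reports cache hit rates afterwards.
echo "launcher=${CMAKE_CXX_COMPILER_LAUNCHER}"
```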
Darryl233 left a comment:
LGTM
PR Category: CICD
PR Types: Others
PR Description