Benchmark: Model benchmark - deterministic training support #731
Open
Aishwarya-Tonpe wants to merge 96 commits into main from aishwaryatonpe/deterministic-training
Conversation
- Add _enable_deterministic_training() method to set all necessary seeds
- Add --deterministic and --random_seed command line arguments
- Integrate deterministic training in _create_model() and _generate_dataset()
- Add comprehensive unit tests for deterministic functionality
- Tests validate parameter parsing, functionality, and regression scenarios
- All tests pass and integrate with existing SuperBench test suite
…pass check_frequency to _is_finished in train/infer; add test capturing checksum log; stabilize fp32 loss path and small-dims determinism tests
…oss BERT/GPT2/CNN/LSTM/Mixtral; per-step fp32 loss logging; checksum logs; tests updated to strict/soft determinism pattern; add strict determinism CI guidance
…rings; fix GPT-2 params; soft vs strict checks stabilized
…sum tests with BERT pattern, improve docstrings and skip logic.
…BERT, GPT-2, LSTM, CNN, LLaMA examples
… models; update tests
…/CNN/BERT/Mixtral with periodic fingerprints, per-step loss capture, TF32 off, SDPA math kernel; add model_log_utils; update examples and tests, add env gating for cuBLAS.
…ted example file, remove redundant code
… unnecessary code
…idual model classes
… reduce redundant code
Author
@microsoft-github-policy-service agree company="Microsoft"
…to aishwaryatonpe/deterministic-training
…nd inference steps
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…etadata overriding
…es not need to be set explicitly before running the benchmarks
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…b.com/microsoft/superbenchmark into aishwaryatonpe/deterministic-training
…tcases pass on local
Adds support for deterministic training and reproducible logging to all PyTorch model benchmarks in SuperBench (BERT, GPT2, LLaMA, LSTM, CNN, Mixtral).
Deterministic mode: ensures runs are reproducible by fixing random seeds, disabling TF32, and forcing deterministic math kernels (a sketch of these settings follows this list).
Log generation: saves key metrics such as per-step loss and activation statistics during training.
Log comparison: compares a new run against a previously generated log to check that the results match.
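A minimal sketch of what deterministic mode involves, assuming standard PyTorch settings; the helper name and default seed below are placeholders, not the exact code in pytorch_base.py:

```python
import os
import random

import numpy as np
import torch


def enable_deterministic_mode(seed=42):
    """Minimal sketch of the settings deterministic mode relies on."""
    # Fix all random seeds (Python, NumPy, PyTorch CPU and GPU).
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Turn off TF32 so matmul/conv results are bit-stable across runs.
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

    # Prefer deterministic kernels and fail loudly on nondeterministic ops.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    torch.use_deterministic_algorithms(True)

    # cuBLAS needs this workspace setting for deterministic GEMM on CUDA 10.2+;
    # this corresponds to the "env gating for cuBLAS" mentioned in the commits.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
```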
New command-line options:
--enable-determinism
--generate-log: boolean flag; when enabled, stores the metrics (loss and activation mean) in the results file
--compare-log: path to the JSON log from a previous run against which the current run's results are compared
--check-frequency
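For illustration, the new options could be wired into an argument parser roughly like this (a sketch only; the help strings, defaults, and the --check-frequency semantics here are assumptions, not the PR's actual definitions):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--enable-determinism', action='store_true',
                    help='Fix seeds, disable TF32, and use deterministic kernels.')
parser.add_argument('--generate-log', action='store_true',
                    help='Store per-step loss and activation mean in the results file.')
parser.add_argument('--compare-log', type=str, default=None,
                    help='Path to a reference JSON log to compare the current run against.')
parser.add_argument('--check-frequency', type=int, default=None,
                    help='Interval (in steps) at which metrics are recorded and checked.')
args = parser.parse_args()
```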
Changes -
Updated pytorch_base.py to handle deterministic settings, logging, and comparisons.
Added a new example script: pytorch_deterministic_example.py
Added a test file: test_pytorch_determinism_all.py to verify everything works as expected.
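A simplified sketch of what log comparison amounts to; the JSON layout, key names, and tolerance are assumptions for illustration, not the format actually written by pytorch_base.py:

```python
import json


def compare_logs(reference_path, current_metrics, tolerance=1e-6):
    """Compare the current run's per-step metrics against a reference JSON log."""
    with open(reference_path) as f:
        reference = json.load(f)

    mismatches = []
    for step, ref_loss in enumerate(reference.get('loss', [])):
        cur_loss = current_metrics['loss'][step]
        if abs(cur_loss - ref_loss) > tolerance:
            mismatches.append((step, ref_loss, cur_loss))

    # An empty list means the runs match within tolerance.
    return mismatches
```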
Usage -
Run with --enable-determinism --generate-log to create a reference log.
Run again with --compare-log to check if the new run matches the reference.
All benchmark parameters must stay the same between runs for the comparison to be meaningful.
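Illustrative two-run workflow (the script path and log filename are assumptions; substitute the actual benchmark invocation and the log path produced by the first run):

```bash
# Run 1: deterministic mode, write the reference log alongside the results.
python examples/benchmarks/pytorch_deterministic_example.py \
    --enable-determinism --generate-log

# Run 2: same parameters, compare against the reference produced by run 1.
python examples/benchmarks/pytorch_deterministic_example.py \
    --enable-determinism --compare-log /path/to/reference_results.json
```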