Pull request overview
Updates model-specific AITK documentation and profiling dependencies to reflect newer quantization/runtime options, and adjusts repo copy/check metadata accordingly.
Changes:
- Refresh several model READMEs: rename AMD NPU workflow to “Quark Quantization”, add “int4 Quantization for QNN GPU”, and remove the AutoAWQ mention in the DML workflow.
- Add `onnxruntime-genai-winml==0.11.2` to the profiling requirements set.
- Remove a README copy step from `meta-llama-Llama-3.1-8B-Instruct`’s `_copy.json.config` and decrement `copyCheck` accordingly.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| microsoft-Phi-3.5-mini-instruct/aitk/README.md | Updates workflow list and prerequisites text. |
| meta-llama-Llama-3.2-1B-Instruct/aitk/README.md | Updates workflow list and prerequisites text. |
| deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B/aitk/README.md | Updates workflow list and prerequisites text. |
| Qwen-Qwen2.5-1.5B-Instruct/aitk/README.md | Updates workflow list and prerequisites text. |
| meta-llama-Llama-3.1-8B-Instruct/aitk/_copy.json.config | Stops copying README from another model template. |
| .aitk/requirements/requirements-Profiling.txt | Adds onnxruntime-genai-winml to profiling environment. |
| .aitk/configs/checks.json | Updates expected copy check count. |
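
The last two rows are coupled: removing a copy step means the expected count in `checks.json` has to drop by one. A minimal sketch of that kind of consistency check follows. The `copyCheck` key name comes from the overview above, but the `copies` list and the overall JSON layout are assumptions made purely for illustration, not the repository's actual schema.

```python
import json


def count_copy_steps(config_texts):
    """Sum copy steps across the contents of several _copy.json.config files.

    Assumption (hypothetical schema): each file is a JSON object with a
    top-level "copies" list, one entry per copy step.
    """
    return sum(len(json.loads(text).get("copies", [])) for text in config_texts)


def copy_check_matches(config_texts, checks_text):
    """Return True when the counted copy steps equal the expected copyCheck value."""
    expected = json.loads(checks_text)["copyCheck"]
    return count_copy_steps(config_texts) == expected


# Illustrative inputs only; real files would be read from disk.
configs = ['{"copies": [{"src": "a/README.md", "dst": "b/README.md"}]}', '{"copies": []}']
print(copy_check_matches(configs, '{"copyCheck": 1}'))
```

If a copy step is removed without decrementing the expected count, a check along these lines would report a mismatch.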
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU
The workflow list now mentions “Quark Quantization for AMD NPU” and “int4 Quantization for QNN GPU”, but this README doesn’t include any corresponding sections/usage guidance (and there’s no other mention of Quark/QNN GPU later). Either add links/sections that explain how to run these workflows (e.g., which *.json.config to execute), or remove the bullets to avoid advertising unsupported steps.
Suggested change (replace):
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU

with:
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
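
One way to catch this class of problem before review is to check whether each workflow named in the list is mentioned anywhere else in the README. The sketch below is a rough heuristic, not part of the repository's tooling: the function name and the choice of key terms are hypothetical, and a plain substring count will miss rewordings.

```python
def terms_mentioned_once(readme_text, terms):
    """Return the terms that occur at most once in the README (case-insensitive).

    A term that appears only in the workflow list, and nowhere else,
    probably has no section explaining how to run it.
    """
    lowered = readme_text.lower()
    return [term for term in terms if lowered.count(term.lower()) <= 1]


# Illustrative README excerpt; a real check would read the file from disk.
readme = (
    "- Quark Quantization for AMD NPU\n"
    "- PTQ + AOT for QNN NPU\n"
    "- Int4 Quantization for QNN GPU\n"
    "\n"
    "## PTQ + AOT for QNN NPU\n"
    "Run the matching config.\n"
)
print(terms_mentioned_once(readme, ["Quark", "QNN NPU", "QNN GPU"]))
```

On this excerpt only the QNN NPU workflow has a follow-up section, so the other two terms are the ones reported.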
+ This process uses AutoAWQ and ModelBuilder
+ This process uses ModelBuilder

**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**
Capitalize product/term names in this prerequisite sentence for readability/accuracy (Python, Visual Studio 2022, Build Tools, C++). Also consider using the official Visual Studio wording (“C++ development workload/tools”) rather than “modules”.
Suggested change (replace):
**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**

with:
**For some Python packages, users need to install Visual Studio 2022 or Visual Studio 2022 Build Tools with the C++ development workload (or C++ build tools) installed.**
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU
The workflow list now mentions “Quark Quantization for AMD NPU” and “int4 Quantization for QNN GPU”, but this README doesn’t include any corresponding sections/usage guidance (and there’s no other mention of Quark/QNN GPU later). Either add links/sections that explain how to run these workflows (e.g., which *.json.config to execute), or remove the bullets to avoid advertising unsupported steps.
Suggested change (replace):
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU

with:
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
+ This process uses AutoAWQ and ModelBuilder
+ This process uses ModelBuilder

**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**
Capitalize product/term names in this prerequisite sentence for readability/accuracy (Python, Visual Studio 2022, Build Tools, C++). Also consider using the official Visual Studio wording (“C++ development workload/tools”) rather than “modules”.
Suggested change (replace):
**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**

with:
**For some Python packages, users need to install Visual Studio 2022 or Visual Studio 2022 Build Tools with the C++ development tools workload.**
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU
The workflow list now mentions “Quark Quantization for AMD NPU” and “int4 Quantization for QNN GPU”, but this README doesn’t include any corresponding sections/usage guidance (and there’s no other mention of Quark/QNN GPU later). Either add links/sections that explain how to run these workflows (e.g., which *.json.config to execute), or remove the bullets to avoid advertising unsupported steps.
Suggested change (replace):
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU

with:
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
+ This process uses AutoAWQ and ModelBuilder
+ This process uses ModelBuilder

**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**
Capitalize product/term names in this prerequisite sentence for readability/accuracy (Python, Visual Studio 2022, Build Tools, C++). Also consider using the official Visual Studio wording (“C++ development workload/tools”) rather than “modules”.
Suggested change (replace):
**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**

with:
**For some Python packages, users need to install Visual Studio 2022 with the C++ development workload or Visual Studio 2022 Build Tools with the C++ build tools.**
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU
The workflow list now mentions “Quark Quantization for AMD NPU” and “int4 Quantization for QNN GPU”, but this README doesn’t include any corresponding sections/usage guidance (and there’s no other mention of Quark/QNN GPU later). Either add links/sections that explain how to run these workflows (e.g., which *.json.config to execute), or remove the bullets to avoid advertising unsupported steps.
Suggested change (replace):
- Quark Quantization for AMD NPU
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
- Int4 Quantization for QNN GPU

with:
- PTQ + AOT for QNN NPU
  + This process extends the QDQ flow and compiling specifically for **Qualcomm NPUs**
+ This process uses AutoAWQ and ModelBuilder
+ This process uses ModelBuilder

**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**
Capitalize product/term names in this prerequisite sentence for readability/accuracy (Python, Visual Studio 2022, Build Tools, C++). Also consider using the official Visual Studio wording (“C++ development workload/tools”) rather than “modules”.
Suggested change (replace):
**For some python packages, users need to install visual studio 2022 or visual studio 2022 build tools with c++ development tools modules.**

with:
**For some Python packages, users need to install Visual Studio 2022 with the C++ development workload, or Visual Studio 2022 Build Tools with the C++ build tools installed.**