Add Qwen-3 0.6B CPU recipe by kunal-vaishnavi · Pull Request #233 · microsoft/olive-recipes

kunal-vaishnavi · 2026-02-09T20:26:50Z

Description

This PR adds a recipe for Qwen-3 0.6B on the CPU EP.

Motivation and Context

Among the quantized Qwen-3 0.6B variants, the one that uses KLD gradient with a ratio of 0.65 shows the best tradeoff between MMLU score and model size. The recipes in the broader PR are for other variants.

Results

Sample size = 1000

| Model | Hardware | Quantization Algo | Model Size | MMLU Score | Δ MMLU vs. FP32 |
|---|---|---|---|---|---|
| Qwen-3 0.6B | CPU | None (FP32) | 2.83 GB | 46.2 | N/A |
| Qwen-3 0.6B | CPU | k_quant_down | 984 MB | 42.7 | -3.5 (-7.6%) |
| Qwen-3 0.6B | CPU | kld_gradient (0.75) | 569 MB | 41.5 | -4.7 (-10.2%) |
| Qwen-3 0.6B | CPU | kld_gradient (0.65) | 598 MB | 43.3 | -2.9 (-6.3%) |
| Qwen-3 0.6B | CPU | dq | 476 MB | 42.4 | -3.8 (-8.2%) |
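
The delta column and the tradeoff claim can be re-derived from the table itself. The following is a minimal sketch, not part of the PR: the helper names `deltas` and `pareto_front` are hypothetical, and converting the FP32 size with 1 GB = 1000 MB is an assumption. It recomputes the absolute and relative MMLU drop against FP32 and lists the rows that no other row beats on both size and score.

```python
# Rows copied from the Results table: (quantization algo, size in MB, MMLU score).
# Assumption: the FP32 size of 2.83 GB is converted using 1 GB = 1000 MB.
ROWS = [
    ("None (FP32)", 2830.0, 46.2),
    ("k_quant_down", 984.0, 42.7),
    ("kld_gradient (0.75)", 569.0, 41.5),
    ("kld_gradient (0.65)", 598.0, 43.3),
    ("dq", 476.0, 42.4),
]


def deltas(rows):
    """Return {algo: (absolute MMLU delta, relative delta in %)} vs. the first row."""
    baseline = rows[0][2]
    result = {}
    for name, _size, score in rows[1:]:
        diff = score - baseline
        result[name] = (round(diff, 1), round(100 * diff / baseline, 1))
    return result


def pareto_front(rows):
    """Return the algos that no other row dominates (smaller-or-equal size and
    higher-or-equal score, with at least one of the two strictly better)."""
    front = []
    for name, size, score in rows:
        dominated = any(
            other_size <= size
            and other_score >= score
            and (other_size < size or other_score > score)
            for _other, other_size, other_score in rows
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    for algo, (points, percent) in deltas(ROWS).items():
        print(f"{algo}: {points} ({percent}%)")
    print("Non-dominated rows:", pareto_front(ROWS))
```

Under these assumptions, `kld_gradient (0.65)` is one of the non-dominated rows, alongside the FP32 baseline and `dq`; `k_quant_down` and `kld_gradient (0.75)` are each both larger and lower-scoring than some other quantized row.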

Copilot

Pull request overview

Adds a new model folder for Qwen/Qwen3-0.6B with Olive recipes targeting CPU, CUDA, and WebGPU execution providers, intended to complement the quantized Qwen-3 recipes added elsewhere.

Changes:

- Added CPU (FP32) ModelBuilder recipe + metadata/README.
- Added CUDA (FP16) ModelBuilder recipe + metadata/README.
- Added WebGPU (FP16) ModelBuilder recipe + metadata/README, plus a model-local LICENSE file.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file

| File | Description |
|---|---|
| Qwen-Qwen3-0.6B/LICENSE | Adds Apache-2.0 license text for the model folder. |
| Qwen-Qwen3-0.6B/cpu/info.yaml | Registers the CPU FP32 recipe for discovery. |
| Qwen-Qwen3-0.6B/cpu/README.md | Documents how to run the CPU recipe. |
| Qwen-Qwen3-0.6B/cpu/Qwen-Qwen3-0.6B_cpu_fp32.json | Olive config to export/build FP32 on CPU EP. |
| Qwen-Qwen3-0.6B/cuda/info.yaml | Registers the CUDA FP16 recipe for discovery. |
| Qwen-Qwen3-0.6B/cuda/README.md | Documents how to run the CUDA recipe. |
| Qwen-Qwen3-0.6B/cuda/Qwen-Qwen3-0.6B_cuda_fp16.json | Olive config to export/build FP16 on CUDA EP. |
| Qwen-Qwen3-0.6B/webgpu/info.yaml | Registers the WebGPU FP16 recipe for discovery. |
| Qwen-Qwen3-0.6B/webgpu/README.md | Documents how to run the WebGPU recipe. |
| Qwen-Qwen3-0.6B/webgpu/Qwen-Qwen3-0.6B_webgpu_fp16.json | Olive config to export/build FP16 on WebGPU EP. |
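
For readers unfamiliar with what an "Olive config to export/build FP32 on CPU EP" might contain, here is a rough sketch. It is not the file from this PR, whose contents are not shown on this page. The layout (an `HfModel` input, a `LocalSystem` with a CPU accelerator, and a single `ModelBuilder` pass with `precision` set to `fp32`) follows the general shape of Olive workflow configs as an assumption, and the function name `build_cpu_fp32_config` and the output directory are hypothetical.

```python
import json


def build_cpu_fp32_config(
    model_id="Qwen/Qwen3-0.6B",
    output_dir="models/qwen3-0.6b-cpu-fp32",  # hypothetical output location
):
    """Build an illustrative Olive-style workflow config as a plain dict.

    Assumption: key names mirror commonly published Olive examples; the real
    recipe in the PR may use different passes or options.
    """
    return {
        "input_model": {"type": "HfModel", "model_path": model_id},
        "systems": {
            "local_system": {
                "type": "LocalSystem",
                "accelerators": [
                    {
                        "device": "cpu",
                        "execution_providers": ["CPUExecutionProvider"],
                    }
                ],
            }
        },
        # Single pass: export the model through ModelBuilder at FP32 precision.
        "passes": {"builder": {"type": "ModelBuilder", "precision": "fp32"}},
        "target": "local_system",
        "output_dir": output_dir,
    }


if __name__ == "__main__":
    # Print the config as JSON so it can be compared against a real recipe file.
    print(json.dumps(build_cpu_fp32_config(), indent=4))
```

A config of this shape would typically be saved to a `.json` file and passed to `olive run --config <file>`; the recipe's own README remains the authoritative source for the exact command and options.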


Qwen-Qwen3-0.6B/cpu/README.md

Qwen-Qwen3-0.6B/cuda/Qwen-Qwen3-0.6B_cuda_fp16.json

Qwen-Qwen3-0.6B/webgpu/Qwen-Qwen3-0.6B_webgpu_fp16.json

Qwen-Qwen3-0.6B/cuda/README.md

Add recipes for Qwen-3 0.6B only

9905865

Copilot AI review requested due to automatic review settings February 9, 2026 20:26

Copilot started reviewing on behalf of kunal-vaishnavi February 9, 2026 20:27

Copilot AI reviewed Feb 9, 2026


Only keep Qwen-3 0.6B CPU recipe

ceb5c1e

kunal-vaishnavi changed the title ~~Add Qwen-3 0.6B recipes~~ Add Qwen-3 0.6B CPU recipe Feb 10, 2026

Show installing Olive from main branch in README

559afc4

jambayk approved these changes Feb 10, 2026


jambayk merged commit 11dc8c5 into main Feb 10, 2026
7 checks passed

jambayk deleted the kvaishnavi/qwen3-0.6b branch February 10, 2026 23:17
