This project implements the ONNX Runtime MatMulNBits operator using WebGPU. The WebGPU shaders are taken directly from the ONNX Runtime project (v1.22.0).
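For orientation, MatMulNBits multiplies an activation matrix A by a block-quantized weight matrix B. The following is a minimal pure-Python reference sketch for the 4-bit case. It is not code from this repository or from ONNX Runtime. It assumes the layout given in the ONNX Runtime contrib-operator documentation: B is packed as [N][n_blocks_per_col][block_size / 2] bytes, there is one scale per block, and the default zero point is 8. The nibble order (even elements in the low nibble) is also an assumption, and the function names are hypothetical.

```python
def dequantize_4bit(packed, scales, n, k, block_size, zero_point=8):
    """Expand packed 4-bit weights into an n x k list of floats.

    packed: flat list of bytes, layout [n][blocks_per_col][block_size // 2]
    scales: flat list of floats, layout [n][blocks_per_col]
    """
    blocks_per_col = (k + block_size - 1) // block_size
    blob_size = block_size // 2  # two 4-bit values per byte
    weights = []
    for row in range(n):
        out_row = []
        for col in range(k):
            block, offset = divmod(col, block_size)
            byte = packed[(row * blocks_per_col + block) * blob_size + offset // 2]
            # Assumed packing: even elements in the low nibble, odd in the high nibble.
            q = (byte >> 4) if offset % 2 else (byte & 0x0F)
            scale = scales[row * blocks_per_col + block]
            out_row.append((q - zero_point) * scale)
        weights.append(out_row)
    return weights


def matmul_nbits(a, packed, scales, n, k, block_size):
    """Compute Y[m][n] = sum over k of A[m][k] * W[n][k], after dequantizing W."""
    w = dequantize_4bit(packed, scales, n, k, block_size)
    return [[sum(x * wk for x, wk in zip(a_row, w_row)) for w_row in w] for a_row in a]
```

A reference like this is far too slow for real workloads, but it can serve as a readable baseline when reasoning about what the shaders are expected to compute.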
Before using or building this project, ensure you have the following prerequisites installed on your system:
Hardware: Minimum: Intel Alder Lake (ADL, 12th Gen) CPU. Recommended: Intel Lunar Lake (LNL, 15th Gen) CPU.
Software: Windows 10 or 11, CMake 3.16 or higher, Python 3.x, Visual Studio.
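Some of the software prerequisites can be checked from a script before building. The sketch below is a hypothetical helper, not part of this repository. It only checks that `git` and `cmake` are on the PATH and that CMake is at least 3.16, and it assumes `cmake --version` prints a line of the form `cmake version X.Y.Z`. It does not detect Visual Studio.

```python
import re
import shutil
import subprocess


def parse_cmake_version(text):
    """Extract (major, minor) from `cmake --version` output, or None if absent."""
    match = re.search(r"cmake version (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def check_prerequisites(min_cmake=(3, 16)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for tool in ("git", "cmake"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if shutil.which("cmake"):
        out = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True
        ).stdout
        version = parse_cmake_version(out)
        if version is None or version < min_cmake:
            problems.append(
                f"CMake {min_cmake[0]}.{min_cmake[1]} or higher required, found {version}"
            )
    return problems
```

Calling `check_prerequisites()` and printing each returned string gives a quick summary of what is missing.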
- Clone the Repository:
    git clone -b matmulnbits-dev https://github.com/wenqinI/wgpu_compute_playground.git
- Initialize/Update Submodules:
    git submodule update --init
- Build the Project:
    cmake -S . -B build
    cmake --build build --config Release -j8
- Run Tests:
    build\wgpu\Release\matmulnbits.exe > result.txt
    python3 diff.py result-ref.txt result.txt output Elements
- Run Benchmarking (from a batch file; at an interactive cmd prompt, use %x instead of %%x):
    for %%x in (1 128 1024 2048 4096) do (
        build\wgpu\Release\matmulnbits.exe -m %%x
    )
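The same sweep can also be driven from Python, which may be convenient for capturing the output of each run. This is a sketch rather than a script shipped with the project. The executable path and the `-m` values are taken from the command above, the function names are hypothetical, and the meaning of `-m` is not assumed: the values are simply passed through.

```python
import subprocess
from pathlib import PureWindowsPath

EXE = PureWindowsPath(r"build\wgpu\Release\matmulnbits.exe")
M_VALUES = (1, 128, 1024, 2048, 4096)


def benchmark_commands(exe=EXE, m_values=M_VALUES):
    """Build one argument list per -m value, mirroring the batch loop."""
    return [[str(exe), "-m", str(m)] for m in m_values]


def run_sweep(commands):
    """Run each command in turn and return {m value: captured stdout}."""
    results = {}
    for cmd in commands:
        completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
        results[int(cmd[-1])] = completed.stdout
    return results
```

Running `run_sweep(benchmark_commands())` from the repository root after a Release build would execute the five configurations in order. `check=True` stops the sweep at the first run that exits with a non-zero status.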
Contributions are welcome! If you have any ideas for improvements, new features, or bug fixes, feel free to open an issue or submit a pull request.
This project is licensed under the BSD 3-Clause "New" or "Revised" License. See the LICENSE file for more information.