This is used to quantize NN models with TensorFlow2, based on Vitis AI (documentation portal).
This workflow has only been tested on hippeis (hippeis.pa.msu.edu).
The Vitis AI quantizer accepts a floating-point model as input and performs pre-processing (folds batchnorms and removes nodes not required for inference). It then quantizes the weights/biases and activations to the given bit width.
- Inspect the float model
VitisInspector is a helper tool that inspects a float model, shows the partition results for a given DPU target architecture, and indicates why layers are not mapped to the DPU (see the inspection sketch after this list).
- Use the quantizer to quantize the model
Two approaches are available: post-training quantization and quantization-aware training.
Post-training quantization (PTQ): converts a pre-trained floating-point model into a quantized one with little degradation in model accuracy. A representative dataset is required to run a few batches of inference on the floating-point model (i.e. quantization calibration). A PTQ sketch is given after this list.
Quantization-aware training (QAT): models the quantization error in both the forward and backward passes during model quantization. A QAT sketch is given after this list.
- Fast finetuning (details to be added)
- Evaluate the quantized model
Evaluate it the same way as the float model (see the evaluation sketch after this list). (Q: would the input need to be quantized?)
- Dump simulation results
Compare the simulation results on CPU/GPU with the output values on the DPU.
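A minimal inspection sketch, assuming the Vitis AI TensorFlow2 tools are available in the `vitis-ai-tensorflow2` conda environment; the model path is hypothetical and the target string `DPUCZDX8G_ISA1_B4096` is only an example of a DPU architecture fingerprint to replace with your own:

```python
# Sketch: inspect a float Keras model for a given DPU target.
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_inspect

float_model = tf.keras.models.load_model('float_model.h5')  # hypothetical path

# Target string is an example; use the fingerprint of your DPU.
inspector = vitis_inspect.VitisInspector(target='DPUCZDX8G_ISA1_B4096')
inspector.inspect_model(float_model,
                        plot=True,
                        plot_file='model.svg',             # partition diagram
                        dump_results=True,
                        dump_results_file='inspect_results.txt',
                        verbose=0)
```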
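A minimal PTQ sketch following the Vitis AI TensorFlow2 quantizer API; the model path, `calib_dataset`, and the commented fast-finetuning keyword arguments are assumptions to adapt to your setup:

```python
# Post-training quantization (sketch): calibrate on a small
# representative dataset, then save the quantized model.
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

float_model = tf.keras.models.load_model('float_model.h5')  # hypothetical path
calib_dataset = ...  # a few hundred representative samples; labels not needed

quantizer = vitis_quantize.VitisQuantizer(float_model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset)

# Optional fast finetuning (assumed keyword names): adjusts weights
# layer by layer to reduce quantization error; slower than plain PTQ.
# quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset,
#                                            include_fast_ft=True,
#                                            fast_ft_epochs=10)

quantized_model.save('quantized_model.h5')
```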
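A QAT sketch under the same API assumptions; the `8bit_tqt` quantize strategy, optimizer, loss, and training settings are placeholders:

```python
# Quantization-aware training (sketch): wrap the float model with
# fake-quantization nodes, train, then export a deployable model.
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

float_model = tf.keras.models.load_model('float_model.h5')  # hypothetical path
train_dataset, calib_dataset = ..., ...                     # your data pipeline

quantizer = vitis_quantize.VitisQuantizer(float_model,
                                          quantize_strategy='8bit_tqt')
qat_model = quantizer.get_qat_model(init_quant=True,
                                    calib_dataset=calib_dataset)

qat_model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
qat_model.fit(train_dataset, epochs=5)

# Convert the trained QAT model into a deployable quantized model.
deploy_model = vitis_quantize.VitisQuantizer.get_deploy_model(qat_model)
deploy_model.save('qat_deploy_model.h5')
```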
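A sketch of evaluating the quantized model and dumping golden results for DPU comparison, assuming the quantized model was saved as in the PTQ sketch; `test_dataset` and `dump_dataset` are placeholders:

```python
# Evaluate the quantized model and dump simulation results (sketch).
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

test_dataset = ...  # placeholder: same float test data used for the float model
dump_dataset = ...  # placeholder: typically a single batch

# quantize_scope() registers the custom quantize layers so the saved
# quantized model can be deserialized.
with vitis_quantize.quantize_scope():
    quantized_model = tf.keras.models.load_model('quantized_model.h5')

quantized_model.compile(loss='sparse_categorical_crossentropy',
                        metrics=['accuracy'])
quantized_model.evaluate(test_dataset)

# Dump golden simulation results (per-layer weights/activations) to
# compare against the output values produced on the DPU.
vitis_quantize.VitisQuantizer.dump_model(model=quantized_model,
                                         dataset=dump_dataset,
                                         output_dir='./dump_results')
```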
- Clone the repository by doing the following:
```
git clone git@github.com:bdongmd/vitis-workflow.git
```
- Start the Vitis AI docker and set up the Python environment:
```
cd vitis-workflow
## change the path after -B to your own path
singularity exec -H `pwd` -B /home/bdong,/ssd/home/bdong/Xilinx docker://xilinx/vitis-ai-cpu:latest bash
## you can also replace the cpu docker with the gpu one: docker://xilinx/vitis-ai-gpu:latest
conda activate vitis-ai-tensorflow2
pip install prettytable --user
```
- Run quantization
```
python quantize.py -c config/maria_model.json
```
The code was developed based on examples from the Vitis-AI tutorials, in particular 02-MNIST_classification_tf and 08-tf2_flow.