luow-amd committed
Commit aa94c39
1 Parent(s): 1056c77
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -34,7 +34,7 @@ python3 quantize_quark.py \
    --multi_gpu
  ```
  ## Deployment
- Quark has its own export format and allows FP8 quantized models to be efficiently deployed using the vLLM backend(vllm-compatible).
+ Quark has its own export format and allows FP8 quantized models to be efficiently deployed using the vLLM backend(vLLM-compatible).
 
  ## Evaluation
  Quark currently uses perplexity(PPL) as the evaluation metric for accuracy loss before and after quantization.The specific PPL algorithm can be referenced in the quantize_quark.py.
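For context on the deployment line this commit touches: a minimal sketch of serving a Quark-exported FP8 checkpoint through vLLM might look like the following. This is illustrative and not part of the commit; `./quark_fp8_export` is a placeholder path, and `quantization="fp8"` assumes a vLLM build with FP8 support.

```python
# Illustrative only: serving an FP8-quantized checkpoint with vLLM.
# "./quark_fp8_export" is a placeholder, not a path from this repo.
from vllm import LLM, SamplingParams

llm = LLM(model="./quark_fp8_export", quantization="fp8")
params = SamplingParams(temperature=0.0, max_tokens=64)

# Generate a completion from the quantized model.
outputs = llm.generate(["What does FP8 quantization trade off?"], params)
print(outputs[0].outputs[0].text)
```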
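Likewise, the Evaluation section defers to quantize_quark.py for the exact PPL algorithm. As a hedged illustration of the standard approach, perplexity is the exponential of the mean next-token cross-entropy loss; a sketch with Hugging Face transformers (the model id is a placeholder):

```python
# Illustrative only: textbook perplexity, PPL = exp(mean cross-entropy).
# The repo's actual algorithm lives in quantize_quark.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

enc = tokenizer("Perplexity measures next-token predictive quality.",
                return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # next-token cross-entropy loss over the sequence.
    loss = model(**enc, labels=enc["input_ids"]).loss
print(f"PPL = {torch.exp(loss).item():.2f}")
```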