Update quantization.md

2025-09-10 20:04:52 +00:00 · 2023-08-24 13:57:00 +03:00 · 2023-08-24 13:57:00 +03:00 · 8f251c7c3a
commit 8f251c7c3a
parent 2363e9a482
1 changed file with 2 additions and 1 deletion
--- a/docs/source/conceptual/quantization.md
+++ b/docs/source/conceptual/quantization.md
@@ -8,7 +8,8 @@ GPTQ is a post-training quantization method to make the model smaller. It quanti

 Given a layer \(l\) with weight matrix \(W_{l}\) and layer input \(X_{l}\), find quantized weight \(\hat{W}_{l}\):

-$$\text{\hat{W}_{l}}^{*} = argmin_{\hat{W_{l}}} \|W_{l}X-\hat{W}_{l}X\|^{2}_{2}\) \right\}$$
+$$\hat{W}_{l}^{*} = \operatorname{argmin}_{\hat{W}_{l}} \|W_{l}X_{l} - \hat{W}_{l}X_{l}\|_{2}^{2}$$
+

 TGI lets you either run an already GPTQ-quantized model (see available models [here](https://huggingface.co/models?search=gptq)) or quantize a model of your choice with the quantization script by passing --quantize, as shown below 👇