Desperate attempt to fix latex

This commit is contained in:
Merve Noyan 2023-08-24 12:20:40 +03:00 committed by GitHub
parent b581eb7151
commit 2363e9a482
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -8,7 +8,7 @@ GPTQ is a post-training quantization method to make the model smaller. It quanti
Given a layer \(l\) with weight matrix \(W_{l}\) and layer input \(X_{l}\), find quantized weight \(\hat{W}_{l}\): Given a layer \(l\) with weight matrix \(W_{l}\) and layer input \(X_{l}\), find quantized weight \(\hat{W}_{l}\):
\({\hat{W}{l}}^{*} = argmin{\hat{W_{l}}} |W_{l}X-\hat{W}{l}X|^{2}{2}\) $$\text{\hat{W}{l}}^{*} = argmin{\hat{W_{l}}} |W_{l}X-\hat{W}{l}X|^{2}{2}\) \right\}$$
TGI allows you to both run an already GPTQ quantized model (see available models [here](https://huggingface.co/models?search=gptq)) or quantize a model of your choice using quantization script by simply passing --quantize like below 👇 TGI allows you to both run an already GPTQ quantized model (see available models [here](https://huggingface.co/models?search=gptq)) or quantize a model of your choice using quantization script by simply passing --quantize like below 👇