Fix inline latex

parent 5f4dcd5a4b
commit 12d9a67752
@@ -6,7 +6,7 @@ TGI offers GPTQ and bits-and-bytes quantization to quantize large language model

GPTQ is a post-training quantization method to make the model smaller. It quantizes each weight by finding a compressed version of that weight that minimizes the mean squared error shown below 👇

-Given a layer \(l\) with weight matrix \(W_{l}\) and layer input \(X_{l}\), find quantized weight \(\hat{W}_{l}\):
+Given a layer \\(l\\) with weight matrix \\(W_{l}\\) and layer input \\(X_{l}\\), find quantized weight \\(\\hat{W}_{l}\\):

$$\hat{W}_{l}^{*} = \operatorname{argmin}_{\hat{W}_{l}} \lVert W_{l}X_{l} - \hat{W}_{l}X_{l} \rVert^{2}_{2}$$
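For intuition, here is a minimal NumPy sketch of the quantity this objective measures. It uses naive round-to-nearest 4-bit quantization rather than the actual GPTQ solver (which compensates for each rounding error by updating the not-yet-quantized weights); the shapes and variable names are illustrative, not from TGI.

```python
import numpy as np

rng = np.random.default_rng(0)
W_l = rng.standard_normal((4, 8))   # weight matrix of layer l
X_l = rng.standard_normal((8, 16))  # layer input (e.g. a calibration batch)

# Naive round-to-nearest quantization onto a 4-bit grid spanning the
# weight range. GPTQ minimizes the same layer-wise objective, but with
# a much smarter choice of W_hat than plain rounding.
n_levels = 2 ** 4
w_min, w_max = W_l.min(), W_l.max()
scale = (w_max - w_min) / (n_levels - 1)
W_hat = np.round((W_l - w_min) / scale) * scale + w_min

# The layer-wise reconstruction error || W_l X_l - W_hat X_l ||_2^2
err = np.linalg.norm(W_l @ X_l - W_hat @ X_l) ** 2
print(f"reconstruction error: {err:.4f}")
```

Because the error is measured on the layer's outputs rather than on the weights directly, weights that matter more for the activations are preserved more accurately.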