doc: clarify that --quantize is not needed for pre-quantized models

Daniël de Kok 2024-09-19 14:12:49 +00:00
parent ce85efa968
commit ef7acd4452
3 changed files with 9 additions and 2 deletions


@@ -55,7 +55,9 @@ Options:
 ## QUANTIZE
 ```shell
   --quantize <QUANTIZE>
-          Whether you want the model to be quantized
+          Quantization method to use for the model. It is not necessary to specify this option for pre-quantized models, since the quantization method is read from the model configuration.
+
+          Marlin kernels will be used automatically for GPTQ/AWQ models.
       [env: QUANTIZE=]


@@ -149,6 +149,7 @@
     pyright
     pytest
     pytest-asyncio
+    redocly
     ruff
     syrupy
   ]);


@@ -367,7 +367,11 @@ struct Args {
     #[clap(long, env)]
     num_shard: Option<usize>,
-    /// Whether you want the model to be quantized.
+    /// Quantization method to use for the model. It is not necessary to specify this option
+    /// for pre-quantized models, since the quantization method is read from the model
+    /// configuration.
+    ///
+    /// Marlin kernels will be used automatically for GPTQ/AWQ models.
     #[clap(long, env, value_enum)]
     quantize: Option<Quantization>,
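
The behavior documented above can be sketched with a hypothetical launch invocation (the model IDs are illustrative, not part of this commit):

```shell
# Hypothetical sketch: for a pre-quantized model (e.g. GPTQ), no --quantize flag
# is needed; the launcher reads the quantization method from the model config,
# and Marlin kernels are selected automatically for GPTQ/AWQ models.
text-generation-launcher --model-id TheBloke/Llama-2-7B-GPTQ

# For a model that is not pre-quantized, --quantize still selects
# on-the-fly quantization explicitly:
text-generation-launcher --model-id meta-llama/Llama-2-7b-hf --quantize bitsandbytes
```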