mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-09-10 11:54:52 +00:00)
add documentation for 4bit quantization options
This commit is contained in:
parent 4e0d8b2efb
commit c52a5d4456
README.md
@@ -239,6 +239,8 @@ You can also quantize the weights with bitsandbytes to reduce the VRAM requirement
 make run-bloom-quantize # Requires 8xA100 40GB
 ```
 
+4bit quantization is available using the [NF4 and FP4 data types from bitsandbytes](https://arxiv.org/pdf/2305.14314.pdf). It can be enabled by providing `--quantize bitsandbytes-nf4` or `--quantize bitsandbytes-fp4` as a command line argument to `text-generation-launcher`.
+
 ## Develop
 
 ```shell
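For illustration, the flags documented above would be passed to `text-generation-launcher` like this. The model id below is a placeholder chosen for this sketch, not something named in the commit:

```shell
# Illustrative invocation; substitute any model the launcher supports.
# NF4 4bit quantization:
text-generation-launcher \
    --model-id bigscience/bloom-560m \
    --quantize bitsandbytes-nf4

# FP4 4bit quantization:
text-generation-launcher \
    --model-id bigscience/bloom-560m \
    --quantize bitsandbytes-fp4
```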
|
launcher/src/main.rs
@@ -104,7 +104,8 @@ struct Args {
     num_shard: Option<usize>,
 
     /// Whether you want the model to be quantized. This will use `bitsandbytes` for
-    /// quantization on the fly, or `gptq`.
+    /// quantization on the fly, or `gptq`. 4bit quantization is available through
+    /// `bitsandbytes` by providing the `bitsandbytes-fp4` or `bitsandbytes-nf4` options.
     #[clap(long, env, value_enum)]
     quantize: Option<Quantization>,
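For readers unfamiliar with how clap wires this up, here is a minimal, self-contained sketch of how a `value_enum` field like `quantize` maps kebab-case CLI values to enum variants. The `Quantization` definition below is an assumption for illustration (its variant set mirrors the options named in this commit, not the launcher's actual source), and it assumes clap with the `derive` and `env` features enabled:

```rust
// Minimal sketch, not the launcher's actual code. clap's ValueEnum derive
// turns each variant name into a kebab-case CLI value, so `BitsandbytesNf4`
// is selected with `--quantize bitsandbytes-nf4` (or via the QUANTIZE env
// var, since `env` is set on the field).
use clap::{Parser, ValueEnum};

#[derive(Clone, Debug, ValueEnum)]
enum Quantization {
    Bitsandbytes,
    BitsandbytesNf4,
    BitsandbytesFp4,
    Gptq,
}

#[derive(Parser, Debug)]
struct Args {
    /// Whether you want the model to be quantized.
    #[clap(long, env, value_enum)]
    quantize: Option<Quantization>,
}

fn main() {
    let args = Args::parse();
    // Prints e.g. `Some(BitsandbytesNf4)` for `--quantize bitsandbytes-nf4`,
    // or `None` when the flag and env var are absent.
    println!("{:?}", args.quantize);
}
```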