From 3c9a840362645678f04b62bd2c16c67aedb29502 Mon Sep 17 00:00:00 2001
From: Vaibhav Srivastav
Date: Wed, 7 Aug 2024 15:55:10 +0200
Subject: [PATCH] update readme with latest quants info

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index a88e0437..cf7f1d22 100644
--- a/README.md
+++ b/README.md
@@ -48,6 +48,8 @@ Text Generation Inference (TGI) is a toolkit for deploying and serving Large Lan
   - [GPT-Q](https://arxiv.org/abs/2210.17323)
   - [EETQ](https://github.com/NetEase-FuXi/EETQ)
   - [AWQ](https://github.com/casper-hansen/AutoAWQ)
+  - [Marlin](https://github.com/IST-DASLab/marlin)
+  - [fp8]()
 - [Safetensors](https://github.com/huggingface/safetensors) weight loading
 - Watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226)
 - Logits warper (temperature scaling, top-p, top-k, repetition penalty, more details see [transformers.LogitsProcessor](https://huggingface.co/docs/transformers/internal/generation_utils#transformers.LogitsProcessor))