mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-04-20 22:32:07 +00:00
* Update Quantization docs and minor doc fix. * update readme with latest quants info * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * up --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> |
||
---|---|---|
.. | ||
flash_attention.md | ||
guidance.md | ||
lora.md | ||
paged_attention.md | ||
quantization.md | ||
safetensors.md | ||
speculation.md | ||
streaming.md | ||
tensor_parallelism.md |