| File | Last commit | Date |
| --- | --- | --- |
| gptq | Fixing non 4bits quantization. (#785) | 2023-08-07 13:02:00 +02:00 |
| convert.py | fix(server): blacklist local files (#609) | 2023-07-13 21:54:55 +02:00 |
| dist.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
| flash_attn.py | feat(server): flash attention v2 (#624) | 2023-07-18 16:21:18 +02:00 |
| layers.py | Disabling exllama on old compute. (#986) | 2023-09-06 15:01:00 +02:00 |
| tokens.py | Fixing top_k tokens when k ends up < 0 (#966) | 2023-09-01 00:22:03 +02:00 |
| watermark.py | Fixing watermark. (#851) | 2023-08-16 07:17:26 +02:00 |
| weights.py | add transformers gptq support (#963) | 2023-09-07 10:19:42 +02:00 |