text-generation-inference/server/text_generation_server/utils/gptq
2023-09-27 12:22:09 +02:00
custom_autotune.py feat(server): Add inference support for GPTQ (llama + falcon tested) + Quantization script (#438) 2023-06-26 12:27:01 +02:00
exllama.py Fix __call__ vs forward. (#993) 2023-09-07 17:36:30 +02:00
quant_linear.py Fixing non 4bits quantization. (#785) 2023-08-07 13:02:00 +02:00
quantize.py feat: format code (#1070) 2023-09-27 12:22:09 +02:00