text-generation-inference/server/text_generation_server/quant
2023-06-06 11:56:10 +00:00
__init__.py         [WIP] Adding GPTQ support for llama        2023-05-11 12:05:35 +00:00
custom_autotune.py  Reducing number of reps while autotuning.  2023-06-06 11:56:10 +00:00
fused_attn.py       [WIP] Adding GPTQ support for llama        2023-05-11 12:05:35 +00:00
fused_mlp.py        [WIP] Adding GPTQ support for llama        2023-05-11 12:05:35 +00:00
quant_linear.py     Working version.                           2023-05-11 12:05:35 +00:00
quantizer.py        [WIP] Adding GPTQ support for llama        2023-05-11 12:05:35 +00:00