awq/quantize | feat: format code (#1070) | 2023-09-27 12:22:09 +02:00
gptq | Exllama v2 (#1211) | 2023-11-25 22:38:38 +01:00
__init__.py | Make tokenizer optional (#12) | 2024-01-19 15:12:04 +01:00
convert.py | fit for baichuan models (#981) | 2023-09-08 16:51:34 +02:00
debug.py | Log exceptions to debug.log (#52) (#84) | 2024-02-29 09:14:42 +01:00
dist.py | Add changes from Optimum Habana's TGI folder | 2023-12-05 11:12:16 +01:00
flash_attn.py | Add RoCm support (#1243) | 2023-11-27 14:08:12 +01:00
import_utils.py | Add RoCm support (#1243) | 2023-11-27 14:08:12 +01:00
layers.py | Add RoCm support (#1243) | 2023-11-27 14:08:12 +01:00
paged_attention.py | feat: paged attention v2 (#1183) | 2023-10-23 12:29:25 +02:00
tokens.py | Remove redundant fill op (#83) (#90) | 2024-03-01 01:32:02 +01:00
weights.py | Exllama v2 (#1211) | 2023-11-25 22:38:38 +01:00