| File | Last commit | Date |
| --- | --- | --- |
| gptq | Fixing non 4bits quantization. (#785) | 2023-08-07 13:02:00 +02:00 |
| convert.py | fix(server): blacklist local files (#609) | 2023-07-13 21:54:55 +02:00 |
| dist.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
| flash_attn.py | feat(server): flash attention v2 (#624) | 2023-07-18 16:21:18 +02:00 |
| layers.py | Disabling exllama on old compute. (#986) | 2023-09-06 15:01:00 +02:00 |
| tokens.py | Fixing top_k tokens when k ends up < 0 (#966) | 2023-09-01 00:22:03 +02:00 |
| watermark.py | Fixing watermark. (#851) | 2023-08-16 07:17:26 +02:00 |
| weights.py | add transformers gptq support (#963) | 2023-09-07 10:19:42 +02:00 |