| Name | Last commit message | Last commit date |
|---|---|---|
| attention | ROCm and sliding windows fixes (#2033) | 2024-09-24 03:42:29 +00:00 |
| awq | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| gptq | Fix text-generation-server quantize (#2103) | 2024-09-24 03:46:09 +00:00 |
| __init__.py | MLPSpeculator. (#1865) | 2024-07-17 05:36:58 +00:00 |
| conv.py | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| eetq.py | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| exl2.py | Add support for exl2 quantization | 2024-09-24 03:19:39 +00:00 |
| fp8.py | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00 |
| layernorm.py | MI300 compatibility (#1764) | 2024-07-17 05:36:58 +00:00 |
| linear.py | Add support for GPTQ Marlin (#2052) | 2024-09-24 03:43:30 +00:00 |
| marlin.py | Add support for GPTQ Marlin (#2052) | 2024-09-24 03:43:30 +00:00 |
| mlp.py | MLPSpeculator. (#1865) | 2024-07-17 05:36:58 +00:00 |
| rotary.py | fix(layers): fix SuRotaryEmbedding (#2060) | 2024-09-24 03:42:29 +00:00 |
| speculative.py | MLPSpeculator. (#1865) | 2024-07-17 05:36:58 +00:00 |
| tensor_parallel.py | Add Phi-3 medium support (#2039) | 2024-09-24 03:42:29 +00:00 |