attention | Upgrade to new vllm extension ops for Gaudi backend (fix issue in exponential bucketing) (#3239) | 2025-05-22 15:29:16 +02:00
awq | Gaudi: clean cuda/rocm code in hpu backend, enable flat_hpu (#3113) | 2025-04-14 15:58:13 +02:00
gptq | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
moe | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
__init__.py | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
bnb.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
conv.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
exl2.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
fp8.py | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
layernorm.py | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
linear.py | Gaudi: clean cuda/rocm code in hpu backend, enable flat_hpu (#3113) | 2025-04-14 15:58:13 +02:00
lora.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
medusa.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
mlp.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
rotary.py | Deepseek R1 for Gaudi backend (#3211) | 2025-05-19 16:36:39 +02:00
speculative.py | Add Gaudi Backend (#3055) | 2025-02-28 12:14:58 +01:00
tensor_parallel.py | Gaudi: clean cuda/rocm code in hpu backend, enable flat_hpu (#3113) | 2025-04-14 15:58:13 +02:00