gptq | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00
debug.py | Add Habana copyright header (#122) | 2024-04-08 18:06:21 +02:00
dist.py | add intel xpu support for TGI (#1475) | 2024-06-10 13:16:45 +03:00
flash_attn_triton.py | MI300 compatibility (#1764) | 2024-07-17 05:36:58 +00:00
flash_attn.py | reenable xpu for tgi (#1939) | 2024-07-17 05:36:58 +00:00
hub.py | Fixing the download strategy for ibm-fms (#1917) | 2024-07-17 05:36:58 +00:00
import_utils.py | reenable xpu for tgi (#1939) | 2024-07-17 05:36:58 +00:00
log.py | v1.3.4 | 2024-04-22 09:08:34 +03:00
paged_attention.py | MI300 compatibility (#1764) | 2024-07-17 05:36:58 +00:00
peft.py | fix: fix local loading for .bin models (#1419) | 2024-04-22 09:17:52 +03:00
speculate.py | chore: formatting | 2024-04-18 16:26:00 +03:00
version.py | Version check, doc fixes (#182) | 2024-08-07 22:09:51 +02:00
weights.py | Refactor layers. (#1866) | 2024-07-17 05:36:58 +00:00