merges | feat: add ruff and resolve issue (#2262) | 2024-09-25 05:46:24 +00:00
__init__.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
adapter.py | Micro cleanup. (#2555) | 2024-10-25 08:53:47 +00:00
chunks.py | server: use chunked inputs | 2024-09-24 03:42:29 +00:00
convert.py | Force weights_only (before fully breaking pickle files anyway). (#1710) | 2024-04-25 15:10:53 +03:00
debug.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
dist.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
hub.py | Micro cleanup. (#2555) | 2024-10-25 08:53:47 +00:00
import_utils.py | Pr 2337 ci branch (#2379) | 2024-09-25 05:55:39 +00:00
log.py | feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248) | 2024-09-25 05:30:41 +00:00
logits_process.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
peft.py | feat: add ruff and resolve issue (#2262) | 2024-09-25 05:46:24 +00:00
quantization.py | Handle GPTQ-Marlin loading in GPTQMarlinWeightLoader (#2300) | 2024-09-25 05:55:39 +00:00
segments.py | Enable multiple LoRa adapters (#2010) | 2024-09-24 03:55:04 +00:00
sgmv.py | fix: allocate tmp based on sgmv kernel if available (#2345) | 2024-09-25 06:06:17 +00:00
speculate.py | chore: formatting | 2024-04-18 16:26:00 +03:00
tokens.py | Merge branch 'habana-main' into 2.3.0 | 2024-10-23 16:32:12 +08:00
version.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
watermark.py | Make Gaudi adapt to the tgi 2.3.0 | 2024-09-26 06:04:55 +00:00
weights.py | Move to moe-kernels package and switch to common MoE layer (#2511) | 2024-09-25 06:18:05 +00:00