models | Enable deferred token generation (#44) (#75) | 2024-02-27 15:46:40 +01:00
pb | feat(server): clear cache on error (#143) | 2023-03-28 11:29:35 +02:00
utils | Sequence bucketing for prefill (#39) (#67) | 2024-02-23 01:52:14 +01:00
__init__.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00
cli.py | Fix trust remote code (#55) | 2024-02-19 07:53:24 +01:00
habana_quantization_env.py | Add Fp8 support (#42) (#71) | 2024-02-23 11:52:28 +01:00
interceptor.py | Debugging utils (#14) | 2024-01-15 21:05:27 +01:00
tracing.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00