text-generation-inference/backends/gaudi/server/text_generation_server/layers/attention
Wang, Yi 429dcd9c64
[gaudi] Gemma3 sliding window support (#3280)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-07-01 10:06:01 +02:00
__init__.py | [gaudi] Perf optimization (#3256) | 2025-06-11 15:00:21 +02:00
common.py | [gaudi] Gemma3 sliding window support (#3280) | 2025-07-01 10:06:01 +02:00
hpu.py | [gaudi] Gemma3 sliding window support (#3280) | 2025-07-01 10:06:01 +02:00
kv_cache.py | Upgrade to new vllm extension ops for Gaudi backend (fix issue in exponential bucketing) (#3239) | 2025-05-22 15:29:16 +02:00