text-generation-inference/server/text_generation_server/layers/attention
Latest commit: a87791d7c9 by drbh, 2024-09-25 05:46:24 +00:00
feat: add ruff and resolve issue (#2262)

* feat: add ruff and resolve issue
* fix: update client exports and adjust after rebase
* fix: adjust syntax to avoid circular import
* fix: adjust client ruff settings
* fix: lint and refactor import check and avoid model enum as global names
* fix: improve fbgemm_gpu check and lints
* fix: update lints
* fix: prefer comparing model enum over str
* fix: adjust lints and ignore specific rules
* fix: avoid unneeded quantize check
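
Two of the fixes above name generic Python hardening patterns rather than repo-specific changes. A minimal sketch of both, using a hypothetical ModelType enum and a hypothetical has_fbgemm_gpu helper (neither name is taken from the repo):

```python
import importlib.util
from enum import Enum


class ModelType(Enum):
    # Hypothetical members; the repo's real enum has different values.
    LLAMA = "llama"
    GPT2 = "gpt2"


def is_llama(model_type: ModelType) -> bool:
    # "prefer comparing model enum over str": a mistyped member such as
    # ModelType.LAMA fails loudly with AttributeError, while a mistyped
    # string like "lama" would just compare unequal and hide the bug.
    return model_type is ModelType.LLAMA


def has_fbgemm_gpu() -> bool:
    # "improve fbgemm_gpu check": find_spec reports whether the optional
    # dependency is installed without importing (and thus executing) it.
    return importlib.util.find_spec("fbgemm_gpu") is not None
```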
File                   Last commit                                                       Date
__init__.py            feat: add ruff and resolve issue (#2262)                          2024-09-25 05:46:24 +00:00
common.py              Move to FlashDecoding instead of PagedAttention kernel. (#1940)   2024-09-24 03:58:13 +00:00
cuda.py                feat: add ruff and resolve issue (#2262)                          2024-09-25 05:46:24 +00:00
flash_attn_triton.py   feat: add ruff and resolve issue (#2262)                          2024-09-25 05:46:24 +00:00
ipex.py                fix FlashDecoding change's regression in intel platform (#2161)   2024-09-24 03:58:13 +00:00
rocm.py                feat: add ruff and resolve issue (#2262)                          2024-09-25 05:46:24 +00:00
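
The listing shows one attention module per hardware backend (cuda.py, rocm.py, ipex.py). A minimal sketch of how such a layout can be wired together with lazy imports, which also sidesteps the circular-import problem mentioned in the commit message; the module paths mirror the files above, but the function and the selection logic are hypothetical (the real dispatch lives in __init__.py and may differ):

```python
import importlib

# Module paths mirror the files listed above; the mapping itself is a
# hypothetical illustration, not the repo's actual dispatch table.
_BACKEND_MODULES = {
    "cuda": "text_generation_server.layers.attention.cuda",
    "rocm": "text_generation_server.layers.attention.rocm",
    "ipex": "text_generation_server.layers.attention.ipex",
}


def load_attention_backend(platform: str):
    # Importing lazily, at call time rather than at module top level, is
    # one standard way to avoid circular imports between sibling modules.
    try:
        return importlib.import_module(_BACKEND_MODULES[platform])
    except KeyError:
        raise ValueError(f"unknown attention backend: {platform!r}") from None
```

A caller would then do `backend = load_attention_backend("cuda")` and use the returned module's attention functions, so only the module for the detected platform is ever imported.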