text-generation-inference/.github
Daniël de Kok ccddb30c02 Fix cache block size for flash decoding (#2351)
* Fix cache block size for flash decoding

This seems to have been accidentally dropped during the TRT-LLM PR rebase.

* Also run CI on changes to `backends`
2024-09-25 05:55:39 +00:00
ISSUE_TEMPLATE            chore: add pre-commit (#1569)                    2024-04-24 15:32:02 +03:00
workflows                 Fix cache block size for flash decoding (#2351)  2024-09-25 05:55:39 +00:00
PULL_REQUEST_TEMPLATE.md  chore(github): add templates (#264)              2023-05-02 15:43:19 +02:00