text-generation-inference/backends
Daniël de Kok ccddb30c02 Fix cache block size for flash decoding (#2351)
* Fix cache block size for flash decoding

This seems to have been accidentally dropped during the TRT-LLM
PR rebase.

* Also run CI on changes to `backends`
2024-09-25 05:55:39 +00:00
client Rebase TRT-llm (#2331) 2024-09-25 05:55:39 +00:00
grpc-metadata Rebase TRT-llm (#2331) 2024-09-25 05:55:39 +00:00
trtllm Rebase TRT-llm (#2331) 2024-09-25 05:55:39 +00:00
v3 Fix cache block size for flash decoding (#2351) 2024-09-25 05:55:39 +00:00