Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-04-21 23:12:07 +00:00)
* Fix cache block size for flash decoding. This seems to have been accidentally dropped during the TRT-LLM PR rebase.
* Also run CI on changes to `backends`.
Files changed:

* client
* backend.rs
* block_allocator.rs
* lib.rs
* main.rs
* queue.rs
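The commit touches `block_allocator.rs`, which manages KV-cache blocks for paged attention; flash-decoding kernels typically require a larger block size than the paged-attention default, which is why the block size must be configured rather than hard-coded. As a hedged illustration only (a minimal sketch, not the actual text-generation-inference implementation; the `BlockAllocator` type and its methods here are hypothetical), a block allocator parameterized by block size might look like:

```rust
// Hypothetical sketch of a paged KV-cache block allocator.
// The block size is a parameter because different attention kernels
// (e.g. flash decoding vs. standard paged attention) may require
// different numbers of tokens per block.

pub struct BlockAllocator {
    block_size: u32,        // tokens held by each KV-cache block
    free_blocks: Vec<u32>,  // ids of currently unallocated blocks
}

impl BlockAllocator {
    pub fn new(total_blocks: u32, block_size: u32) -> Self {
        Self {
            block_size,
            // Stack of free block ids; pop from the end to allocate.
            free_blocks: (0..total_blocks).rev().collect(),
        }
    }

    /// Allocate enough blocks to hold `n_tokens`, or `None` if the
    /// pool is exhausted.
    pub fn allocate(&mut self, n_tokens: u32) -> Option<Vec<u32>> {
        // ceil(n_tokens / block_size)
        let needed = (n_tokens + self.block_size - 1) / self.block_size;
        if (self.free_blocks.len() as u32) < needed {
            return None;
        }
        Some((0..needed).map(|_| self.free_blocks.pop().unwrap()).collect())
    }

    /// Return blocks to the free pool.
    pub fn free(&mut self, blocks: Vec<u32>) {
        self.free_blocks.extend(blocks);
    }
}

fn main() {
    // With a flash-decoding-style block size of 256 tokens,
    // 300 tokens need ceil(300 / 256) = 2 blocks.
    let mut alloc = BlockAllocator::new(8, 256);
    let blocks = alloc.allocate(300).unwrap();
    assert_eq!(blocks.len(), 2);
    alloc.free(blocks);
    // 2048 tokens consume the entire 8-block pool.
    assert_eq!(alloc.allocate(2048).unwrap().len(), 8);
    println!("ok");
}
```

The key design point the commit message hints at: if the allocator assumes one fixed block size while the decoding kernel expects another, capacity accounting breaks, so the size must flow in from kernel configuration.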