mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-10-20 12:25:23 +00:00
* Fix cache block size for flash decoding. This seems to have been accidentally dropped during the TRT-LLM PR rebase.
* Also run CI on changes to `backends`
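As a purely illustrative sketch of the kind of fix the commit describes (the function name and block-size values here are hypothetical, not taken from the text-generation-inference source): a paged KV cache may need a different block size when flash decoding is enabled, and accidentally dropping that branch during a rebase would silently restore the default.

```python
# Hypothetical sketch: pick a KV-cache block size depending on whether
# flash decoding is enabled. Names and values are illustrative only and
# do not reflect the actual text-generation-inference implementation.
def cache_block_size(flash_decoding: bool) -> int:
    # Flash-decoding kernels typically operate on larger cache blocks
    # than the default paged-attention layout.
    return 256 if flash_decoding else 16
```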
Directory listing:

* ISSUE_TEMPLATE
* workflows
* PULL_REQUEST_TEMPLATE.md