text-generation-inference/backends/v3/src
Daniël de Kok 22fb1be588
Fix cache block size for flash decoding (#2351)
* Fix cache block size for flash decoding

This seems to have been accidentally dropped during the TRT-LLM
PR rebase.

* Also run CI on changes to `backends`
2024-08-01 15:38:57 +02:00
client              Rebase TRT-llm (#2331)                           2024-07-31 10:33:10 +02:00
backend.rs          Fix cache block size for flash decoding (#2351)  2024-08-01 15:38:57 +02:00
block_allocator.rs  Rebase TRT-llm (#2331)                           2024-07-31 10:33:10 +02:00
lib.rs              Rebase TRT-llm (#2331)                           2024-07-31 10:33:10 +02:00
main.rs             refactor usage stats (#2339)                     2024-07-31 16:29:07 +02:00
queue.rs            Rebase TRT-llm (#2331)                           2024-07-31 10:33:10 +02:00