text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-06-09 10:52:07 +00:00

History

Daniël de Kok 22fb1be588 Fix cache block size for flash decoding (#2351 ) * Fix cache block size for flash decoding This seems to have been accidentally dropped during the TRT-LLM PR rebase. * Also run CI on changes to `backends`		2024-08-01 15:38:57 +02:00
..
client	Rebase TRT-llm (#2331 )	2024-07-31 10:33:10 +02:00
backend.rs	Fix cache block size for flash decoding (#2351 )	2024-08-01 15:38:57 +02:00
block_allocator.rs	Rebase TRT-llm (#2331 )	2024-07-31 10:33:10 +02:00
lib.rs	Rebase TRT-llm (#2331 )	2024-07-31 10:33:10 +02:00
main.rs	refactor usage stats (#2339 )	2024-07-31 16:29:07 +02:00
queue.rs	Rebase TRT-llm (#2331 )	2024-07-31 10:33:10 +02:00