text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-11 07:55:24 +00:00

Daniël de Kok c7b495f97d hotfix: avoid non-prefilled block use when using prefix caching (#2489). The minimum batch size logic could cause prefix blocks to be deallocated without prefill. The next allocation of the same prefix would then use garbage blocks.	2024-09-25 06:13:11 +00:00
client	Lots of improvements (Still 2 allocators) (#2449)	2024-09-25 06:13:11 +00:00
backend.rs	hotfix: avoid non-prefilled block use when using prefix caching (#2489)	2024-09-25 06:13:11 +00:00
block_allocator.rs	Lots of improvements (Still 2 allocators) (#2449)	2024-09-25 06:13:11 +00:00
lib.rs	Keeping the benchmark somewhere (#2401)	2024-09-25 06:05:43 +00:00
main.rs	Pr 2352 ci branch (#2382)	2024-09-25 06:01:59 +00:00
queue.rs	Lots of improvements (Still 2 allocators) (#2449)	2024-09-25 06:13:11 +00:00
radix.rs	Lots of improvements (Still 2 allocators) (#2449)	2024-09-25 06:13:11 +00:00