Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-10-11 07:55:24 +00:00)
The minimum batch size logic could cause prefix blocks to be deallocated without prefill. The next allocation of the same prefix would then use garbage blocks.

| Name |
|---|
| client |
| backend.rs |
| block_allocator.rs |
| lib.rs |
| main.rs |
| queue.rs |
| radix.rs |