Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-04-23 07:52:06 +00:00)
The minimum batch size logic could cause prefix blocks to be deallocated without prefill. The next allocation of the same prefix would then use garbage blocks.

| Name |
|---|
| benches |
| src |
| build.rs |
| Cargo.toml |
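
The failure mode described in the commit message above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the code in this repository: `ToyAllocator`, `free_unguarded`, `free_guarded`, and `reuses_garbage` are hypothetical names invented for the example, and the guard shown is only one possible way to avoid the problem, not necessarily the fix that was applied.

```rust
use std::collections::{HashMap, HashSet};

/// Toy block allocator with a prefix cache (hypothetical, for illustration only).
struct ToyAllocator {
    /// Prefix hash -> block id kept around for reuse.
    cache: HashMap<u64, u32>,
    /// Blocks whose contents were actually written by a prefill pass.
    prefilled: HashSet<u32>,
    /// Next fresh block id; the toy model has no free list.
    next_block: u32,
}

impl ToyAllocator {
    fn new() -> Self {
        ToyAllocator {
            cache: HashMap::new(),
            prefilled: HashSet::new(),
            next_block: 0,
        }
    }

    /// Returns (block id, whether the block came from the prefix cache).
    fn allocate(&mut self, prefix: u64) -> (u32, bool) {
        if let Some(&block) = self.cache.get(&prefix) {
            return (block, true);
        }
        let block = self.next_block;
        self.next_block += 1;
        (block, false)
    }

    /// Records that prefill has written real data into the block.
    fn prefill(&mut self, block: u32) {
        self.prefilled.insert(block);
    }

    /// Problematic variant: caches the block under its prefix unconditionally.
    fn free_unguarded(&mut self, prefix: u64, block: u32) {
        self.cache.insert(prefix, block);
    }

    /// Guarded variant: caches the block only if prefill actually ran on it.
    fn free_guarded(&mut self, prefix: u64, block: u32) {
        if self.prefilled.contains(&block) {
            self.cache.insert(prefix, block);
        }
    }

    fn holds_valid_data(&self, block: u32) -> bool {
        self.prefilled.contains(&block)
    }
}

/// Allocates a prefix, frees it before prefill, then allocates the same prefix
/// again. Returns true if the second allocation reuses a never-prefilled block.
fn reuses_garbage(guarded: bool) -> bool {
    let mut alloc = ToyAllocator::new();
    let prefix = 42;
    let (block, _) = alloc.allocate(prefix);
    // The request is dropped here, before any prefill has run.
    if guarded {
        alloc.free_guarded(prefix, block);
    } else {
        alloc.free_unguarded(prefix, block);
    }
    let (again, from_cache) = alloc.allocate(prefix);
    from_cache && !alloc.holds_valid_data(again)
}
```

In this sketch, `reuses_garbage(false)` hits the cache and gets back a block that was never prefilled, while `reuses_garbage(true)` misses the cache and receives a fresh block instead.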