mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-04-26 12:32:10 +00:00
* feat: tokenize each request individually and increase warmup image size
* feat: adjust rotary embed and avoid cuda graphs of size 2 and smaller
* fix: address image resize and rebase changes
* feat: update to run qwen2-vl tests
* fix: tweak param types

Name | ||
---|---|---|
client | ||
backend.rs | ||
block_allocator.rs | ||
lib.rs | ||
main.rs | ||
queue.rs | ||
radix.rs |