text-generation-inference/backends/v3/src
Wang, Yi A 5d3653943c adjust block table in hpu to improve performance
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2025-03-16 20:28:01 -07:00
client Revert "feat: improve qwen2-vl startup " (#2924) 2025-01-17 12:09:05 -05:00
backend.rs Add backend name to telemetry (#2962) 2025-01-28 16:53:16 +01:00
block_allocator.rs adjust block table in hpu to improve performance 2025-03-16 20:28:01 -07:00
lib.rs Choosing input/total tokens automatically based on available VRAM? (#2673) 2024-10-28 04:59:49 +01:00
main.rs feat: add payload limit (#2726) 2024-11-21 18:20:15 +00:00
queue.rs Making tool_calls a vector. (#3075) 2025-03-05 22:32:31 +01:00
radix.rs Add property-based testing for RadixAllocator (#3068) 2025-03-04 15:09:46 +01:00