text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-10 15:35:24 +00:00

Wang, Yi A 5d3653943c adjust block table in hpu to improve performance Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>		2025-03-16 20:28:01 -07:00
client	Revert "feat: improve qwen2-vl startup " (#2924)	2025-01-17 12:09:05 -05:00
backend.rs	Add backend name to telemetry (#2962)	2025-01-28 16:53:16 +01:00
block_allocator.rs	adjust block table in hpu to improve performance	2025-03-16 20:28:01 -07:00
lib.rs	Choosing input/total tokens automatically based on available VRAM? (#2673)	2024-10-28 04:59:49 +01:00
main.rs	feat: add payload limit (#2726)	2024-11-21 18:20:15 +00:00
queue.rs	Making `tool_calls` a vector. (#3075)	2025-03-05 22:32:31 +01:00
radix.rs	Add property-based testing for `RadixAllocator` (#3068)	2025-03-04 15:09:46 +01:00