text-generation-inference/backends
Yuan Wu 3d059f91ab
Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE (#3131)
* Remove debug modifications

Signed-off-by: yuanwu <yuan.wu@intel.com>
2025-04-03 10:34:53 +02:00
client | Revert "feat: improve qwen2-vl startup " (#2924) | 2025-01-17 12:09:05 -05:00
gaudi | Gaudi: Use exponential growth to replace BATCH_BUCKET_SIZE (#3131) | 2025-04-03 10:34:53 +02:00
grpc-metadata | Upgrading our rustc version. (#2908) | 2025-01-15 17:04:03 +01:00
llamacpp | Update the llamacpp backend (#3022) | 2025-03-11 09:19:01 +01:00
neuron | Update neuron backend (#3098) | 2025-03-12 09:53:15 +01:00
trtllm | feat: add support for HF_HUB_USER_AGENT_ORIGIN to add user-agent Origin field in Hub requests. (#3061) | 2025-03-04 16:43:50 +01:00
v2 | Add backend name to telemetry (#2962) | 2025-01-28 16:53:16 +01:00
v3 | Making tool_calls a vector. (#3075) | 2025-03-05 22:32:31 +01:00