text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-09 23:15:23 +00:00

yuanwu 67ee45a270 Pass the max_batch_total_tokens to causal_lm; refine the warmup. Signed-off-by: yuanwu <yuan.wu@intel.com>	2024-10-23 08:28:26 +00:00
client	Pass the max_batch_total_tokens to causal_lm	2024-10-23 08:28:26 +00:00
grpc-metadata	Rebase TRT-llm (#2331)	2024-09-25 05:55:39 +00:00
trtllm	More fixes trtllm (#2342)	2024-09-25 06:08:00 +00:00
v3	Pass the max_batch_total_tokens to causal_lm	2024-10-23 08:28:26 +00:00