text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-21 04:45:23 +00:00

drbh dc5f05f8e6 Pr 3003 ci branch (#3007)	2025-03-10 17:56:19 +01:00

* change ChatCompletionChunk to align with "OpenAI Chat Completions streaming API"
Moving after tool_calls2
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
add in Buffering..
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
fix: handle usage outside of stream state and add tests
Simplifying everything quite a bit.
Remove the unused model_dump.
Clippy.
Clippy ?
Ruff.
Uppgrade the flake for latest transformers.
Upgrade after rebase.
Remove potential footgun.
Fix completion test.
* Clippy.
* Tweak for multi prompt.
* Ruff.
* Update the snapshot a bit.
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
infer	Fix tool call2 (#3076)	2025-03-07 19:45:57 +01:00
config.rs	feat: add initial qwen2.5-vl model and test (#2971)	2025-02-19 12:38:20 +01:00
kserve.rs	fix: include add_special_tokens in kserve request (#2859)	2024-12-19 16:55:17 -05:00
lib.rs	Pr 3003 ci branch (#3007)	2025-03-10 17:56:19 +01:00
logging.rs	Rebase TRT-llm (#2331)	2024-07-31 10:33:10 +02:00
sagemaker.rs	feat: allow any supported payload on /invocations (#2683)	2024-10-23 11:26:01 +00:00
server.rs	Pr 3003 ci branch (#3007)	2025-03-10 17:56:19 +01:00
usage_stats.rs	feat: Add the parsing of HF_HUB_USER_AGENT_ORIGIN environment variable for telemetry (#3027)	2025-02-19 21:09:12 +01:00
validation.rs	feat: add initial qwen2.5-vl model and test (#2971)	2025-02-19 12:38:20 +01:00
vertex.rs	Improve tool call message processing (#3036)	2025-02-21 10:30:29 +01:00