text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-04-25 20:12:07 +00:00

mrs303 da0f874d49 Prefer prefill instead of decode when max_waiting_tokens==0 (#18)	2024-01-19 15:25:40 +01:00
health.rs	Rebased #617 (#868)	2023-08-28 11:43:47 +02:00
infer.rs	Prefer prefill instead of decode when max_waiting_tokens==0 (#18)	2024-01-19 15:25:40 +01:00
lib.rs	Exllama v2 (#1211)	2023-11-25 22:38:38 +01:00
main.rs	Make tokenizer optional (#12)	2024-01-19 15:12:04 +01:00
queue.rs	Control prefill and decode batch size separately (#6)	2024-01-02 18:21:01 +01:00
server.rs	Control prefill and decode batch size separately (#6)	2024-01-02 18:21:01 +01:00
validation.rs	fix: do not leak inputs on error (#1228)	2023-11-20 10:33:44 +01:00