From f66c9f340b9f39cfd5658ab1b1dc7312e8d61bf9 Mon Sep 17 00:00:00 2001
From: Nicolas Patry
Date: Fri, 12 Apr 2024 12:09:23 +0000
Subject: [PATCH] Update the doc.

---
 docs/source/basic_tutorials/launcher.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/basic_tutorials/launcher.md b/docs/source/basic_tutorials/launcher.md
index 69e58b20..d9b272db 100644
--- a/docs/source/basic_tutorials/launcher.md
+++ b/docs/source/basic_tutorials/launcher.md
@@ -168,7 +168,7 @@ Options:
 ## MAX_BATCH_PREFILL_TOKENS
 ```shell
 --max-batch-prefill-tokens
- Limits the number of tokens for the prefill operation. Since this operation take the most memory and is compute bound, it is interesting to limit the number of requests that can be sent. Default to `max_input_length + 50` to give a bit of room
+ Limits the number of tokens for the prefill operation. Since this operation take the most memory and is compute bound, it is interesting to limit the number of requests that can be sent. Default to `max_input_tokens + 50` to give a bit of room
 [env: MAX_BATCH_PREFILL_TOKENS=]