Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-04-22 15:32:08 +00:00)
Update default doc.
parent bd01d448d7
commit a4c86e8678
@@ -149,7 +149,7 @@ Options:
 ## MAX_TOTAL_TOKENS
 ```shell
       --max-total-tokens <MAX_TOTAL_TOKENS>
-          This is the most important value to set as it defines the "memory budget" of running clients requests. Clients will send input sequences and ask to generate `max_new_tokens` on top. with a value of `1512` users can send either a prompt of `1000` and ask for `512` new tokens, or send a prompt of `1` and ask for `1511` max_new_tokens. The larger this value, the larger amount each request will be in your RAM and the less effective batching can be. Default to min(max_position_embeddings - 1, 16384)
+          This is the most important value to set as it defines the "memory budget" of running clients requests. Clients will send input sequences and ask to generate `max_new_tokens` on top. with a value of `1512` users can send either a prompt of `1000` and ask for `512` new tokens, or send a prompt of `1` and ask for `1511` max_new_tokens. The larger this value, the larger amount each request will be in your RAM and the less effective batching can be. Default to min(max_position_embeddings, 16384)

           [env: MAX_TOTAL_TOKENS=]

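For context, the "memory budget" described in the doc text above is simple addition: a request fits only if its prompt length plus the requested `max_new_tokens` stays within `max_total_tokens`. The following is a minimal sketch of that check using the numbers from the doc (`1512`, `1000`/`512`, `1`/`1511`); the function name `fits_budget` is hypothetical and not part of the launcher's actual validation code.

```rust
/// Hypothetical illustration of the documented "memory budget":
/// a request is admissible only if prompt tokens plus requested
/// new tokens fit within `max_total_tokens`.
fn fits_budget(prompt_tokens: usize, max_new_tokens: usize, max_total_tokens: usize) -> bool {
    prompt_tokens + max_new_tokens <= max_total_tokens
}

fn main() {
    let max_total_tokens = 1512;
    // A 1000-token prompt asking for 512 new tokens fits exactly.
    assert!(fits_budget(1000, 512, max_total_tokens));
    // A 1-token prompt can ask for up to 1511 new tokens.
    assert!(fits_budget(1, 1511, max_total_tokens));
    // Anything beyond the budget does not fit.
    assert!(!fits_budget(1000, 513, max_total_tokens));
}
```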
@@ -230,7 +230,7 @@ struct Args {
     /// `1511` max_new_tokens.
     /// The larger this value, the larger amount each request will be in your RAM
     /// and the less effective batching can be.
-    /// Default to min(max_position_embeddings - 1, 16384)
+    /// Default to min(max_position_embeddings, 16384)
     #[clap(long, env)]
     max_total_tokens: Option<usize>,

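Since `max_total_tokens` is an `Option<usize>`, the documented default only applies when neither the `--max-total-tokens` flag nor the `MAX_TOTAL_TOKENS` env var is set. Below is a hedged sketch, not the launcher's actual code, of how such a fallback to min(max_position_embeddings, 16384) could be resolved; the function name `resolve_max_total_tokens` is hypothetical.

```rust
/// Hypothetical sketch of resolving the documented default:
/// use the CLI/env value if given, otherwise fall back to
/// min(max_position_embeddings, 16384).
fn resolve_max_total_tokens(cli_value: Option<usize>, max_position_embeddings: usize) -> usize {
    cli_value.unwrap_or_else(|| max_position_embeddings.min(16384))
}

fn main() {
    // An explicit CLI/env value wins.
    assert_eq!(resolve_max_total_tokens(Some(4096), 32_768), 4096);
    // Otherwise the model's context length caps the default...
    assert_eq!(resolve_max_total_tokens(None, 8192), 8192);
    // ...but never above 16384.
    assert_eq!(resolve_max_total_tokens(None, 131_072), 16384);
}
```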