mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-10 20:04:52 +00:00
added a tokenizer to HeterogeneousNextTokenChooser
I'm not entirely happy with this solution, but it seemed most consistent with the current way of passing extra params into `from_pb` within `concatenate`.
This commit is contained in:
parent
f65d703de4
commit
25c48f5679
@@ -634,8 +634,7 @@ class FlashCausalLMBatch(Batch):
             next_token_chooser_parameters,
             dtype=batches[0].next_token_chooser.dtype,
             device=batches[0].next_token_chooser.device,
-            # todo - determine how to obtain access to a tokenizer here
-            tokenizer=...
+            tokenizer=batches[0].next_token_chooser.tokenizer
         )

         # Needed to avoid dropping blocks when the batches will go out of scope
Loading…
Reference in New Issue
Block a user