text-generation-inference/router
Nick Hill 3efa5bbbfd
fix(router): Include special tokens when tokenizing (#14)
There's currently a discrepancy in the tokenization between the router
and Python server code. The latter includes special tokens but the
former does not.

This results in a token count mismatch for seq2seq models such as mt0,
where the tokenizer appends an EOS token at the end.

This in turn results in unexpected/incorrect output, in particular
when batch concatenation is involved, because the Python code uses the
input length passed from the router for each row.
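To illustrate the mismatch, here is a minimal sketch using the `tokenizers` Rust crate; the prompt string and the local `tokenizer.json` path are hypothetical stand-ins for whatever the router actually loads:

```rust
use tokenizers::tokenizer::{Result, Tokenizer};

fn main() -> Result<()> {
    // Hypothetical path; mt0 publishes a tokenizer.json on the Hub.
    let tokenizer = Tokenizer::from_file("tokenizer.json")?;

    // Without special tokens: the router's current behaviour.
    let without = tokenizer.encode("Translate to French: hello", false)?;
    // With special tokens: what the Python server counts. For mt0 this
    // includes the trailing EOS token, so the length is one greater.
    let with = tokenizer.encode("Translate to French: hello", true)?;

    assert_eq!(with.get_ids().len(), without.get_ids().len() + 1);
    Ok(())
}
```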

As far as I can tell, it is better to include this token in the encoder
`input_ids`, so I guess it's best to just adjust this on the router side.
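A minimal sketch of the router-side change, again assuming the `tokenizers` Rust crate; the helper function name and surrounding code are illustrative, not the actual router source:

```rust
use tokenizers::tokenizer::{Result, Tokenizer};

// Illustrative helper: compute the input length the router reports
// to the Python server for a request.
fn input_length(tokenizer: &Tokenizer, inputs: &str) -> Result<usize> {
    // Before: tokenizer.encode(inputs, false), which drops special tokens.
    // After: pass `true` for `add_special_tokens` so the count matches the
    // Python server's tokenization (including mt0's trailing EOS token).
    let encoding = tokenizer.encode(inputs, true)?;
    Ok(encoding.get_ids().len())
}
```

The change amounts to flipping the `add_special_tokens` argument of `encode`; keeping the count in one place on the router side avoids the two codebases diverging again.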
2022-12-30 19:31:44 +01:00
client      feat: Return logprobs (#8)                                   2022-12-15 17:03:56 +01:00
src         fix(router): Include special tokens when tokenizing (#14)    2022-12-30 19:31:44 +01:00
Cargo.toml  feat: Use json formatter by default in docker image          2022-11-02 17:29:56 +01:00