text-generation-inference/router
Nick Hill 31d76e238d
fix(batching): Avoid theoretical hang in batcher loop (#5)
- Avoid theoretical hang in batcher loop
- Avoid a couple of clones in the router generate method
- Keep attention mask tensors as integers
- Remove num_heads attribute

Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
2022-12-05 10:10:59 +01:00
..
client feat(server): Support all AutoModelForCausalLM on a best effort basis 2022-10-28 19:24:00 +02:00
src fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
Cargo.toml feat: Use json formatter by default in docker image 2022-11-02 17:29:56 +01:00