text-generation-inference/server/text_generation/models
Nick Hill 31d76e238d
fix(batching): Avoid theoretical hang in batcher loop (#5)
- Avoid theoretical hang in batcher loop
- Avoid a couple of clones in the router generate method
- Keep attention mask tensors as integers
- Remove num_heads attribute

Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
2022-12-05 10:10:59 +01:00
__init__.py feat(server): Support Galactica (#4) 2022-12-01 19:31:54 +01:00
bloom.py fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
causal_lm.py fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
galactica.py fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
model.py fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
seq2seq_lm.py fix(batching): Avoid theoretical hang in batcher loop (#5) 2022-12-05 10:10:59 +01:00
types.py feat(server): Support AutoModelForSeq2SeqLM 2022-11-04 18:03:04 +01:00
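The first bullet of the commit above concerns the batching loop. As a hedged illustration only: the real loop lives in the Rust router, not in these Python model files, and `batching_loop` and `process` below are hypothetical names. One common way to avoid this class of hang is to block only when the queue is truly empty and to drain already-queued requests without waiting, so the loop never sleeps on a wake-up that already fired:

```python
import queue

def process(batch):
    """Placeholder for running one decoding step over a batch."""
    pass

def batching_loop(requests: queue.Queue, max_batch_size: int = 32) -> None:
    while True:
        # Block for the first request: an empty queue means there is
        # genuinely nothing to do, so waiting here cannot hang.
        batch = [requests.get()]
        # Drain anything already queued without blocking, so the loop
        # never waits for a notification that may have fired while it
        # was busy processing the previous batch.
        while len(batch) < max_batch_size:
            try:
                batch.append(requests.get_nowait())
            except queue.Empty:
                break
        process(batch)
```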
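The "attention mask tensors as integers" bullet can be sketched in PyTorch. This is an assumption-labeled example, not the repository's code: `make_attention_mask` and `extend_attention_mask` are hypothetical helpers showing a mask built and grown in `int64` (the dtype Hugging Face tokenizers typically return for `attention_mask`) instead of being converted to bool or float:

```python
import torch

def make_attention_mask(input_lengths: list[int], max_length: int) -> torch.Tensor:
    # Left-pad shorter sequences; 1 = real token, 0 = padding.
    mask = torch.zeros(len(input_lengths), max_length, dtype=torch.int64)
    for i, length in enumerate(input_lengths):
        mask[i, max_length - length:] = 1
    return mask

def extend_attention_mask(mask: torch.Tensor) -> torch.Tensor:
    # After generating one token per sequence, append a column of ones,
    # staying in the integer dtype rather than casting to float.
    ones = torch.ones(mask.shape[0], 1, dtype=mask.dtype, device=mask.device)
    return torch.cat([mask, ones], dim=1)
```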