text-generation-inference

mirror of https://github.com/huggingface/text-generation-inference.git synced 2025-10-09 06:55:24 +00:00

Nick Hill 31d76e238d fix(batching): Avoid theoretical hang in batcher loop (#5)
- Avoid theoretical hang in batcher loop
- Avoid a couple of clones in the router generate method
- Keep attention mask tensors as integers
- Remove num_heads attribute
Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
2022-12-05 10:10:59 +01:00
models	fix(batching): Avoid theoretical hang in batcher loop (#5)	2022-12-05 10:10:59 +01:00
pb	feat(server): Support all AutoModelForCausalLM on a best effort basis	2022-10-28 19:24:00 +02:00
__init__.py	feat(server): Support all AutoModelForCausalLM on a best effort basis	2022-10-28 19:24:00 +02:00
cache.py	feat(server): Support AutoModelForSeq2SeqLM	2022-11-04 18:03:04 +01:00
cli.py	feat(server): Support all AutoModelForCausalLM on a best effort basis	2022-10-28 19:24:00 +02:00
server.py	feat(server): Support AutoModelForSeq2SeqLM	2022-11-04 18:03:04 +01:00
utils.py	feat(server): Support Galactica (#4)	2022-12-01 19:31:54 +01:00