Commit Graph

17 Commits

Author SHA1 Message Date
OlivierDehaene
f9d0ec376a
feat(docker): Make the image compatible with api-inference (#29) 2023-01-23 17:11:27 +01:00
OlivierDehaene
32a253063d
feat: Return logprobs (#8) 2022-12-15 17:03:56 +01:00
OlivierDehaene
718096f695
feat: Support stop sequences (#7) 2022-12-12 18:25:22 +01:00
Nick Hill
31d76e238d
fix(batching): Avoid theoretical hang in batcher loop (#5)
- Avoid theoretical hang in batcher loop
- Avoid a couple of clones in the router generate method
- Keep attention mask tensors as integers
- Remove num_heads attribute

Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
2022-12-05 10:10:59 +01:00
OlivierDehaene
c5665f5c8b feat(server): Support generic AutoModelForCausalLM 2022-11-04 14:22:47 +01:00
OlivierDehaene
3cf6368c77 feat(server): Support all AutoModelForCausalLM on a best effort basis 2022-10-28 19:24:00 +02:00
OlivierDehaene
09674e6df9 feat(server): Support bitsandbytes 2022-10-27 14:25:29 +02:00
OlivierDehaene
c837893370 feat(router): Add max_waiting_tokens 2022-10-21 16:40:05 +02:00
Olivier Dehaene
f16f2f5ae1 v0.1.0 2022-10-20 19:14:44 +02:00
Olivier Dehaene
92c1ecd008 feat: Add arguments to CLI 2022-10-17 18:27:33 +02:00
Olivier Dehaene
5e5d8766a2 feat: Improve error handling 2022-10-17 14:59:00 +02:00
Olivier Dehaene
bcb53903b8 feat: Add AML deployment 2022-10-15 20:21:50 +02:00
Olivier Dehaene
bf99afe916 feat: Docker image 2022-10-14 15:56:21 +02:00
Olivier Dehaene
39df4d9975 Use axum 2022-10-11 18:14:39 +02:00
Olivier Dehaene
e86ecbac63 ValidationError was not correctly handled 2022-10-11 16:53:40 +02:00
Olivier Dehaene
4c693e6524 Refactored gRPC interface
Added validation logic
2022-10-11 16:50:54 +02:00
Olivier Dehaene
fa9a088467 Add load testing 2022-10-11 10:36:51 +02:00