Commit Graph

7 Commits

Author SHA1 Message Date
OlivierDehaene
9ac7b7bc52 remove slots from grpc 2024-06-12 11:50:31 +02:00
OlivierDehaene
73c3903214 FlashCausalLM implem 2024-06-11 13:15:06 +02:00
OlivierDehaene
298bf31e69 add terminated_generations 2024-06-11 13:15:06 +02:00
OlivierDehaene
1cc86930a6 wip 2024-06-11 13:15:05 +02:00
OlivierDehaene
18e77a5cc7 wip 2024-06-11 13:15:05 +02:00
OlivierDehaene
8aece3bd68
feat: move allocation logic to rust (#1835)
Close #2007
2024-06-05 12:18:38 +02:00
OlivierDehaene
757223b352
feat: add SchedulerV3 (#1996)
- Refactor code to allow supporting multiple versions of the generate.proto at the same time
- Add v3/generate.proto (ISO to generate.proto for now but allow for future changes without impacting v2 backends)
- Add Schedule trait to abstract queuing and batching mechanisms that will be different in the future
- Add SchedulerV2/V3 impl
2024-06-04 15:56:56 +02:00