text-generation-inference/router/client/src
2023-07-24 11:43:58 +02:00
pb Init 2022-10-08 12:30:12 +02:00
client.rs feat(server): auto max_batch_total_tokens for flash att models (#630) 2023-07-19 09:31:25 +02:00
lib.rs feat: decrease IPC proto size (#367) 2023-05-24 19:19:57 +02:00
sharded_client.rs feat: add cuda memory fraction (#659) 2023-07-24 11:43:58 +02:00