OlivierDehaene | e74bd41e0f | feat(server): add paged attention to flash models (#516); Closes #478 | 2023-06-30 19:09:59 +02:00
OlivierDehaene | bd3a9d8e85 | fix(router): add timeout on flume sends (#488) | 2023-06-23 14:58:28 +02:00
OlivierDehaene | f59fb8b630 | feat(router): add ngrok integration (#453) | 2023-06-16 16:25:11 +02:00
OlivierDehaene | 895c5f1562 | feat(server): only compute prefill logprobs when asked (#406); Close #288 | 2023-06-02 17:12:30 +02:00
Nicolas Patry | db2b4e0754 | feat(router): new healthcheck that skips the queue (#244); Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>; Co-authored-by: OlivierDehaene <olivier@huggingface.co> | 2023-04-26 20:23:54 +02:00
Nicolas Patry | c4fb09f2ae | feat(router): add tests to validation (#237) | 2023-04-26 16:14:40 +02:00
OlivierDehaene | ebc74d5666 | feat(router): use number of tokens in batch as input for dynamic batching (#226); Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2023-04-24 17:59:00 +02:00
OlivierDehaene | 709d8936f6 | feat(router): drop requests when client closes the channel (#202) | 2023-04-20 11:07:40 +02:00
OlivierDehaene | 9987960062 | feat(router): make router input validation optional (#164) | 2023-04-09 20:22:27 +02:00
OlivierDehaene | 610bb1f978 | feat(benchmark): tui based benchmarking tool (#149) | 2023-03-30 15:26:27 +02:00
OlivierDehaene | b49dbf2d88 | fix(server): use server tokenizer as gt (#128) | 2023-03-16 12:12:26 +01:00
OlivierDehaene | 1a2d68250a | feat: support typical sampling (#114); closes #112 | 2023-03-09 11:33:57 +01:00
OlivierDehaene | cd5961b5da | feat: allow local models (#101); closes #99 | 2023-03-06 14:39:36 +01:00
OlivierDehaene | 9b8ea6a6c7 | feat(server): add logits watermark (#90) | 2023-03-02 12:30:41 +01:00
OlivierDehaene | 439fcaf810 | feat(router): add prometheus metrics scrape endpoint (#71) | 2023-02-16 17:18:53 +01:00
OlivierDehaene | 9af454142a | feat: add distributed tracing (#62) | 2023-02-13 13:02:45 +01:00
OlivierDehaene | 7b870e1e18 | feat(router): use background task to manage request queue (#52); Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2023-02-02 14:59:27 +01:00