Commit Graph

13 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| OlivierDehaene | b6ee0ec7b0 | feat(router): add git sha to info route (#208) | 2023-04-19 21:36:59 +02:00 |
| OlivierDehaene | a88c54bb4c | feat(server): check cuda capability when importing flash models (#201) - close #198 | 2023-04-19 12:52:37 +02:00 |
| OlivierDehaene | e14ae3b5e9 | feat(server): support quantization for flash models (#200) - closes #197 | 2023-04-19 12:51:11 +02:00 |
| OlivierDehaene | 7a1ba58557 | fix(docker): fix docker image dependencies (#187) | 2023-04-17 00:26:47 +02:00 |
| OlivierDehaene | 880a76eed5 | feat(server): support sharded santacoder (#167) | 2023-04-12 17:18:08 +02:00 |
| OlivierDehaene | f26dfd0dc1 | feat(server): support OPT models (#55) - OPT models do not all have a `tokenizer.json` file on the hub at the moment. Can't merge for now. | 2023-04-11 19:16:41 +02:00 |
| OlivierDehaene | 299217c95c | feat(server): add flash attention llama (#144) | 2023-04-11 16:38:22 +02:00 |
| OlivierDehaene | c0aeb32583 | feat(server): flash santacoder (#153) | 2023-04-03 19:06:42 +02:00 |
| Nick Hill | 462530c2b0 | fix(server): Avoid using try/except to determine kind of AutoModel (#142) | 2023-03-27 09:23:22 +02:00 |
| OlivierDehaene | d6a93fe992 | fix(server): fix flash-neox scores warping (#137) | 2023-03-24 18:21:41 +01:00 |
| OlivierDehaene | 05e9a796cc | feat(server): flash neoX (#133) | 2023-03-24 14:02:14 +01:00 |
| OlivierDehaene | 8ad60b752f | fix(server): add position ids to neox (#126) | 2023-03-15 13:12:49 +01:00 |
| OlivierDehaene | 3fef90d50f | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |