OlivierDehaene
c3779fa859
remove profiling
2023-04-06 17:58:54 +02:00
OlivierDehaene
26fc232afb
fix tp
2023-04-06 17:27:32 +02:00
OlivierDehaene
7816a47697
fix llama tokenizer
2023-04-06 17:07:58 +02:00
OlivierDehaene
3c272aefc0
fix test
2023-04-06 15:03:50 +02:00
OlivierDehaene
6c96f37bc8
fix tests
2023-04-06 14:44:06 +02:00
OlivierDehaene
c7dd00ead2
upgrade setuptools
2023-04-06 14:20:10 +02:00
OlivierDehaene
01ab5df180
update transformers
2023-04-06 13:45:08 +02:00
OlivierDehaene
70637b4170
use all tokens
2023-04-06 13:45:08 +02:00
OlivierDehaene
b5233f9c3c
better decode
2023-04-06 13:45:08 +02:00
OlivierDehaene
783bc64f47
fix concatenate
2023-04-06 13:45:08 +02:00
OlivierDehaene
c11e77411f
improve decode
2023-04-06 13:45:08 +02:00
OlivierDehaene
cdc33ce63c
allow disabling hf_transfer
2023-04-06 13:45:08 +02:00
OlivierDehaene
eb033e781f
trigger build
2023-04-06 13:45:08 +02:00
OlivierDehaene
8604d37015
trigger build
2023-04-06 13:45:08 +02:00
OlivierDehaene
f9b09d9629
hack
2023-04-06 13:45:08 +02:00
OlivierDehaene
30148b776b
fix instrumentation
2023-04-06 13:45:08 +02:00
OlivierDehaene
161e93a45f
cleanup
2023-04-06 13:45:08 +02:00
OlivierDehaene
1dd2c24b9c
rework validation
2023-04-06 13:45:08 +02:00
OlivierDehaene
47e93409f3
optional rust validation
2023-04-06 13:45:08 +02:00
OlivierDehaene
45eacb782d
patch qkv_rot
2023-04-06 13:45:08 +02:00
OlivierDehaene
cd5d0a96ba
feat(server): add flash attention llama
2023-04-06 13:45:08 +02:00
OlivierDehaene
71402ed4c7
wip
2023-04-06 13:45:08 +02:00
OlivierDehaene
3f2542bb6a
fix(server): fix escape characters in stop sequence ( #155 )
2023-04-05 19:37:41 +02:00
Guspan Tanadi
9122e7bd9c
docs(readme): provide link Logits Warper README ( #154 )
2023-04-04 13:27:46 +02:00
OlivierDehaene
c0aeb32583
feat(server): flash santacoder ( #153 )
2023-04-03 19:06:42 +02:00
OlivierDehaene
fef1a1c381
v0.4.3 ( #152 )
2023-03-30 17:28:14 +02:00
OlivierDehaene
84722f3e33
v0.4.2 ( #151 )
2023-03-30 17:10:01 +02:00
OlivierDehaene
08b7e4a282
fix(server): fix flash neox rotary embeddings ( #150 )
2023-03-30 16:12:23 +02:00
OlivierDehaene
610bb1f978
feat(benchmark): tui based benchmarking tool ( #149 )
2023-03-30 15:26:27 +02:00
OlivierDehaene
55106ec476
fix(ci): fix sagemaker action ( #148 )
2023-03-29 22:27:01 +02:00
OlivierDehaene
d503e8f09d
feat: aws sagemaker compatible image ( #147 )
The only difference is that now it pushes to
registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:...
instead of
registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...
---------
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-03-29 21:38:30 +02:00
OlivierDehaene
c9bdaa8b73
feat(server): reduce mlp and attn in one op for flash neox ( #145 )
2023-03-28 16:51:41 +02:00
OlivierDehaene
f000068944
feat(server): clear cache on error ( #143 )
2023-03-28 11:29:35 +02:00
Nick Hill
8e8dd984d8
feat(server): Add mypy-protobuf ( #141 )
Generates .pyi files for protobuf stubs which provide strong typing
information. Very helpful for IDE auto-completion, etc.
2023-03-27 09:25:15 +02:00
Nick Hill
462530c2b0
fix(server): Avoid using try/except to determine kind of AutoModel ( #142 )
2023-03-27 09:23:22 +02:00
OlivierDehaene
ab5fd8cf93
v0.4.1 ( #140 )
2023-03-26 16:37:51 +02:00
OlivierDehaene
678b2f3900
feat(server): cleanup flash neox loading ( #139 )
2023-03-26 16:37:21 +02:00
OlivierDehaene
d6a93fe992
fix(server): fix flash-neox scores warping ( #137 )
2023-03-24 18:21:41 +01:00
OlivierDehaene
05e9a796cc
feat(server): flash neoX ( #133 )
2023-03-24 14:02:14 +01:00
OlivierDehaene
23e1028822
feat(python-client): add CI ( #136 )
2023-03-23 18:13:04 +01:00
OlivierDehaene
5d04525cb9
feat(python-client): release v0.4.0 ( #135 )
2023-03-23 18:07:20 +01:00
lewtun
5e5e9d4bbd
feat: Add note about NVIDIA drivers ( #64 )
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
2023-03-23 18:03:45 +01:00
OlivierDehaene
603e20b5f7
feat(ci): add ci paths ( #134 )
2023-03-23 18:01:30 +01:00
dconathan
7850119055
feat(python-client): add cookies to Client constructors and requests ( #132 )
I have a use case where we need to pass cookies (for auth reasons) to an
internally hosted server.
Note: I couldn't get the client tests to pass - do you need to have an
HF token?
```text
FAILED tests/test_client.py::test_generate - text_generation.errors.BadRequestError: Authorization header is correct, but the token seems invalid
```
2023-03-23 18:01:01 +01:00
OlivierDehaene
a3b7db932f
fix(python-client): relax dependencies ( #129 )
2023-03-16 12:57:07 +01:00
OlivierDehaene
b49dbf2d88
fix(server): use server tokenizer as gt ( #128 )
2023-03-16 12:12:26 +01:00
OlivierDehaene
8ad60b752f
fix(server): add position ids to neox ( #126 )
2023-03-15 13:12:49 +01:00
OlivierDehaene
cbd36aa4d1
fix(server): revert gpt-neox optims ( #123 )
2023-03-13 22:57:08 +01:00
OlivierDehaene
6860ce9c67
feat: add OpenAssistant/oasst-sft-1-pythia-12b to the list of supported models ( #122 )
2023-03-13 20:42:10 +01:00
OlivierDehaene
411d6247f4
v0.4.0 ( #119 )
2023-03-09 16:07:01 +01:00