Commit Graph

8 Commits

Author SHA1 Message Date
OlivierDehaene
2ac1b55c95 v1.4.1 () 2024-04-24 15:42:59 +03:00
OlivierDehaene
31b5e37f49 chore: add pre-commit () 2024-04-24 15:32:02 +03:00
drbh
55acb86f42 Outlines guided generation ()
This WIP PR starts to add grammar support via outlines. Currently it
supports only very simple regex grammars and does not yet precompile
or cache grammar FSMs.

todo:
- [X] add simple outlines guidance to `NextTokenChooser`
- [X] update protos for grammar
- [X] update generation params API
- [X] constrain simple grammar
- [ ] support parsing more complex grammars into an FSM
- [ ] support all grammar types supported by outlines
- [ ] explore optimizations to avoid recompiling grammars

guided request
```bash
curl -s 'http://localhost:3000/generate' \
--header 'Content-Type: application/json' \
--data-raw '{
    "inputs": "make an email for david: \n",
    "parameters": {
        "max_new_tokens": 6,
        "grammar": "[\\w-]+@([\\w-]+\\.)+[\\w-]+"
    }
}' | jq
```
response
```json
{
  "generated_text": "david@example.com"
}
```

unguided request
```bash
curl -s 'http://localhost:3000/generate' \
--header 'Content-Type: application/json' \
--data '{
    "inputs": "make an email for david: \n",
    "parameters": {
        "max_new_tokens": 6
    }
}' | jq
```
response
```json
{
  "generated_text": "    email = 'david"
}
```
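A quick way to see the difference between the two responses is to check each one against the grammar from the guided request. The sketch below is only an illustration using Python's standard `re` module; the helper name `satisfies_grammar` is hypothetical and is not part of the server or of outlines.

```python
import re

# Grammar copied from the guided request above, with the JSON escaping removed.
GRAMMAR = r"[\w-]+@([\w-]+\.)+[\w-]+"


def satisfies_grammar(text: str, grammar: str = GRAMMAR) -> bool:
    """Return True if the entire string matches the regex grammar."""
    return re.fullmatch(grammar, text) is not None


guided = "david@example.com"      # response to the guided request
unguided = "    email = 'david"   # response to the unguided request

print(satisfies_grammar(guided))    # True
print(satisfies_grammar(unguided))  # False
```

This only validates finished text after the fact. Guided generation instead restricts which tokens can be chosen at each step, so the output matches the grammar by construction.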
2024-04-24 14:57:37 +03:00
OlivierDehaene
f1d8da3ba6 feat(server): add frequency penalty () 2024-04-24 08:43:50 +00:00
regisss
cc744ba426 Add changes from Optimum Habana's TGI folder 2023-12-05 11:12:16 +01:00
Nick Hill
e4b26aa10b
fix(server): avoid errors for very small top_p values ()
See https://github.com/huggingface/transformers/pull/24111

I didn't add validation to the `__init__` method since it's not done for
other values/warpers.
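As a rough illustration of why a very small `top_p` can cause trouble, the sketch below assumes a nucleus filter that sorts tokens by ascending probability and drops those whose cumulative probability is at most `1 - top_p`. Under that assumption, `1 - top_p` rounds to `1.0` for tiny values and every token is dropped unless a guard keeps the most likely ones. The function name `top_p_keep_mask` and the pure-Python structure are hypothetical, and this is not a description of the change in this commit or in the linked PR.

```python
def top_p_keep_mask(probs, top_p, min_tokens_to_keep=1):
    """Return a keep/drop flag per token for nucleus (top-p) filtering.

    Assumed filter shape: tokens are visited in ascending probability order
    and dropped while their cumulative probability is <= 1 - top_p. The final
    guard always keeps the `min_tokens_to_keep` most likely tokens.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i])  # ascending
    keep = [True] * len(probs)
    cumulative = 0.0
    for i in order:
        cumulative += probs[i]
        if cumulative <= 1.0 - top_p:
            keep[i] = False
    # Guard: never filter out every token, even when 1 - top_p rounds to 1.0.
    if min_tokens_to_keep > 0:
        for i in order[-min_tokens_to_keep:]:
            keep[i] = True
    return keep


probs = [0.5, 0.25, 0.25]

# Without the guard, a tiny top_p removes every token.
print(top_p_keep_mask(probs, 1e-17, min_tokens_to_keep=0))  # [False, False, False]

# With the guard, the most likely token survives.
print(top_p_keep_mask(probs, 1e-17, min_tokens_to_keep=1))  # [True, False, False]
```

Keeping at least one token means sampling always has a candidate, which is one way to avoid an error without validating `top_p` at construction time.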
2023-07-04 20:11:33 +02:00
OlivierDehaene
53aa9194c8
fix(server): fix warpers on CPU ()
Closes 
2023-06-20 11:06:10 +02:00
OlivierDehaene
62f91f78ac
feat(server): support vectorized warpers in flash causal lm ()
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
2023-05-26 12:30:27 +02:00