Mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-09-12 04:44:52 +00:00)
Improve the Consuming TGI docs.

parent cd9b15d17f
commit 8de10acdcf
@@ -819,13 +819,6 @@
           "example": "1.0",
           "nullable": true
         },
-        "guideline": {
-          "type": "string",
-          "description": "A guideline to be used in the chat_template",
-          "default": "null",
-          "example": "null",
-          "nullable": true
-        },
         "logit_bias": {
           "type": "array",
           "items": {
@@ -1824,8 +1817,7 @@
       "type": "object",
       "required": [
         "finish_reason",
-        "generated_tokens",
-        "input_length"
+        "generated_tokens"
       ],
       "properties": {
         "finish_reason": {
@@ -1837,12 +1829,6 @@
           "example": 1,
           "minimum": 0
         },
-        "input_length": {
-          "type": "integer",
-          "format": "int32",
-          "example": 1,
-          "minimum": 0
-        },
         "seed": {
           "type": "integer",
           "format": "int64",
@@ -2094,4 +2080,4 @@
       "description": "Hugging Face Text Generation Inference API"
     }
   ]
 }
@@ -1,18 +1,33 @@
 # Consuming Text Generation Inference
 
-There are many ways you can consume Text Generation Inference server in your applications. After launching, you can use the `/generate` route and make a `POST` request to get results from the server. You can also use the `/generate_stream` route if you want TGI to return a stream of tokens. You can make the requests using the tool of your preference, such as curl, Python or TypeScrpt. For a final end-to-end experience, we also open-sourced ChatUI, a chat interface for open-source models.
+There are many ways to consume Text Generation Inference (TGI) server in your applications. After launching the server, you can use the [Messages API](https://huggingface.co/docs/text-generation-inference/en/messages_api) `/v1/chat/completions` route and make a `POST` request to get results from the server. You can also pass `"stream": true` to the call if you want TGI to return a stream of tokens. You can make the requests using the tool of your preference, such as curl, Python or TypeScript. For a final end-to-end experience, we have also open-sourced ChatUI, a chat interface for open-source models.
 
 ## curl
 
-After the launch, you can query the model using either the `/generate` or `/generate_stream` routes:
+After a successful server launch, you can query the model using the `v1/chat/completions` route to get OpenAI Chat Completion API spec compliant responses:
 
 ```bash
-curl 127.0.0.1:8080/generate \
+curl localhost:3000/v1/chat/completions \
     -X POST \
-    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -d '{
+  "model": "tgi",
+  "messages": [
+    {
+      "role": "system",
+      "content": "You are a helpful assistant."
+    },
+    {
+      "role": "user",
+      "content": "What is deep learning?"
+    }
+  ],
+  "stream": true,
+  "max_tokens": 20
+}' \
     -H 'Content-Type: application/json'
 ```
 
+You can update the `stream` parameter to `false` to get a non-streaming response.
+
 ## Inference Client
 
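The new curl example added in the hunk above translates directly to Python. Below is a minimal sketch using only the standard library; it assumes (as in the docs) a TGI server listening on `localhost:3000`, and `build_chat_payload`/`chat_completion` are hypothetical helper names, not part of TGI itself:

```python
# Sketch: call TGI's /v1/chat/completions from Python with the stdlib only.
# Assumes a TGI server on localhost:3000, as in the curl example.
import json
import urllib.request


def build_chat_payload(user_message, stream=False, max_tokens=20):
    """Build the same Messages API payload shown in the curl example."""
    return {
        "model": "tgi",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": stream,
        "max_tokens": max_tokens,
    }


def chat_completion(base_url, user_message):
    """POST a non-streaming chat completion and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example (requires a running server):
# reply = chat_completion("http://localhost:3000", "What is deep learning?")
# print(reply["choices"][0]["message"]["content"])
```

Setting `stream=True` in the payload corresponds to the `"stream": true` field in the curl call; handling the resulting server-sent events would need extra parsing not shown here.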