mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-09-11 12:24:53 +00:00)
feat: add ie update to message docs

parent 0e97af456a · commit 02912ad273
@@ -4,6 +4,15 @@ Text Generation Inference (TGI) now supports the Messages API, which is fully co

> **Note:** The Messages API is supported from TGI version 1.4.0 and above. Ensure you are using a compatible version to access this feature.
#### Table of Contents

- [Making a Request](#making-a-request)
- [Streaming](#streaming)
- [Synchronous](#synchronous)
- [Hugging Face Inference Endpoints](#hugging-face-inference-endpoints)
- [Cloud Providers](#cloud-providers)
  - [Amazon SageMaker](#amazon-sagemaker)
## Making a Request

You can make a request to TGI's Messages API using `curl`. Here's an example:
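The concrete curl example is elided in this diff; a request of the following shape should work, since the Messages API mirrors OpenAI's `/v1/chat/completions` route. The address `localhost:3000` is an assumption — point it at wherever your TGI server is listening:

```shell
# POST an OpenAI-style chat payload to TGI's /v1/chat/completions route.
# localhost:3000 is an assumption -- replace with your TGI server address.
curl localhost:3000/v1/chat/completions \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
  "model": "tgi",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is deep learning?"}
  ],
  "stream": true,
  "max_tokens": 20
}'
```

With `"stream": true` the server responds with server-sent events; set it to `false` to receive a single JSON completion instead.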
@@ -81,6 +90,37 @@ chat_completion = client.chat.completions.create(

print(chat_completion)
```
## Hugging Face Inference Endpoints

TGI is now integrated with [Inference Endpoints](https://huggingface.co/inference-endpoints/dedicated) and can be easily accessed with only a few lines of code. Here's an example of how to use IE with TGI using OpenAI's Python client library:

> **Note:** Make sure to replace `base_url` with your endpoint URL and to include `v1/` at the end of the URL. The `api_key` should be replaced with your Hugging Face API key.

```python
from openai import OpenAI

# init the client but point it to TGI
client = OpenAI(
    # replace with your endpoint url, make sure to include "v1/" at the end
    base_url="https://vlzz10eq3fol3429.us-east-1.aws.endpoints.huggingface.cloud/v1/",
    # replace with your API key
    api_key="hf_XXX"
)

chat_completion = client.chat.completions.create(
    model="tgi",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is deep learning?"}
    ],
    stream=True
)

# iterate and print stream
for message in chat_completion:
    print(message.choices[0].delta.content, end="")
```
## Cloud Providers

TGI can be deployed on various cloud providers for scalable and robust text generation. One such provider is Amazon SageMaker, which has recently added support for TGI. Here's how you can deploy TGI on Amazon SageMaker:
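The deployment steps themselves fall outside this diff. A minimal sketch using the `sagemaker` Python SDK is shown below; it assumes valid AWS credentials and a SageMaker execution role, and the model id, container version, and instance type are all placeholder assumptions to adjust for your use case:

```python
# Sketch only: assumes AWS credentials and a SageMaker execution role are configured.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or an explicit IAM role ARN

# resolve the TGI (Hugging Face LLM) container image for the active region;
# the version pin is an assumption
llm_image = get_huggingface_llm_image_uri("huggingface", version="1.4.0")

model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "HuggingFaceH4/zephyr-7b-beta",  # placeholder model id
        "MAX_INPUT_LENGTH": "2048",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

# deploy behind a real-time endpoint; the instance type is an assumption
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)
```

Once deployed, the endpoint serves the same OpenAI-compatible Messages API shown above, so the OpenAI client snippet works against it unchanged apart from the URL.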