From 0e09eeacfce54ef0c40f790b6909b5fd3f2f77e0 Mon Sep 17 00:00:00 2001
From: Vaibhav Srivastav
Date: Wed, 14 Aug 2024 13:11:25 +0200
Subject: [PATCH] Doc review from Nico. x2

---
 docs/source/basic_tutorials/consuming_tgi.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/source/basic_tutorials/consuming_tgi.md b/docs/source/basic_tutorials/consuming_tgi.md
index 81b3a8bf..6e4ec49c 100644
--- a/docs/source/basic_tutorials/consuming_tgi.md
+++ b/docs/source/basic_tutorials/consuming_tgi.md
@@ -31,6 +31,20 @@ curl localhost:8080/v1/chat/completions \
   -H 'Content-Type: application/json'
 ```
 
+For non-chat use-cases, you can also use the `/generate` and `/generate_stream` routes.
+
+```bash
+curl 127.0.0.1:8080/generate \
+  -X POST \
+  -d '{
+  "inputs":"What is Deep Learning?",
+  "parameters":{
+    "max_new_tokens":20
+  }
+}' \
+  -H 'Content-Type: application/json'
+```
+
 ## Python
 
 ### Inference Client
@@ -46,11 +60,9 @@ pip install huggingface_hub
 ```
 
 You can now use `InferenceClient` the exact same way you would use `OpenAI` client in Python
 ```python
-- from openai import OpenAI
-+ from huggingface_hub import InferenceClient
+from huggingface_hub import InferenceClient
 
-- client = OpenAI(
-+ client = InferenceClient(
+client = InferenceClient(
     base_url="http://localhost:8080/v1/",
 )
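For reference, the `/generate` call this patch documents via curl can also be issued from Python's standard library. A minimal sketch, assuming a TGI server listening on `127.0.0.1:8080` as in the patch; the `build_generate_request` helper name is ours, not part of TGI:

```python
import json
import urllib.request


def build_generate_request(prompt: str, max_new_tokens: int = 20) -> urllib.request.Request:
    """Build a POST request mirroring the curl example for TGI's /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        "http://127.0.0.1:8080/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Sending the request only works against a running TGI instance.
    with urllib.request.urlopen(build_generate_request("What is Deep Learning?")) as resp:
        print(json.loads(resp.read()))
```

The response shape (e.g. a `generated_text` field) depends on the TGI version you run, so the sketch prints the raw JSON rather than assuming a schema.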