From 0e09eeacfce54ef0c40f790b6909b5fd3f2f77e0 Mon Sep 17 00:00:00 2001
From: Vaibhav Srivastav
Date: Wed, 14 Aug 2024 13:11:25 +0200
Subject: [PATCH] Doc review from Nico. x2

---
 docs/source/basic_tutorials/consuming_tgi.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/source/basic_tutorials/consuming_tgi.md b/docs/source/basic_tutorials/consuming_tgi.md
index 81b3a8bf..6e4ec49c 100644
--- a/docs/source/basic_tutorials/consuming_tgi.md
+++ b/docs/source/basic_tutorials/consuming_tgi.md
@@ -31,6 +31,20 @@ curl localhost:8080/v1/chat/completions \
   -H 'Content-Type: application/json'
 ```
 
+For non-chat use-cases, you can also use the `/generate` and `/generate_stream` routes.
+
+```bash
+curl 127.0.0.1:8080/generate \
+  -X POST \
+  -d '{
+  "inputs":"What is Deep Learning?",
+  "parameters":{
+    "max_new_tokens":20
+  }
+}' \
+  -H 'Content-Type: application/json'
+```
+
 ## Python
 
 ### Inference Client
@@ -46,11 +60,9 @@ pip install huggingface_hub
 ```
 
 You can now use `InferenceClient` the exact same way you would use `OpenAI` client in Python
 ```python
-- from openai import OpenAI
-+ from huggingface_hub import InferenceClient
+from huggingface_hub import InferenceClient
 
-- client = OpenAI(
-+ client = InferenceClient(
+client = InferenceClient(
     base_url="http://localhost:8080/v1/",
 )
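For reference, the `/generate` call this patch documents via curl can also be issued from Python's standard library. A minimal sketch, assuming a TGI server listening on `127.0.0.1:8080` as in the patch; the `build_generate_request` helper name is ours, not part of TGI:

```python
import json
import urllib.request


def build_generate_request(prompt: str, max_new_tokens: int = 20) -> urllib.request.Request:
    """Build a POST request mirroring the curl example for TGI's /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        "http://127.0.0.1:8080/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Sending the request only works against a running TGI instance.
    with urllib.request.urlopen(build_generate_request("What is Deep Learning?")) as resp:
        print(json.loads(resp.read()))
```

The response shape (e.g. a `generated_text` field) depends on the TGI version you run, so the sketch prints the raw JSON rather than assuming a schema.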