Update consuming_tgi.md

![ChatUI](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/chatui_screen.png)
## Gradio
Gradio has a `ChatInterface` class that makes it easy to build neat UIs for chatbots. Let's take a look at how to create a chatbot in streaming mode using TGI and Gradio. Assume you are serving your model on port 8080.
```python
import gradio as gr
from huggingface_hub import InferenceClient

# initialize the InferenceClient with the local TGI endpoint
client = InferenceClient(model="http://127.0.0.1:8080")

# query the client in streaming mode, yielding the partial response as tokens arrive
def inference(message, history):
    partial_message = ""
    for token in client.text_generation(message, max_new_tokens=20, stream=True):
        partial_message += token
        yield partial_message

gr.ChatInterface(
    inference,
    chatbot=gr.Chatbot(height=300),
    textbox=gr.Textbox(placeholder="Chat with me!", container=False, scale=7),
    description="This is the demo for Gradio UI consuming TGI endpoint with Falcon model.",
    title="Gradio 🤝 TGI",
    examples=["Are tomatoes vegetables?"],
    retry_btn=None,
    undo_btn="Undo",
    clear_btn="Clear",
).queue().launch()
```
The UI looks like this 👇
![Gradio TGI](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/gradio-tgi.png)
You can disable streaming mode by using `return` instead of `yield` in your inference function, as in the sketch below.
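For instance, a minimal non-streaming sketch, assuming the same `client` as in the example above:
```python
# non-streaming variant: return the full generation at once instead of yielding tokens
def inference(message, history):
    return client.text_generation(message, max_new_tokens=20)
```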
You can read more about how to customize a `ChatInterface` [here](https://www.gradio.app/guides/creating-a-chatbot-fast).
## API documentation
You can consult the OpenAPI documentation of the `text-generation-inference` REST API using the `/docs` route. The Swagger UI is also available [here](https://huggingface.github.io/text-generation-inference).
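As a quick sketch of what the REST API looks like, you can also call a route such as `/generate` directly (assuming the same local server on port 8080 as above):
```python
import requests

# POST a prompt to the TGI /generate route (server assumed on port 8080, as above)
response = requests.post(
    "http://127.0.0.1:8080/generate",
    json={"inputs": "Are tomatoes vegetables?", "parameters": {"max_new_tokens": 20}},
    headers={"Content-Type": "application/json"},
)
print(response.json())  # e.g. {"generated_text": "..."}
```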