From 1d37a6a06a3f16cea1fdfa1d3dbf0c7030b7a5d9 Mon Sep 17 00:00:00 2001
From: Vaibhav Srivastav
Date: Tue, 13 Aug 2024 16:17:32 +0200
Subject: [PATCH] add info about OpenAI client.

---
 docs/source/basic_tutorials/consuming_tgi.md | 27 ++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/docs/source/basic_tutorials/consuming_tgi.md b/docs/source/basic_tutorials/consuming_tgi.md
index cda1f5ab..1a9a4e8d 100644
--- a/docs/source/basic_tutorials/consuming_tgi.md
+++ b/docs/source/basic_tutorials/consuming_tgi.md
@@ -29,6 +29,33 @@ curl localhost:3000/v1/chat/completions \
 
 You can update the `stream` parameter to `false` to get a non-streaming response.
 
+## OpenAI Client
+
+You can directly use the OpenAI Python or JS client to interact with TGI.
+
+```python
+from openai import OpenAI
+
+# init the client but point it to TGI
+client = OpenAI(
+    base_url="http://localhost:3000/v1/",
+    api_key="-"
+)
+
+chat_completion = client.chat.completions.create(
+    model="tgi",
+    messages=[
+        {"role": "system", "content": "You are a helpful assistant."},
+        {"role": "user", "content": "What is deep learning?"}
+    ],
+    stream=True
+)
+
+# iterate over and print the stream
+for message in chat_completion:
+    print(message)
+```
+
 ## Inference Client
 
 [`huggingface-hub`](https://huggingface.co/docs/huggingface_hub/main/en/index) is a Python library to interact with the Hugging Face Hub, including its endpoints. It provides a nice high-level class, [`~huggingface_hub.InferenceClient`], which makes it easy to make calls to a TGI endpoint. `InferenceClient` also takes care of parameter validation and provides a simple-to-use interface.
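The patch notes that setting `stream` to `false` yields a non-streaming response. A minimal sketch of that variant, assuming the same local TGI endpoint on port 3000 and the placeholder API key used above, would look like this:

```python
from openai import OpenAI

# point the client at the local TGI server; the API key is a placeholder
client = OpenAI(
    base_url="http://localhost:3000/v1/",
    api_key="-"
)

# with stream=False the full response arrives as a single object
chat_completion = client.chat.completions.create(
    model="tgi",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is deep learning?"}
    ],
    stream=False
)

# the generated text lives on the first choice
print(chat_completion.choices[0].message.content)
```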
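The trailing context of the diff introduces [`~huggingface_hub.InferenceClient`]. A minimal sketch of calling the same local endpoint through it, assuming a recent `huggingface-hub` release where the `base_url` argument and the `chat_completion` method are available:

```python
from huggingface_hub import InferenceClient

# point the client at the local TGI server
client = InferenceClient(base_url="http://localhost:3000")

# chat_completion mirrors the OpenAI-style messages API
output = client.chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is deep learning?"}
    ],
    max_tokens=200,
)

# the generated text lives on the first choice
print(output.choices[0].message.content)
```

Because `InferenceClient` handles parameter validation client-side, malformed arguments fail before a request is ever sent to the server.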