From e07d0ebc06a7f50441683d2d1599e738a32e50ae Mon Sep 17 00:00:00 2001
From: drbh
Date: Mon, 29 Apr 2024 20:20:30 +0000
Subject: [PATCH] fix: rename header

---
 docs/source/basic_tutorials/visual_language_models.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/docs/source/basic_tutorials/visual_language_models.md b/docs/source/basic_tutorials/visual_language_models.md
index 20bf5759..e804ef09 100644
--- a/docs/source/basic_tutorials/visual_language_models.md
+++ b/docs/source/basic_tutorials/visual_language_models.md
@@ -1,4 +1,4 @@
-# Vision Language Models (VLM)
+# Vision Language Model Inference in TGI
 
 Visual Language Model (VLM) are models that consume both image and text inputs to generate text.
 
@@ -17,7 +17,7 @@ Below are couple of common use cases for vision language models:
 
 ### Hugging Face Hub Python Library
 
-To infer with vision language models through Python, you can use the [`huggingface_hub`](https://pypi.org/project/huggingface-hub/) library. The `InferenceClient` class provides a simple way to interact with the [Inference API](https://huggingface.co/docs/api-inference/index)
+To infer with vision language models through Python, you can use the [`huggingface_hub`](https://pypi.org/project/huggingface-hub/) library. The `InferenceClient` class provides a simple way to interact with the [Inference API](https://huggingface.co/docs/api-inference/index). Images can be passed as URLs or base64-encoded strings. The `InferenceClient` will automatically detect the image format.
 
 ```python
 from huggingface_hub import InferenceClient
 
 client = InferenceClient("http://127.0.0.1:3000")
 image = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
 prompt = f"![]({image})What is this a picture of?\n\n"
 
 for token in client.text_generation(prompt, max_new_tokens=16, stream=True):
     print(token)
 
 # This is a picture of an anthropomorphic rabbit in a space suit.
 ```
 
@@ -31,8 +31,6 @@ for token in client.text_generation(prompt, max_new_tokens=16, stream=True):
-Images can be passed as URLs or base64-encoded strings. The `InferenceClient` will automatically detect the image format.
-
 ```python
 from huggingface_hub import InferenceClient
 import base64
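
The final hunk's trailing context cuts the second example off right after `import base64`, so the base64 flow that the reworded paragraph promises is not visible in the diff itself. Below is a minimal sketch of how that flow can look; it is illustrative, not part of this commit, and it assumes a local TGI endpoint at `http://127.0.0.1:3000` and a hypothetical local image file `rabbit.png`:

```python
from huggingface_hub import InferenceClient
import base64

# Illustrative sketch, not part of this patch. Assumes a TGI server is
# running at http://127.0.0.1:3000 and that "rabbit.png" exists on disk.
client = InferenceClient("http://127.0.0.1:3000")

# Read the local image and base64-encode it, then wrap it in a data URI;
# the server detects the image format from the MIME prefix, per the
# paragraph this patch moves into the intro sentence.
with open("rabbit.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

image = f"data:image/png;base64,{encoded}"
prompt = f"![]({image})What is this a picture of?\n\n"

# Stream the generated tokens, mirroring the URL-based example above.
for token in client.text_generation(prompt, max_new_tokens=16, stream=True):
    print(token)
```

Note that the markdown-style `![](...)` image reference in the prompt is the same convention the URL-based example in the diff uses; only the image source changes from a URL to a data URI.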