Small rearrangement in index page

2025-09-10 20:04:52 +00:00 · 2023-08-09 16:37:02 +02:00 · 2023-08-09 16:37:02 +02:00 · ec592d550e
commit ec592d550e
parent 862db1be0c
1 changed file with 3 additions and 4 deletions
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -1,8 +1,8 @@
 # Text Generation Inference

-![Text Generation Inference](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/TGI.png)
+Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.

-Text Generation Inference (TGI) is built for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5. 
+![Text Generation Inference](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/TGI.png)

 Text Generation Inference implements many optimizations and features, such as:

@@ -11,7 +11,7 @@ Text Generation Inference implements many optimizations and features, such as:
 - Tensor Parallelism for faster inference on multiple GPUs
 - Token streaming using Server-Sent Events (SSE)
 - Continuous batching of incoming requests for increased total throughput
-- Optimized transformers code for inference using [flash-attention](https://github.com/HazyResearch/flash-attention) and [Paged Attention](https://github.com/vllm-project/vllm) on the most popular architectures
+- Optimized transformers code for inference using [Flash Attention](https://github.com/HazyResearch/flash-attention) and [Paged Attention](https://github.com/vllm-project/vllm) on the most popular architectures
 - Quantization with [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323)
 - [Safetensors](https://github.com/huggingface/safetensors) weight loading
 - Watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226)
@@ -19,7 +19,6 @@ Text Generation Inference implements many optimizations and features, such as:
 - Stop sequences
 - Log probabilities

-TODO: Add something more visual here

 Text Generation Inference is used in production by multiple projects, such as: