mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-06-14 13:22:07 +00:00)
Added index.md and other initial files
This commit is contained in:
parent dc631b5be5
commit 41bd0e4af1
0 docs/source/basic_tutorials/installation.md Normal file
14 docs/source/index.md Normal file
@ -0,0 +1,14 @@
# Text Generation Inference
Text Generation Inference (TGI) is an open-source, purpose-built solution for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation using Tensor Parallelism and dynamic batching for the most popular open-source LLMs, including StarCoder, BLOOM, GPT-NeoX, Llama, and T5. TGI implements optimizations for all supported model architectures, including:
- Tensor Parallelism and custom CUDA kernels
- Optimized transformers code for inference using Flash Attention and Paged Attention on the most popular architectures
- Quantization with bitsandbytes or GPTQ
- Continuous batching of incoming requests for increased total throughput
- Accelerated weight loading (start-up time) with safetensors
- Logits warpers (temperature scaling, top-k, repetition penalty, ...)
- Watermarking with *A Watermark for Large Language Models*
- Stop sequences, log probabilities
- Token streaming using Server-Sent Events (SSE)
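The logits warpers listed above can be sketched in plain Python. The function below is an illustrative toy (its name and signature are not TGI's actual API): it applies a repetition penalty to already-generated tokens, scales by temperature, filters to the top-k logits, and returns a normalized distribution.

```python
import math

def warp_logits(logits, generated_ids, temperature=0.7, top_k=2,
                repetition_penalty=1.2):
    """Toy logits-warping pipeline: repetition penalty -> temperature
    scaling -> top-k filtering -> softmax. Not TGI's real implementation."""
    logits = list(logits)
    # Repetition penalty: dampen tokens that were already generated.
    for tok in set(generated_ids):
        if logits[tok] > 0:
            logits[tok] /= repetition_penalty
        else:
            logits[tok] *= repetition_penalty
    # Temperature scaling: <1 sharpens the distribution, >1 flattens it.
    logits = [l / temperature for l in logits]
    # Top-k filtering: keep only the k highest logits.
    threshold = sorted(logits, reverse=True)[top_k - 1]
    logits = [l if l >= threshold else float("-inf") for l in logits]
    # Softmax over the surviving logits (filtered entries get probability 0).
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Real servers apply these warpers per request, per decoding step, before sampling the next token id.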
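Token streaming over Server-Sent Events (SSE) means the client reads `data:` lines off a long-lived HTTP response, one JSON payload per generated token. A minimal stdlib-only parser is sketched below; the `"token"` field name in the example is illustrative, not TGI's exact response schema.

```python
import json

def parse_sse_stream(lines):
    """Yield the JSON payload of each `data:` line in an SSE stream,
    skipping blank keep-alive lines and other SSE fields."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())
```

A client would feed this generator the decoded lines of the streaming HTTP response and append each token to the displayed text as it arrives.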
0 docs/source/quicktour.md Normal file
0 docs/source/supported_models.md Normal file