From 41bd0e4af1f75a0be300e3c33f31f0a07ab03140 Mon Sep 17 00:00:00 2001
From: Merve Noyan
Date: Mon, 31 Jul 2023 15:56:29 +0300
Subject: [PATCH] Added index.md and other initial files

---
 docs/source/basic_tutorials/installation.md |  0
 docs/source/index.md                        | 14 ++++++++++++++
 docs/source/quicktour.md                    |  0
 docs/source/supported_models.md             |  0
 4 files changed, 14 insertions(+)
 create mode 100644 docs/source/basic_tutorials/installation.md
 create mode 100644 docs/source/index.md
 create mode 100644 docs/source/quicktour.md
 create mode 100644 docs/source/supported_models.md

diff --git a/docs/source/basic_tutorials/installation.md b/docs/source/basic_tutorials/installation.md
new file mode 100644
index 00000000..e69de29b
diff --git a/docs/source/index.md b/docs/source/index.md
new file mode 100644
index 00000000..6815f9de
--- /dev/null
+++ b/docs/source/index.md
@@ -0,0 +1,14 @@
+# Text Generation Inference
+
+Text Generation Inference (TGI) is an open-source, purpose-built solution for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation using Tensor Parallelism and dynamic batching for the most popular open-source LLMs, including StarCoder, BLOOM, GPT-NeoX, Llama, and T5. Text Generation Inference implements optimizations for all supported model architectures, including:
+
+- Tensor Parallelism and custom CUDA kernels
+- Optimized transformers code for inference using Flash Attention and Paged Attention on the most popular architectures
+- Quantization with bitsandbytes or GPTQ
+- Continuous batching of incoming requests for increased total throughput
+- Accelerated weight loading (start-up time) with safetensors
+- Logits warpers (temperature scaling, top-k, repetition penalty, ...)
+- Watermarking with A Watermark for Large Language Models
+- Stop sequences, log probabilities
+- Token streaming using Server-Sent Events (SSE)
+
diff --git a/docs/source/quicktour.md b/docs/source/quicktour.md
new file mode 100644
index 00000000..e69de29b
diff --git a/docs/source/supported_models.md b/docs/source/supported_models.md
new file mode 100644
index 00000000..e69de29b
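
To make the feature list in the new index.md more concrete, here is a minimal sketch of how a client might exercise the logits warpers (temperature scaling, top-k, repetition penalty), stop sequences, and SSE token streaming described above against a running TGI server. The host/port, endpoint paths (`/generate`, `/generate_stream`), and payload field names are assumptions for illustration only and should be checked against the actual API documentation:

```python
# Hypothetical client sketch for a locally running TGI server.
# Endpoint paths, payload shape, and response fields are assumed, not confirmed here.
import json
import requests

BASE_URL = "http://localhost:8080"  # assumed address of a local TGI container

payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {
        "max_new_tokens": 64,
        "temperature": 0.7,          # temperature scaling (logits warper)
        "top_k": 50,                 # top-k sampling (logits warper)
        "repetition_penalty": 1.2,   # repetition penalty (logits warper)
        "stop": ["\n\n"],            # stop sequences
    },
}

# Non-streaming request: a single JSON response with the full generated text.
resp = requests.post(f"{BASE_URL}/generate", json=payload, timeout=60)
print(resp.json())

# Streaming request: tokens arrive incrementally as Server-Sent Events
# ("data: {...}" lines), which we parse and print as they come in.
with requests.post(
    f"{BASE_URL}/generate_stream", json=payload, stream=True, timeout=60
) as stream:
    for line in stream.iter_lines():
        if line and line.startswith(b"data:"):
            event = json.loads(line[len(b"data:"):])
            print(event.get("token", {}).get("text", ""), end="", flush=True)
```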