Update README with fork info

Update the README with some information about the fork
2025-09-10 20:04:52 +00:00 · 2023-08-09 13:56:39 -07:00 · 2023-08-09 13:56:39 -07:00 · ae7c21af5b
commit ae7c21af5b
parent afd04dc71e
1 changed file with 13 additions and 8 deletions
--- a/README.md
+++ b/README.md
@@ -1,16 +1,17 @@
 <div align="center">
+<h2> DeepInfra's Apache 2.0 fork of Text Generation Inference (TGI) </h2>
  
 ![image](https://github.com/huggingface/text-generation-inference/assets/3841370/38ba1531-ea0d-4851-b31a-a6d4ddc944b0)

 # Text Generation Inference

-<a href="https://github.com/huggingface/text-generation-inference">
-  <img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/huggingface/text-generation-inference?style=social">
+<a href="https://github.com/deepinfra/text-generation-inference">
+  <img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/deepinfra/text-generation-inference?style=social">
 </a>
-<a href="https://github.com/huggingface/text-generation-inference/blob/main/LICENSE">
-  <img alt="License" src="https://img.shields.io/github/license/huggingface/text-generation-inference">
+<a href="https://github.com/deepinfra/text-generation-inference/blob/main/LICENSE">
+  <img alt="License" src="https://img.shields.io/github/license/deepinfra/text-generation-inference">
 </a>
-<a href="https://huggingface.github.io/text-generation-inference">
+<a href="https://deepinfra.github.io/text-generation-inference">
  <img alt="Swagger API documentation" src="https://img.shields.io/badge/API-Swagger-informational">
 </a>
 </div>
@@ -18,6 +19,10 @@
 A Rust, Python and gRPC server for text generation inference. Used in production at [HuggingFace](https://huggingface.co)
 to power LLMs api-inference widgets.

+## Note
+This is a fork of [https://github.com/huggingface/text-generation-inference](https://github.com/huggingface/text-generation-inference)
+before the restrictive license change. We will maintain this fork under the Apache 2.0 license. All contributions are welcome.
+
 ## Table of contents

 - [Features](#features)
@@ -42,7 +47,7 @@ to power LLMs api-inference widgets.
 - Serve the most popular Large Language Models with a simple launcher
 - Tensor Parallelism for faster inference on multiple GPUs
 - Token streaming using Server-Sent Events (SSE)
-- [Continuous batching of incoming requests](https://github.com/huggingface/text-generation-inference/tree/main/router) for increased total throughput
+- [Continuous batching of incoming requests](https://github.com/deepinfra/text-generation-inference/tree/main/router) for increased total throughput
 - Optimized transformers code for inference using [flash-attention](https://github.com/HazyResearch/flash-attention) and [Paged Attention](https://github.com/vllm-project/vllm) on the most popular architectures
 - Quantization with [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323)
 - [Safetensors](https://github.com/huggingface/safetensors) weight loading
@@ -132,7 +137,7 @@ print(text)
 ### API documentation

 You can consult the OpenAPI documentation of the `text-generation-inference` REST API using the `/docs` route.
-The Swagger UI is also available at: [https://huggingface.github.io/text-generation-inference](https://huggingface.github.io/text-generation-inference).
+The Swagger UI is also available at: [https://deepinfra.github.io/text-generation-inference](https://deepinfra.github.io/text-generation-inference).

 ### Using a private or gated model