From 11db3cd3ea015a808f3649b50756d620a9f822e8 Mon Sep 17 00:00:00 2001
From: Merve Noyan
Date: Thu, 24 Aug 2023 13:32:18 +0300
Subject: [PATCH] Update docs/source/basic_tutorials/non_core_models.md

Co-authored-by: Mishig
---
 docs/source/basic_tutorials/non_core_models.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/basic_tutorials/non_core_models.md b/docs/source/basic_tutorials/non_core_models.md
index 0f593571..90f27949 100644
--- a/docs/source/basic_tutorials/non_core_models.md
+++ b/docs/source/basic_tutorials/non_core_models.md
@@ -1,6 +1,6 @@
 # Non-core Model Serving
 
-TGI supports various LLM architectures (see full list [here](./supported_models)). If you wish to serve a model that is not one of the supported models, TGI will fallback to the `transformers` implementation of that model. This means you will be unable to use some of the features introduced by TGI, such as tensor-parallel sharding or flash attention. However, you can still get many benefits of TGI, such as continuous batching or streaming outputs.
+TGI supports various LLM architectures (see full list [here](../supported_models)). If you wish to serve a model that is not one of the supported models, TGI will fallback to the `transformers` implementation of that model. This means you will be unable to use some of the features introduced by TGI, such as tensor-parallel sharding or flash attention. However, you can still get many benefits of TGI, such as continuous batching or streaming outputs.
 
 You can serve these models using Docker like below 👇
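The page being patched ends with "You can serve these models using Docker like below 👇" but the command itself falls outside this hunk. For context, a typical TGI Docker invocation looks like the sketch below; the model ID `gpt2` and the `latest` image tag are placeholders for illustration, not part of this patch.

```shell
# Serve a non-core model with TGI. It falls back to the `transformers`
# implementation, so tensor-parallel sharding and flash attention are
# unavailable, but continuous batching and streaming still work.
model=gpt2          # placeholder: any Hugging Face model ID
volume=$PWD/data    # cache weights on the host between runs

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id $model
```

Once the container is up, the server listens on port 8080 of the host, and generation requests can be sent to its `/generate` endpoint.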