From 21701d4b44ad2becf2b2c223475b0243c2c87a8d Mon Sep 17 00:00:00 2001 From: Merve Noyan Date: Wed, 9 Aug 2023 16:04:37 +0300 Subject: [PATCH] Added note on TP sharding --- docs/source/basic_tutorials/preparing_model.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/basic_tutorials/preparing_model.md b/docs/source/basic_tutorials/preparing_model.md index 0d152581..78dbee82 100644 --- a/docs/source/basic_tutorials/preparing_model.md +++ b/docs/source/basic_tutorials/preparing_model.md @@ -16,4 +16,4 @@ We recommend using `dynamic` RoPE scaling. ## Safetensors -[Safetensors](https://github.com/huggingface/safetensors) is a fast and safe persistence format for deep learning models. TGI supports `safetensors` model loading under the hood. By default, given a repository with `safetensors` and `pytorch` weights, TGI will always load `safetensors`. If there's no `pytorch` weights, TGI will convert the weights to `safetensors` format. +[Safetensors](https://github.com/huggingface/safetensors) is a fast and safe persistence format for deep learning models, and is required for tensor parallelism. TGI supports `safetensors` model loading under the hood. By default, given a repository with `safetensors` and `pytorch` weights, TGI will always load `safetensors`. If there's no `pytorch` weights, TGI will convert the weights to `safetensors` format.