mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-10 11:54:52 +00:00
Added note on TP sharding
parent 21ecdbc50f
commit 21701d4b44
@@ -16,4 +16,4 @@ We recommend using `dynamic` RoPE scaling.
 
 ## Safetensors
 
-[Safetensors](https://github.com/huggingface/safetensors) is a fast and safe persistence format for deep learning models. TGI supports `safetensors` model loading under the hood. By default, given a repository with `safetensors` and `pytorch` weights, TGI will always load `safetensors`. If there's no `pytorch` weights, TGI will convert the weights to `safetensors` format.
+[Safetensors](https://github.com/huggingface/safetensors) is a fast and safe persistence format for deep learning models, and is required for tensor parallelism. TGI supports `safetensors` model loading under the hood. By default, given a repository with `safetensors` and `pytorch` weights, TGI will always load `safetensors`. If there's no `pytorch` weights, TGI will convert the weights to `safetensors` format.
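
The loading preference described in the changed paragraph can be sketched roughly as follows. This is a minimal illustration, not TGI's actual code: `select_weight_files` is a hypothetical helper, the file extensions are assumptions, and the last sentence of the paragraph is read here as "when only PyTorch weights are present, they are converted".

```python
from typing import List, Tuple


def select_weight_files(filenames: List[str]) -> Tuple[str, List[str]]:
    """Hypothetical helper (not part of TGI): choose which weight files to use.

    Returns ("load", files) when safetensors weights exist, or
    ("convert", files) when only PyTorch weights exist and would
    need converting to safetensors first.
    """
    # Assumed extensions: .safetensors for safetensors, .bin/.pt for PyTorch.
    safetensors = sorted(f for f in filenames if f.endswith(".safetensors"))
    pytorch = sorted(f for f in filenames if f.endswith((".bin", ".pt")))

    if safetensors:
        # Safetensors weights always win when both formats are present.
        return "load", safetensors
    if pytorch:
        # Only PyTorch weights: these would be converted before loading.
        return "convert", pytorch
    raise FileNotFoundError("no weight files found")


if __name__ == "__main__":
    print(select_weight_files(["model.safetensors", "pytorch_model.bin"]))
    print(select_weight_files(["pytorch_model.bin"]))
```

Returning the action alongside the file list keeps the decision separate from the actual loading or conversion step, which is outside the scope of this sketch.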