mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-11 04:14:52 +00:00
Update docs/source/conceptual/tensor_parallelism.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
This commit is contained in:
parent
b23ad5d1e4
commit
ecaa9d6f8e
@ -7,7 +7,7 @@ Tensor parallelism is a technique used to fit a large model in multiple GPUs. Fo
|
|||||||
|
|
||||||
<Tip warning={true}>
|
<Tip warning={true}>
|
||||||
|
|
||||||
Tensor Parallelism only works for models officially supported, it will not work when falling back on `transformers`. You can get more information about unsupported models [here](./basic_tutorials/non_core_models.md).
|
Tensor Parallelism only works for [models officially supported](../supported_models), it will not work when falling back to `transformers`. You can get more information about unsupported models [here](../basic_tutorials/non_core_models).
|
||||||
|
|
||||||
</Tip>
|
</Tip>
|
||||||
|
|
||||||
|
Loading…
Reference in New Issue
Block a user