mirror of https://github.com/huggingface/text-generation-inference.git (synced 2025-09-10 20:04:52 +00:00)
Update docs/source/conceptual/tensor_parallelism.md
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
parent 11d5c603ee
commit bb8c24f5b7
@@ -8,6 +8,6 @@ In TGI, tensor parallelism is implemented under the hood by sharding weights and

 <Tip warning={true}>

-Tensor Parallelism only works for model with custom kernels.
+Tensor Parallelism only works for model officially supported, it will not work when falling back on `transformers`.

 </Tip>