Update docs/source/conceptual/tensor_parallelism.md

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2025-09-10 20:04:52 +00:00 · 2023-08-23 15:45:24 +03:00 · 0af0315b78
commit 0af0315b78
parent 1e828f33c0
1 changed file with 1 addition and 1 deletion
--- a/docs/source/conceptual/tensor_parallelism.md
+++ b/docs/source/conceptual/tensor_parallelism.md
@@ -4,7 +4,7 @@ Tensor parallelism is a technique used to fit a large model in multiple GPUs.  I

 ![Image courtesy of Anton Lozkhov](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/tgi/TP.png)

-In TGI, tensor parallelism is implemented under the hood by sharding weights and placing them in different ranks. The matrix multiplications then take place in different ranks and are then gathered into a single tensor. 
+In TGI, tensor parallelism is implemented under the hood by sharding weights and placing them in different GPUs. The matrix multiplications then take place in different GPUs and are then gathered into a single tensor. 
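
The shard, multiply, gather flow described above can be illustrated with a minimal sketch. This is not TGI's implementation: it simulates the GPUs with plain Python lists, assumes a column-wise split of the weight matrix, and every function name (`matmul`, `shard_columns`, `gather_columns`, `tensor_parallel_matmul`) is hypothetical and chosen only for illustration.

```python
# Simulated column-wise tensor parallelism in pure Python.
# Assumption: no real GPUs or communication; each "GPU" is just a list entry.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def shard_columns(weight, num_shards):
    """Split a weight matrix into equal column blocks, one per simulated GPU."""
    cols = len(weight[0])
    assert cols % num_shards == 0, "columns must divide evenly across shards"
    width = cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in weight]
            for i in range(num_shards)]

def gather_columns(partials):
    """Concatenate per-shard outputs along the column axis into one matrix."""
    return [sum((p[r] for p in partials), []) for r in range(len(partials[0]))]

def tensor_parallel_matmul(x, weight, num_shards):
    """Compute x @ weight by multiplying against each shard, then gathering."""
    shards = shard_columns(weight, num_shards)
    # In a real setup each of these products would run on a different GPU.
    partials = [matmul(x, shard) for shard in shards]
    return gather_columns(partials)

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
# The sharded computation matches the unsharded one.
assert tensor_parallel_matmul(x, w, 2) == matmul(x, w)
```

Splitting by columns is only one option. A row-wise split would instead produce partial sums that have to be added together rather than concatenated.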

 <Tip warning={true}>