From 1e828f33c0442aa97b426deb9909fc677045e616 Mon Sep 17 00:00:00 2001
From: Merve Noyan
Date: Wed, 23 Aug 2023 15:45:12 +0300
Subject: [PATCH] Update docs/source/conceptual/tensor_parallelism.md

Co-authored-by: Omar Sanseviero
---
 docs/source/conceptual/tensor_parallelism.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/conceptual/tensor_parallelism.md b/docs/source/conceptual/tensor_parallelism.md
index 7cf64a59..9aceeb7c 100644
--- a/docs/source/conceptual/tensor_parallelism.md
+++ b/docs/source/conceptual/tensor_parallelism.md
@@ -1,6 +1,6 @@
 # Tensor Parallelism
 
-Tensor parallelism (also called horizontal model parallelism) is a technique used to fit a large model in multiple GPUs. Intermediate outputs between ranks are sent and received from one rank to another in a synchronous or asynchronous manner. When multiplying input with weights for inference, multiplying input with weights directly is equivalent to dividing the weight matrix column-wise, multiplying each column with input separately, and then concatenating the separate outputs like below 👇
+Tensor parallelism is a technique used to fit a large model on multiple GPUs. Intermediate outputs are exchanged between GPUs synchronously or asynchronously. For example, multiplying the input tensor by the first weight tensor is equivalent to splitting the weight tensor column-wise, multiplying the input by each column separately, and then concatenating the separate outputs, like below 👇
 
 ![Image courtesy of Anton Lozkhov](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/tgi/TP.png)
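For reference, a minimal NumPy sketch of the column-split equivalence the updated paragraph describes (not part of the patch; the two-shard split and the variable names are illustrative, not TGI's actual implementation):

```python
import numpy as np

# Sketch of the column-wise split described in the patched paragraph:
# multiplying the input by the full weight matrix is equivalent to
# multiplying it by each column shard and concatenating the results.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # input: batch of 4, hidden size 8
W = rng.standard_normal((8, 6))   # weight matrix mapping 8 -> 6

# Split W column-wise into two shards, as two GPUs would each hold one.
W_shards = np.split(W, 2, axis=1)

# Each "GPU" computes its partial output independently...
partial_outputs = [x @ shard for shard in W_shards]

# ...and the partial outputs are concatenated (an all-gather in practice).
y_parallel = np.concatenate(partial_outputs, axis=1)

assert np.allclose(x @ W, y_parallel)
```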