diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 8ee20eb0..60834860 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -15,4 +15,6 @@
     title: Preparing Model for Serving
   - local: basic_tutorials/gated_model_access
     title: Serving Private & Gated Models
+  - local: basic_tutorials/using_cli
+    title: Using TGI through CLI
   title: Tutorials
diff --git a/docs/source/basic_tutorials/preparing_model.md b/docs/source/basic_tutorials/preparing_model.md
index 0f089e56..65a2a197 100644
--- a/docs/source/basic_tutorials/preparing_model.md
+++ b/docs/source/basic_tutorials/preparing_model.md
@@ -6,6 +6,7 @@ Text Generation Inference improves the model in several aspects.
 
 TGI supports [bits-and-bytes](https://github.com/TimDettmers/bitsandbytes#bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323) quantization. To speed up inference with quantization, simply set `quantize` flag to `bitsandbytes` or `gptq` depending on the quantization technique you wish to use. When using GPT-Q quantization, you need to point to one of the models [here](https://huggingface.co/models?search=gptq).
 
+
 ## RoPE Scaling
 
 RoPE scaling can be used to increase the sequence length of the model during the inference time without necessarily fine-tuning it. To enable RoPE scaling, simply pass `--rope-scaling`, `--max-input-length` and `--rope-factors` flags when running through CLI. `--rope-scaling` can take the values `linear` or `dynamic`. If your model is not fine-tuned to a longer sequence length, use `dynamic`. `--rope-factor` is the ratio between the intended max sequence length and the model's original max sequence length. Make sure to pass `--max-input-length` to provide maximum input length for extension.
diff --git a/docs/source/basic_tutorials/using_cli.md b/docs/source/basic_tutorials/using_cli.md
new file mode 100644
index 00000000..710a7a61
--- /dev/null
+++ b/docs/source/basic_tutorials/using_cli.md
@@ -0,0 +1,41 @@
+# Using TGI through CLI
+
+You can use the TGI command-line interface (CLI) to download weights, serve and quantize models, or get information on serving parameters.
+
+`text-generation-server` lets you download the model with the `download-weights` command like below 👇
+
+```shell
+text-generation-server download-weights MODEL_HUB_ID
+```
+
+You can also use it to quantize models like below 👇
+
+```shell
+text-generation-server quantize MODEL_HUB_ID OUTPUT_DIR
+```
+
+You can use `text-generation-launcher` to serve models.
+
+```shell
+text-generation-launcher --model-id MODEL_HUB_ID --port 8080
+```
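+
+You can combine options by passing more flags to the same command. As a minimal sketch (the model ID here is a placeholder, and you should confirm with the help command below that your version supports these flags), the following serves a model with bits-and-bytes quantization enabled 👇
+
+```shell
+text-generation-launcher --model-id MODEL_HUB_ID --quantize bitsandbytes --port 8080
+```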
+
+There are many options and parameters you can pass to `text-generation-launcher`. The documentation for the CLI is kept intentionally minimal and relies on self-generated documentation, which you can view by running
+
+```shell
+text-generation-launcher --help
+```
+
+You can also find this documentation hosted in this [Swagger UI](https://huggingface.github.io/text-generation-inference/).
+
+The same documentation is available for `text-generation-server`.
+
+```shell
+text-generation-server --help
+```
diff --git a/docs/source/installation.md b/docs/source/installation.md
index f4a8162f..0310cf7f 100644
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -19,42 +19,6 @@ If you would like to serve models with custom kernels, run
 BUILD_EXTENSIONS=True make install
 ```
 
-## Running CLI
-
-After installation, you will be able to use `text-generation-server` and `text-generation-launcher`.
-
-`text-generation-server` lets you download the model with `download-weights` command like below 👇
-
-```shell
-text-generation-server download-weights MODEL_HUB_ID
-```
-
-You can also use it to quantize models like below 👇
-
-```shell
-text-generation-server quantize MODEL_HUB_ID OUTPUT_DIR
-```
-
-You can use `text-generation-launcher` to serve models.
-
-```shell
-text-generation-launcher --model-id MODEL_HUB_ID --port 8080
-```
-
-There are many options and parameters you can pass to `text-generation-launcher`. The documentation for CLI is kept minimal and intended to rely on self-generating documentation, which can be found by running
-
-```shell
-text-generation-launcher --help
-```
-
-You can also find it hosted in this [Swagger UI](https://huggingface.github.io/text-generation-inference/).
-
-Same documentation can be found for `text-generation-server`.
-
-```shell
-text-generation-server --help
-```
-
 ## Local Installation from Source
 
 Before you start, you will need to setup your environment, and install Text Generation Inference. Text Generation Inference is tested on **Python 3.9+**.