diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 52e2af02..b6de35cf 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -1,8 +1,8 @@
 - sections:
   - local: index
     title: Text Generation Inference
-  - local: docker_launch
-    title: Launching with Docker
+  - local: installation_launching
+    title: Installation and Launching
   - local: supported_models
     title: Supported Models and Hardware
   title: Getting started
@@ -13,4 +13,6 @@
     title: Consuming TGI
   - local: basic_tutorials/preparing_model
     title: Preparing Model for Serving
+  - local: basic_tutorials/using_cli
+    title: Using TGI through CLI
   title: Tutorials
diff --git a/docs/source/basic_tutorials/preparing_model.md b/docs/source/basic_tutorials/preparing_model.md
index 65a2a197..a1e5f03a 100644
--- a/docs/source/basic_tutorials/preparing_model.md
+++ b/docs/source/basic_tutorials/preparing_model.md
@@ -4,7 +4,8 @@ Text Generation Inference improves the model in several aspects.
 
 ## Quantization
 
-TGI supports [bits-and-bytes](https://github.com/TimDettmers/bitsandbytes#bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323) quantization. To speed up inference with quantization, simply set `quantize` flag to `bitsandbytes` or `gptq` depending on the quantization technique you wish to use. When using GPT-Q quantization, you need to point to one of the models [here](https://huggingface.co/models?search=gptq).
+TGI supports [bits-and-bytes](https://github.com/TimDettmers/bitsandbytes#bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323) quantization. To speed up inference with quantization, simply set the `quantize` flag to `bitsandbytes` or `gptq`, depending on the quantization technique you wish to use. When using GPT-Q quantization, you need to point to one of the models [here](https://huggingface.co/models?search=gptq).
+To run quantization with TGI, refer to the [Using TGI through CLI](TODO: ADD INTERNAL REF) section.
 
 ## RoPE Scaling
diff --git a/docs/source/basic_tutorials/using_cli.md b/docs/source/basic_tutorials/using_cli.md
new file mode 100644
index 00000000..d0646701
--- /dev/null
+++ b/docs/source/basic_tutorials/using_cli.md
@@ -0,0 +1,57 @@
+# Using TGI through CLI
+
+You can use TGI's CLI tools to download model weights, serve and quantize models, or get information on serving parameters.
+
+## Installing TGI for CLI
+
+To use TGI through the CLI, first clone the TGI repository and then run the following inside it:
+
+```shell
+make install
+```
+
+If you would like to serve models with custom kernels, run
+
+```shell
+BUILD_EXTENSIONS=True make install
+```
+
+After running this, you will be able to use `text-generation-server` and `text-generation-launcher`.
+
+`text-generation-server` lets you download model weights with the `download-weights` command like below 👇
+
+```shell
+text-generation-server download-weights MODEL_HUB_ID
+```
+
+You can also use it to quantize models like below 👇
+
+```shell
+text-generation-server quantize MODEL_HUB_ID OUTPUT_DIR
+```
+
+You can use `text-generation-launcher` to serve models.
+
+```shell
+text-generation-launcher --model-id MODEL_HUB_ID --port 8080
+```
+
+There are many options and parameters you can pass to `text-generation-launcher`. The CLI documentation is kept minimal and relies on self-generated documentation, which can be viewed by running
+
+```shell
+text-generation-launcher --help
+```
+
+You can also find it hosted in this [Swagger UI](https://huggingface.github.io/text-generation-inference/).
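+
+For instance, one way to serve a quantized model is to pass the `--quantize` flag; run `text-generation-launcher --help` to see the values it accepts. A minimal sketch with `bitsandbytes`:
+
+```shell
+text-generation-launcher --model-id MODEL_HUB_ID --port 8080 --quantize bitsandbytes
+```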
+
+The same documentation can be found for `text-generation-server`.
+
+```shell
+text-generation-server --help
+```
diff --git a/docs/source/docker_launch.md b/docs/source/installation_launching.md
similarity index 85%
rename from docs/source/docker_launch.md
rename to docs/source/installation_launching.md
index 9f1c89fb..60358dd7 100644
--- a/docs/source/docker_launch.md
+++ b/docs/source/installation_launching.md
@@ -1,6 +1,10 @@
-# Launching with Docker
+# Getting Started
 
-The easiest way of getting started is using the official Docker container. Install Docker following [their installation instructions](https://docs.docker.com/get-docker/).
+The easiest way to get started is to use the official Docker container.
+
+## Launching with Docker
+
+Install Docker following [their installation instructions](https://docs.docker.com/get-docker/).
 
 Let's say you want to deploy [Falcon-7B Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct) model with TGI. Here is an example on how to do that:
 
@@ -22,3 +26,12 @@ To see all options to serve your models, check in the [codebase](https://github.
 ```shell
 text-generation-launcher --help
 ```
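+
+Once the container is running, you can send it requests. As a minimal sketch (assuming the container exposes port 8080 as in the example above), a generation request looks like this:
+
+```shell
+curl 127.0.0.1:8080/generate \
+    -X POST \
+    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -H 'Content-Type: application/json'
+```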