diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 1e8d8ac42..9bebe8af3 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -1,18 +1,16 @@
 - sections:
   - local: index
     title: Text Generation Inference
-  - local: basic_tutorials/install
-    title: Installation
   - local: quicktour
     title: Quick Tour
   - local: supported_models
     title: Supported Models and Hardware
   title: Getting started
 - sections:
-  - local: basic_tutorials/running_locally
-    title: Running Locally
-  - local: basic_tutorials/running_docker
-    title: Running with Docker
+  - local: basic_tutorials/local_launch
+    title: Installing and Launching Locally
+  - local: basic_tutorials/docker_launch
+    title: Launching with Docker
   - local: basic_tutorials/consuming_TGI
     title: Consuming TGI as a backend
   - local: basic_tutorials/consuming_TGI
diff --git a/docs/source/basic_tutorials/docker_launch.md b/docs/source/basic_tutorials/docker_launch.md
new file mode 100644
index 000000000..1a6493703
--- /dev/null
+++ b/docs/source/basic_tutorials/docker_launch.md
@@ -0,0 +1,52 @@
+# Launching with Docker
+
+The easiest way to get started is with the official Docker container:
+
+```shell
+model=tiiuae/falcon-7b-instruct
+volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
+
+docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id $model
+```
+
+**Note:** To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.
+
+You can then query the model using either the `/generate` or `/generate_stream` routes:
+
+```shell
+curl 127.0.0.1:8080/generate \
+    -X POST \
+    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -H 'Content-Type: application/json'
+```
+
+```shell
+curl 127.0.0.1:8080/generate_stream \
+    -X POST \
+    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -H 'Content-Type: application/json'
+```
+
+or from Python:
+
+```shell
+pip install text-generation
+```
+
+```python
+from text_generation import Client
+
+client = Client("http://127.0.0.1:8080")
+print(client.generate("What is Deep Learning?", max_new_tokens=20).generated_text)
+
+text = ""
+for response in client.generate_stream("What is Deep Learning?", max_new_tokens=20):
+    if not response.token.special:
+        text += response.token.text
+print(text)
+```
+
+To see all the options for serving your models, check the [code](https://github.com/huggingface/text-generation-inference/blob/main/launcher/src/main.rs) or use the CLI:
+```
+text-generation-launcher --help
+```
\ No newline at end of file
diff --git a/docs/source/basic_tutorials/installation.md b/docs/source/basic_tutorials/installation.md
deleted file mode 100644
index e69de29bb..000000000
diff --git a/docs/source/basic_tutorials/local_launch.md b/docs/source/basic_tutorials/local_launch.md
new file mode 100644
index 000000000..060dc22e3
--- /dev/null
+++ b/docs/source/basic_tutorials/local_launch.md
@@ -0,0 +1,95 @@
+# Installing and Launching Locally
+
+Before you start, you will need to set up your environment and install Text Generation Inference. Text Generation Inference is tested on **Python 3.9+**.
+
+## Local Installation for Text Generation Inference
+
+Text Generation Inference is available on PyPI, conda and GitHub.
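+
+The build steps below are run from the root of the repository, so you will also need a local clone (a minimal sketch; the clone URL matches the repository linked from these docs):
+
+```shell
+# Clone the Text Generation Inference repository and enter it so the
+# Makefile targets used below (e.g. `make install`) are available.
+git clone https://github.com/huggingface/text-generation-inference.git
+cd text-generation-inference
+```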
+
+To install and launch locally, first [install Rust](https://rustup.rs/) and create a Python virtual environment with at least Python 3.9, e.g. using `conda`:
+
+```shell
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+
+conda create -n text-generation-inference python=3.9
+conda activate text-generation-inference
+```
+
+You may also need to install Protoc.
+
+On Linux:
+
+```shell
+PROTOC_ZIP=protoc-21.12-linux-x86_64.zip
+curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP
+sudo unzip -o $PROTOC_ZIP -d /usr/local bin/protoc
+sudo unzip -o $PROTOC_ZIP -d /usr/local 'include/*'
+rm -f $PROTOC_ZIP
+```
+
+On MacOS, using Homebrew:
+
+```shell
+brew install protobuf
+```
+
+Then run:
+
+```shell
+BUILD_EXTENSIONS=True make install # Install repository and HF/transformers fork with CUDA kernels
+```
+
+**Note:** On some machines, you may also need the OpenSSL libraries and gcc. On Linux machines, run:
+
+```shell
+sudo apt-get install libssl-dev gcc -y
+```
+
+Once installation is done, simply run:
+
+```shell
+make run-falcon-7b-instruct
+```
+
+This will serve the Falcon 7B Instruct model on port 8080.
+
+You can then query the model using either the `/generate` or `/generate_stream` routes:
+
+```shell
+curl 127.0.0.1:8080/generate \
+    -X POST \
+    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -H 'Content-Type: application/json'
+```
+
+```shell
+curl 127.0.0.1:8080/generate_stream \
+    -X POST \
+    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
+    -H 'Content-Type: application/json'
+```
+
+or through Python:
+
+```shell
+pip install text-generation
+```
+
+```python
+from text_generation import Client
+
+client = Client("http://127.0.0.1:8080")
+print(client.generate("What is Deep Learning?", max_new_tokens=20).generated_text)
+
+text = ""
+for response in client.generate_stream("What is Deep Learning?", max_new_tokens=20):
+    if not response.token.special:
+        text += response.token.text
+print(text)
+```
+
+To see all the options for serving your models, check the [code](https://github.com/huggingface/text-generation-inference/blob/main/launcher/src/main.rs) or use the CLI:
+```
+text-generation-launcher --help
+```
\ No newline at end of file
diff --git a/docs/source/index.md b/docs/source/index.md
index 6815f9def..cc5ab9e44 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -2,13 +2,17 @@
 Text-Generation-Inference is an open-source, purpose-built solution for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation using Tensor Parallelism and dynamic batching for the most popular open-source LLMs, including StarCoder, BLOOM, GPT-NeoX, Llama, and T5. Text Generation Inference implements optimization for all supported model architectures, including:
-- Tensor Parallelism and custom cuda kernels
-- Optimized transformers code for inference using flash-attention and Paged Attention on the most popular architectures
-- Quantization with bitsandbytes or gptq
-- Continuous batching of incoming requests for increased total throughput
-- Accelerated weight loading (start-up time) with safetensors
-- Logits warpers (temperature scaling, topk, repetition penalty ...)
-- Watermarking with A Watermark for Large Language Models
-- Stop sequences, Log probabilities
+- Serve the most popular Large Language Models with a simple launcher
+- Tensor Parallelism for faster inference on multiple GPUs
 - Token streaming using Server-Sent Events (SSE)
+- [Continuous batching of incoming requests](https://github.com/huggingface/text-generation-inference/tree/main/router) for increased total throughput
+- Optimized transformers code for inference using [flash-attention](https://github.com/HazyResearch/flash-attention) and [Paged Attention](https://github.com/vllm-project/vllm) on the most popular architectures
+- Quantization with [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) and [GPT-Q](https://arxiv.org/abs/2210.17323)
+- [Safetensors](https://github.com/huggingface/safetensors) weight loading
+- Watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226)
+- Logits warper (temperature scaling, top-p, top-k, repetition penalty; see [transformers.LogitsProcessor](https://huggingface.co/docs/transformers/internal/generation_utils#transformers.LogitsProcessor) for more details)
+- Stop sequences
+- Log probabilities
+- Production ready (distributed tracing with OpenTelemetry, Prometheus metrics)
+
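+For example, quantization can be enabled when launching a model by passing the launcher's `--quantize` flag (a sketch reusing the Docker command from the launch guide; check `text-generation-launcher --help` for the authoritative flag list):
+
+```shell
+# Sketch: serve Falcon 7B Instruct with 8-bit bitsandbytes quantization.
+# `--quantize gptq` is the analogous option for GPTQ checkpoints.
+model=tiiuae/falcon-7b-instruct
+volume=$PWD/data
+
+docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
+    ghcr.io/huggingface/text-generation-inference:1.0.0 \
+    --model-id $model --quantize bitsandbytes
+```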