From e2910acedf730066a71349b9a5edaabe8118b5cf Mon Sep 17 00:00:00 2001
From: Merve Noyan <merveenoyan@gmail.com>
Date: Wed, 9 Aug 2023 18:09:13 +0300
Subject: [PATCH] Added note about gated models

---
 docs/source/basic_tutorials/gated_model_access.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 docs/source/basic_tutorials/gated_model_access.md
diff --git a/docs/source/basic_tutorials/gated_model_access.md b/docs/source/basic_tutorials/gated_model_access.md
new file mode 100644
index 00000000..dfa0f1bc
--- /dev/null
+++ b/docs/source/basic_tutorials/gated_model_access.md
@@ -0,0 +1,13 @@
+### Serving Private & Gated Models
+
+If the model you wish to serve is gated or its repository on the Hugging Face Hub is private, and you have access to it, you can provide your Hugging Face Hub access token. To do so, go to the [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens), copy a token with READ access, and run `export HUGGING_FACE_HUB_TOKEN=<YOUR READ TOKEN>` in your shell.
+
+If you are running the server through Docker, pass the token to the container as an environment variable, as shown below.
+
+```shell
+model=meta-llama/Llama-2-7b-chat-hf
+volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
+token=<your cli READ token>
+
+docker run --gpus all --shm-size 1g -e HUGGING_FACE_HUB_TOKEN="$token" -p 8080:80 -v "$volume:/data" ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id "$model"
+```
\ No newline at end of file