From bb1a5e64a1a11421f76611c3a2124966499f0750 Mon Sep 17 00:00:00 2001
From: Merve Noyan
Date: Wed, 9 Aug 2023 20:15:11 +0300
Subject: [PATCH] Update docs/source/basic_tutorials/gated_model_access.md

Co-authored-by: Omar Sanseviero
---
 docs/source/basic_tutorials/gated_model_access.md | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/docs/source/basic_tutorials/gated_model_access.md b/docs/source/basic_tutorials/gated_model_access.md
index dfa0f1bc..5a79d45f 100644
--- a/docs/source/basic_tutorials/gated_model_access.md
+++ b/docs/source/basic_tutorials/gated_model_access.md
@@ -1,13 +1,5 @@
 ### Serving Private & Gated Models
 
-If the model you wish to serve is behind gated access or the model repository on Hugging Face Hub is private, and you have the access to the model, you can provide your Hugging Face Hub access token. To do so, simply head to [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens), copy a token with READ access and export `HUGGING_FACE_HUB_TOKEN=` in CLI.
+If the model you wish to serve is behind gated access or the model repository on Hugging Face Hub is private, and you have access to the model, you can provide your Hugging Face Hub access token. You can generate and copy a read token from [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens)
 
-If you would like to do it through Docker, you can provide your token like below.
-
-```shell
-model=meta-llama/Llama-2-7b-chat-hf
-volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
-token=
-
-docker run --gpus all --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id $model
-```
\ No newline at end of file
+If you're using the CLI, set the `HUGGING_FACE_HUB_TOKEN` environment variable.
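
The CLI step described in the added line could look roughly like the sketch below. It assumes a POSIX-compatible shell. The value `hf_xxx` is a hypothetical placeholder, not a real token, and would be replaced with a read token copied from the tokens page.

```shell
# Hypothetical placeholder value: substitute your own read token here.
export HUGGING_FACE_HUB_TOKEN="hf_xxx"

# Check that the variable is exported to child processes, without echoing the token itself.
if [ -n "$(printenv HUGGING_FACE_HUB_TOKEN)" ]; then
  echo "HUGGING_FACE_HUB_TOKEN is set"
fi
```

Exporting the variable (rather than only assigning it) is what makes it visible to processes started from the same shell session.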