Update docs/source/basic_tutorials/gated_model_access.md

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Merve Noyan 2023-08-09 20:15:11 +03:00 committed by GitHub
parent e2910acedf
commit bb1a5e64a1
GPG Key ID: 4AEE18F83AFDEB23

@@ -1,13 +1,5 @@
### Serving Private & Gated Models
If the model you wish to serve is behind gated access or the model repository on Hugging Face Hub is private, and you have the access to the model, you can provide your Hugging Face Hub access token. To do so, simply head to [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens), copy a token with READ access and export `HUGGING_FACE_HUB_TOKEN=<YOUR READ TOKEN>` in CLI.
If the model you wish to serve is behind gated access or the model repository on Hugging Face Hub is private, and you have access to the model, you can provide your Hugging Face Hub access token. You can generate and copy a read token from the [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens).
If you would like to do it through Docker, you can provide your token like below.
```shell
model=meta-llama/Llama-2-7b-chat-hf
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
token=<your cli READ token>
docker run --gpus all --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id $model
```
If you're using the CLI, set the `HUGGING_FACE_HUB_TOKEN` environment variable.
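As a minimal sketch of the CLI route, the snippet below exports the variable and then starts the server only if a local `text-generation-launcher` binary is found. The token value `hf_your_read_token` is a hypothetical placeholder, and the assumption that the launcher is installed on your `PATH` is not something this page states.

```shell
# Placeholder value (hypothetical): substitute a real READ token from the tokens page.
export HUGGING_FACE_HUB_TOKEN="hf_your_read_token"
model=meta-llama/Llama-2-7b-chat-hf

# Stop early with an error message if the variable is unset or empty.
: "${HUGGING_FACE_HUB_TOKEN:?HUGGING_FACE_HUB_TOKEN is not set}"

# Assumption: text-generation-launcher is installed locally; otherwise just report.
if command -v text-generation-launcher >/dev/null 2>&1; then
  text-generation-launcher --model-id "$model"
else
  echo "text-generation-launcher not found; token is exported for this shell session"
fi
```

Because the variable is exported, any process started from the same shell session inherits it, so the token does not need to be passed on the command line.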