From dc9b8e981425e08ca9be7dcb23412449cdb3d8f8 Mon Sep 17 00:00:00 2001
From: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
Date: Wed, 15 Jan 2025 16:07:10 +0100
Subject: [PATCH] Fix `docker run` in `README.md` (#2861)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Fix `docker run` in `README.md`

* Add line-break in `docker run` for readability

Co-authored-by: Daniël de Kok

* Add line-break in `docker run` for readability

Co-authored-by: Daniël de Kok

---------

Co-authored-by: Daniël de Kok
---
 README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 31966ddb..9842a2a7 100644
--- a/README.md
+++ b/README.md
@@ -84,7 +84,7 @@ model=HuggingFaceH4/zephyr-7b-beta
 volume=$PWD/data
 
 docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
-3.0.0 ghcr.io/huggingface/text-generation-inference:3.0.0 --model-id $model
+    ghcr.io/huggingface/text-generation-inference:3.0.0 --model-id $model
 ```
 
 And then you can make requests like
@@ -151,7 +151,8 @@ model=meta-llama/Meta-Llama-3.1-8B-Instruct
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
 token=
 
-docker run --gpus all --shm-size 1g -e HF_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:3.0.0 --model-id $model
+docker run --gpus all --shm-size 1g -e HF_TOKEN=$token -p 8080:80 -v $volume:/data \
+    ghcr.io/huggingface/text-generation-inference:3.0.0 --model-id $model
 ```
 
 ### A note on Shared Memory (shm)