Update docs/source/basic_tutorials/non_core_models.md

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Merve Noyan 2023-08-22 23:44:18 +03:00 committed by GitHub
parent abde90c493
commit 98a5e6f26a

@@ -14,7 +14,7 @@ AutoModelForSeq2SeqLM.from_pretrained(<model>, device_map="auto")
 This means you will be unable to use some of the features introduced by TGI, such as tensor-parallel sharding or flash attention. However, you can still get many benefits of TGI, such as continuous batching or streaming outputs.
-You can serve these models using docker like below 👇
+You can serve these models using Docker like below 👇
 ```bash
 docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id gpt2