mirror of
https://github.com/huggingface/text-generation-inference.git
synced 2025-09-11 04:14:52 +00:00
Update docs/source/basic_tutorials/non_core_models.md
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
This commit is contained in:
parent
b4b52c6f32
commit
abde90c493
@@ -12,7 +12,7 @@ AutoModelForCausalLM.from_pretrained(<model>, device_map="auto")
 AutoModelForSeq2SeqLM.from_pretrained(<model>, device_map="auto")
 ```
 
-This means, you will be unable to use some of the features introduced by TGI, such as tensor-parallel sharding or flash attention. However, you can still get many benefits of TGI, such as continuous batching, or streaming outputs.
+This means you will be unable to use some of the features introduced by TGI, such as tensor-parallel sharding or flash attention. However, you can still get many benefits of TGI, such as continuous batching or streaming outputs.
 
 You can serve these models using docker like below 👇
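The "serve these models using docker" line in the patched doc points at a `docker run` invocation that this hunk does not show. A minimal sketch of such an invocation, following TGI's usual quickstart shape — the image tag and the model id are assumptions, substitute your own:

```shell
# Sketch: serving a non-core (AutoModel-compatible) model with TGI via docker.
# Model id and cache path are placeholders, not taken from this commit.
model=google/flan-t5-xxl   # example seq2seq model id (assumption)
volume=$PWD/data           # persist downloaded weights between runs

docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$volume:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id "$model"
```

Once the container is up, the server can be queried on the mapped port, e.g. `curl 127.0.0.1:8080/generate -X POST -H 'Content-Type: application/json' -d '{"inputs": "...", "parameters": {"max_new_tokens": 20}}'`.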